Archive for the ‘Benchmarks’ Category

Collections Performance Benchmarks

March 25th, 2008 deevis 1 comment

I’ve been having fun of late pitting some Collections implementations against each other, and some pretty peculiar results have been coming back. Well, I guess the only thing peculiar about them is that Vector keeps outperforming ArrayList, even though Vector is synchronized and ArrayList should, in theory, be faster. One way that ArrayList can lose is when the two aren’t sized correctly upon construction and have to grow to accommodate their datasets. ArrayList grows by 50% with each resize, while Vector doubles. Resizing is an expensive operation, and this explains quite a bit. But even when I remove the resizing ( by sizing correctly upon construction ), ArrayList still gets pwned by Vector.

Here are the results from a test with random 40-character Strings being built into a unique Collection, where the Collections are sized correctly:

Collections Showdown - presized

Notice how Vector and LinkedList perform virtually the same, but ArrayList is slower, and gets slower as the dataset size increases. Of course, HashSet is still the hands down winner and is what you should use when working with Collections that don’t care about ordering. TreeSet is a close second, but bear in mind that TreeSet can perform poorly when your dataset has many similar Strings.
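On the resizing point, here is a minimal sketch that counts how often a backing array would be reallocated while a list is filled one element at a time. It assumes a starting capacity of 10 and the growth rules described above ( roughly 50% per resize versus doubling ); the exact formulas vary a little between JDK versions, and the class and method names are made up for illustration.

```java
final class GrowthSim {
    /** Counts the reallocations needed to grow from initialCapacity until target elements fit. */
    static int countResizes(int initialCapacity, int target, boolean doubling) {
        long capacity = initialCapacity;
        int resizes = 0;
        while (capacity < target) {
            // Doubling models Vector; adding half the current size models ArrayList.
            long grown = doubling ? capacity * 2 : capacity + (capacity >> 1);
            // Always grow by at least one slot so tiny capacities cannot stall the loop.
            capacity = Math.max(grown, capacity + 1);
            resizes++;
        }
        return resizes;
    }

    public static void main(String[] args) {
        int n = 1000000;
        System.out.println("50% growth: " + countResizes(10, n, false) + " resizes");
        System.out.println("doubling:   " + countResizes(10, n, true) + " resizes");
        System.out.println("presized:   " + countResizes(n, n, false) + " resizes");
    }
}
```

Every resize also copies the whole array, so the gap in resize counts understates the extra work for the slower-growing list.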

So, now it’s time to dig a bit deeper into the entire ArrayList versus Vector thing. What if I build each of them up with the integers from 1 to x before the test runs, and then have the test call contains() on each integer from 1 to x? Is contains() faster or slower for ArrayList?

ArrayList vs Vector - add

So, it looks like the calls to contains() are slower for ArrayList. What about the calls to add() with a presized Collection?

calling_add_x_number_of_times__presized_collections__2008_03_23__09_33_pm.png

Yet again, Vector is a little bit faster than ArrayList! I’ve looked through the Java source code and so help me I swear they are doing the same effective work – but Vector is synchronized and SHOULD BE SLOWER!
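For anyone who wants to poke at this themselves, here is a minimal sketch of the kind of test described above: fill a presized list with the integers 1 to x, then call contains() on each of them. This is not the harness that produced the graphs; the class and method names are made up, and the warm-up rounds are my own assumption, added because JIT compilation can skew short timings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

final class ListShowdown {
    /** Adds 1..n to the given list, then looks every value up again; returns elapsed nanoseconds. */
    static long timeAddThenContains(List<Integer> list, int n) {
        long start = System.nanoTime();
        for (int i = 1; i <= n; i++) {
            list.add(i);
        }
        int hits = 0;
        for (int i = 1; i <= n; i++) {
            if (list.contains(i)) {
                hits++;
            }
        }
        long elapsed = System.nanoTime() - start;
        if (hits != n) {
            throw new IllegalStateException("expected " + n + " hits but got " + hits);
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 2000;
        // Warm-up rounds: results are thrown away on purpose.
        for (int round = 0; round < 10; round++) {
            timeAddThenContains(new ArrayList<Integer>(n), n);
            timeAddThenContains(new Vector<Integer>(n), n);
        }
        System.out.println("ArrayList: " + timeAddThenContains(new ArrayList<Integer>(n), n) + " ns");
        System.out.println("Vector:    " + timeAddThenContains(new Vector<Integer>(n), n) + " ns");
    }
}
```

Swapping the order of the two measured calls is a cheap way to check whether whichever one runs first is being penalized.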

So, now I need to branch out and do some testing on other JDKs and other OSes. 1.5 and Linux, here I come!

Categories: Benchmarks, JDK Tags:

The cost of Autoboxing

March 15th, 2008 deevis No comments

Simple question: How expensive is autoboxing of int/Integer types?
Simple answer: 15 nanoseconds per boxing.

No autoboxing ( ints only )
[source:java]
// rnd is assumed to be a java.util.Random field on the enclosing benchmark class
public void runInternalTrial() throws Exception {
    int i = getRandomWithoutAutobox();
}

public int getRandomWithoutAutobox() {
    return rnd.nextInt(1000);
}
[/source]
Autoboxing
[source:java]
public void runInternalTrial() throws Exception {
    // The returned Integer is unboxed back to an int here
    int i = getRandom();
}

// Will autobox an int to an Integer
public Integer getRandom() {
    return rnd.nextInt(1000);
}
[/source]
Note that the Autoboxing example has to autobox the value returned by getRandom() to an Integer, and the caller then has to unbox it back to an int primitive. The two examples for HalfBoxing simply autobox without unboxing back to an int. Here are the results:
java_autoboxing_2008_03_12__02_07_am.png
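To make the hidden work visible, here is a small sketch of what the compiler inserts for you: autoboxing becomes a call to Integer.valueOf() and unboxing becomes a call to intValue(). The class and method names are made up for illustration. Values from -128 to 127 come out of a shared cache, so a benchmark drawing from nextInt(1000) will mostly be paying for fresh Integer objects.

```java
final class BoxingDemo {
    /** What "return rnd.nextInt(1000);" turns into inside a method declared to return Integer. */
    static Integer box(int value) {
        return Integer.valueOf(value);
    }

    /** What "int i = getRandom();" turns into when getRandom() returns an Integer. */
    static int unbox(Integer boxed) {
        return boxed.intValue();
    }

    public static void main(String[] args) {
        // Small values are served from the Integer cache, so both calls return the same object.
        System.out.println(box(100) == box(100));        // true
        // Larger values are typically new objects, so compare them with equals(), never ==.
        System.out.println(box(999).equals(box(999)));   // true
        System.out.println(unbox(box(999)));             // 999
    }
}
```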

Categories: Benchmarks, Uncategorized Tags:

Building a unique collection of strings

March 4th, 2008 deevis No comments

I recently encountered some Java code that was building a unique list of String objects in order to prevent multiple processing of any of them. The code was using an ArrayList and would make a call to contains() to determine whether or not to add each String. I knew this wasn’t the optimal way of doing it and wanted to immediately change the implementation to use HashSet, but figured I’d benchmark it and learn exactly what I’m dealing with. In addition to benchmarking the ArrayList implementation and a HashSet implementation, I also threw a TreeSet and the new java.util.concurrent.ConcurrentSkipListSet into the mix. What do you think? Which will be the fastest between HashSet, TreeSet and ArrayList implementations?

ArrayList

[source:java]
public void runInternalTrial() throws Exception {
    List<String> list = new ArrayList<String>();
    int count = 0;
    while ( count < getWorkload() ) {
        for ( String s : data ) {
            // contains() scans the whole list, so every check gets slower as the list grows
            if ( ! list.contains( s )) {
                list.add( s );
            }
            count++;
            if ( count >= getWorkload() ) break;
        }
    }
}

[/source]

HashSet

[source:java]
public void runInternalTrial() throws Exception {
    Set<String> set = new HashSet<String>();
    int count = 0;
    while ( count < getWorkload() ) {
        for ( String s : data ) {
            // add() silently ignores duplicates, so no contains() check is needed
            set.add( s );
            count++;
            if ( count >= getWorkload() ) break;
        }
    }
}

[/source]

Here are the results: ( on the time axis I’ve got the size of the data being processed )

Building a unique set of Strings

Of interest from the results shown above is that the ArrayList implementation is actually faster for the extremely small data sizes of 5 & 10. I wasn’t exactly sure how the TreeSet would perform, but I thought it might outperform the ArrayList. I guess this is generally going to be dependent on the nature of the data being processed. HashSet, sure enough, is the hands down winner. Even with the small datasets where ArrayList wins, the differences are down in the hundreds of nanoseconds. Once the dataset hits the ( still very small ) size of 15, HashSet is already 1,500 nanoseconds faster.
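Since the original goal was to avoid processing any String twice, it is worth noting that Set.add() returns false when the element is already present, so the set can do the checking and the remembering in one call. A minimal sketch, with made-up class and method names, where "processing" is just collecting the String into a result list:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ProcessOnce {
    /** Returns each distinct String once, in the order it was first seen. */
    static List<String> firstOccurrences(List<String> data) {
        Set<String> seen = new HashSet<String>();
        List<String> processed = new ArrayList<String>();
        for (String s : data) {
            // add() returns true only the first time a value is inserted.
            if (seen.add(s)) {
                processed.add(s);
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "a", "c", "b");
        System.out.println(firstOccurrences(data)); // [a, b, c]
    }
}
```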

The new ConcurrentSkipListSet, while it is completely blown away performance-wise by even ArrayList and TreeSet, shouldn’t necessarily be counted out. If the building of the unique list isn’t something that happens too often, then even the cost of an extra 30,000 nanoseconds isn’t going to be noticeable. If you are in your service layer and are using the ConcurrentSkipListSet to maintain state that may be accessed/mutated by multiple threads, the guaranteed thread safety may well be worth this extra speed cost.
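Here is a minimal sketch of the multi-threaded case where ConcurrentSkipListSet earns its keep: several threads add the same keys to one shared set with no external locking, and the set still ends up with exactly one copy of each. The class name, method name and key format are made up for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class SharedUniqueSet {
    /** Lets several threads add the same keys concurrently and returns the shared set. */
    static Set<String> fillConcurrently(int threads, final int keysPerThread) {
        final Set<String> shared = new ConcurrentSkipListSet<String>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(new Runnable() {
                public void run() {
                    for (int i = 0; i < keysPerThread; i++) {
                        shared.add("key-" + i); // every thread adds the same keys
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(8, 1000).size()); // 1000
    }
}
```

Doing the same with a plain HashSet or TreeSet would need external synchronization to be safe.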

Categories: Benchmarks Tags:

Connection Pool Showdown

February 26th, 2008 deevis 3 comments

I put C3P0, DBCP, and Proxool to the test and confirmed that DBCP is the fastest pool in the West, so long as there isn’t any complicated synchronization being done by the code using DBCP! C3P0, safe even with complicated synchronization and very, very configurable and robust, came in a pretty close second. Proxool, the loser in this test, was surprisingly unable to deal with more threads requesting Connections than its maxSize. It wouldn’t block; it would instantly throw an Exception. So, I made sure that the pool sizes for all 3 were larger than the number of threads which would be concurrently accessing the pools.
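The difference between blocking and throwing is easy to show with a toy pool built on a Semaphore. This is only a sketch of the two behaviours, not how any of the three libraries is implemented; it hands out permits rather than real Connections, and all the names are made up.

```java
import java.util.concurrent.Semaphore;

final class TinyPool {
    private final Semaphore permits;

    TinyPool(int maxSize) {
        permits = new Semaphore(maxSize);
    }

    /** Blocking checkout: waits until some other thread checks a connection back in. */
    void checkOutBlocking() {
        permits.acquireUninterruptibly();
    }

    /** Fail-fast checkout: throws straight away when the pool is exhausted. */
    void checkOutOrFail() {
        if (!permits.tryAcquire()) {
            throw new IllegalStateException("pool exhausted");
        }
    }

    void checkIn() {
        permits.release();
    }

    public static void main(String[] args) {
        TinyPool pool = new TinyPool(1);
        pool.checkOutOrFail();           // takes the only permit
        try {
            pool.checkOutOrFail();       // nothing left, so this throws
        } catch (IllegalStateException e) {
            System.out.println("fail-fast: " + e.getMessage());
        }
        pool.checkIn();
        pool.checkOutBlocking();         // returns immediately because a permit is free again
        System.out.println("blocking checkout succeeded");
    }
}
```

With a fail-fast pool, the caller has to size the pool above the thread count or handle the exception, which is why the pools in this test were sized larger than the number of threads.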

C3P0, DBCP, and Proxool battle it out

Connection Pool Showdown

JVM Version 1.6.0_05-ea
Windows XP[x86] – 2 processors
31.55MB/254.06MB Memory
Benchmark Threads Avg Nanos Slow Factor
DBCP 40 74092.28 1.00x
DBCP 30 75085.88 1.01x
DBCP 20 75230.27 1.02x
DBCP 10 76437.40 1.03x
C3P0 10 93405.98 1.26x
C3P0 30 94054.79 1.27x
C3P0 20 94087.60 1.27x
C3P0 40 94865.61 1.28x
DBCP 1 104907.12 1.42x
C3P0 1 116039.45 1.57x
Proxool 10 142769.39 1.93x
Proxool 20 143932.09 1.94x
Proxool 30 146066.93 1.97x
Proxool 40 148903.46 2.01x
Proxool 1 183513.91 2.48x
Categories: Benchmarks, Open Source Tags:

Log4J Conversion Pattern Performance

February 26th, 2008 deevis No comments

I’ve seen the warnings before in log4j’s documentation about the format directives %C, %F, %L, %l and %M, which give really useful information, but at a cost. Well, what is the cost of using some of these settings? What if I use them all, or none? I just had to find out. Here are the results:

log4j_conversion_patterns_2008_03_12__02_35_am.png

These results completely reinforce the fact that the expensive settings %F, %L and %C should not be used in your production environment. In the unfortunate case that you use all of them, then you’re looking at ( already expensive ) logging calls taking 9 times longer, verging on millisecond timings.
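The reason these directives are expensive, as I understand it, is that the logger has to capture a stack trace on every call to work out where the call came from. Here is a minimal sketch of that technique using only the JDK. It is not log4j’s code, and the class and method names are made up.

```java
final class CallerLocation {
    /** Captures a stack trace and returns the frame of whoever called this method. */
    static StackTraceElement caller() {
        // Filling in the stack trace is the costly part.
        StackTraceElement[] stack = new Throwable().getStackTrace();
        // stack[0] is caller() itself; stack[1] is the method that invoked it.
        return stack[1];
    }

    /** Formats the calling frame of caller(), which is this method, as class.method(file:line). */
    static String whereAmI() {
        StackTraceElement frame = caller();
        return frame.getClassName() + "." + frame.getMethodName()
                + "(" + frame.getFileName() + ":" + frame.getLineNumber() + ")";
    }

    public static void main(String[] args) {
        System.out.println(whereAmI());
    }
}
```

Class name, method name, file name and line number all come out of the same captured frame, which fits with %C, %F, %L and %M each costing about the same when used alone.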

Log4J Conversion Patterns

JVM Version 1.6.0_05-ea
Windows XP[x86] – 2 processors
5.46MB/63.56MB Memory

Benchmark Threads Avg Nanos Slow Factor
Fast 1 28825.21 1.00x
Fast 15 31416.34 1.09x
Fast 20 32638.25 1.13x
Fast 5 33509.15 1.16x
Fast 10 33564.08 1.16x
Fast 35 34246.23 1.19x
Fast 50 36421.31 1.26x
Fast 40 39088.55 1.36x
Fast 30 39572.70 1.37x
Fast 45 42566.97 1.48x
Fast 25 44669.52 1.55x
%F:%L 1 99596.84 3.46x
%C:%F:%L 1 109776.19 3.81x
%F:%L 15 120370.05 4.18x
%F:%L 20 121903.60 4.23x
%F:%L 5 122038.00 4.23x
%F:%L 25 125623.38 4.36x
%F:%L 35 126799.53 4.40x
%C:%F:%L 10 127100.32 4.41x
%F:%L 30 128148.09 4.45x
%F:%L 40 128208.69 4.45x
%C:%F:%L 20 131857.94 4.57x
%F:%L 50 133799.33 4.64x
%F:%L 10 133837.24 4.64x
%C:%F:%L 25 134114.84 4.65x
%F:%L 45 134969.59 4.68x
%C:%F:%L 35 135091.85 4.69x
%C:%F:%L 15 138423.94 4.80x
%C:%F:%L 30 139919.54 4.85x
%C:%F:%L 5 142347.28 4.94x
%C:%F:%L 45 144215.52 5.00x
%C:%F:%L 40 144652.90 5.02x
%C:%F:%L 50 155159.06 5.38x
ALL 45 260584.64 9.04x
ALL 15 260779.51 9.05x
ALL 40 263862.17 9.15x
ALL 1 264690.88 9.18x
ALL 25 266042.80 9.23x
ALL 35 267562.76 9.28x
ALL 5 270316.00 9.38x
ALL 50 272006.58 9.44x
ALL 10 276599.74 9.60x
ALL 30 277064.42 9.61x
ALL 20 286092.27 9.93x

But, then, how much does each of the perpetrators cost when used on its own? I ran that one, too. Here are the results:
log4j_conversion_patterns_2008_03_15__12_25_pm.png

%l is the most expensive conversion pattern. On average it takes an extra 140,000 nanoseconds.
%C is the next most expensive, taking about 110,000 nanoseconds longer.
The rest ( %M, %F, %L ) each take about 100,000 nanoseconds longer.

Now you know.

Log4J Conversion Patterns

JVM Version 1.6.0_05-ea
Windows XP[x86] – 2 processors
4.94MB/63.56MB Memory
Benchmark Threads Avg Nanos Slow Factor
Fast 30 33106.79 1.00x
Fast 35 36082.72 1.09x
Fast 20 37435.39 1.13x
Fast 40 37615.85 1.14x
Fast 10 40022.65 1.21x
Fast 1 40961.19 1.24x
Fast 25 41009.21 1.24x
Fast 45 41631.96 1.26x
Fast 50 42989.00 1.30x
Fast 15 46050.36 1.39x
Fast 5 50155.79 1.51x
%M 5 126486.35 3.82x
%L 15 132137.75 3.99x
%L 35 134056.48 4.05x
%L 40 134538.77 4.06x
%M 15 135388.24 4.09x
%F 10 137914.76 4.17x
%L 5 138189.31 4.17x
%L 25 138781.50 4.19x
%F 15 139051.26 4.20x
%L 10 139648.04 4.22x
%F 25 139736.45 4.22x
%L 20 140881.53 4.26x
%M 10 140943.95 4.26x
%F 20 142361.85 4.30x
%M 20 142539.18 4.31x
%M 25 142656.80 4.31x
%C 20 143221.29 4.33x
%M 30 143781.90 4.34x
%F 5 143855.32 4.35x
%L 30 144129.23 4.35x
%M 40 144481.16 4.36x
%F 35 145803.89 4.40x
%F 30 145891.32 4.41x
%C 10 147103.22 4.44x
%F 45 148460.17 4.48x
%L 45 148951.05 4.50x
%C 25 149877.80 4.53x
%F 50 150231.08 4.54x
%C 30 150495.98 4.55x
%L 50 151044.66 4.56x
%C 5 151478.15 4.58x
%C 35 151516.71 4.58x
%M 45 152534.57 4.61x
%C 15 152969.01 4.62x
%M 50 152989.03 4.62x
%C 50 154476.76 4.67x
%F 40 155057.98 4.68x
%C 45 155471.11 4.70x
%M 35 155759.06 4.70x
%C 40 165737.16 5.01x
%l 10 170275.18 5.14x
%l 5 172227.11 5.20x
%l 20 172265.43 5.20x
%l 35 174530.94 5.27x
%l 50 175687.72 5.31x
%l 45 177227.30 5.35x
%l 15 177944.97 5.37x
%l 30 177987.91 5.38x
%l 25 181043.73 5.47x
%l 40 186023.07 5.62x
%F 1 203322.88 6.14x
%L 1 209034.65 6.31x
%C 1 218245.92 6.59x
%M 1 222148.81 6.71x
%l 1 223007.37 6.74x
Categories: Benchmarks, Open Source Tags:

Freemarker vs Raw Java

February 26th, 2008 deevis No comments

I’ve recently been pitting Freemarker against Raw Java because I was under the impression that Freemarker would blow up under load in a multi-threaded environment. Well, it turns out that things aren’t nearly as bad as I’d thought, but it’s still significantly slower than what I’ve been calling “Raw Java”. Here are the final results with a pretty graph and everything.
freemarker_vs_raw_java_2008_03_12__02_02_am.png

Previous test results had shown Freemarker going into the 200 times slower category, but it does actually stay in the 20 times slower category. So, I’m not nearly as freaked out by this as I was a few weeks ago. I still think that for the example I’m using, where the FTL is more of a program than a template, the Raw Java method is nearly as appropriate and positively 20 times faster!

Freemarker vs Raw Java

JVM Version 1.6.0_05-ea
Windows XP[x86] – 2 processors
4.99MB/63.56MB Memory

Benchmark Threads Avg Nanos Slow Factor
Raw Java 50 1585.19 1.00x
Raw Java 30 1796.52 1.13x
Raw Java 20 1862.14 1.17x
Raw Java 10 1890.36 1.19x
Raw Java 60 1891.35 1.19x
Raw Java 40 1891.70 1.19x
Raw Java 1 2494.68 1.57x
Freemarker 10 26320.65 16.60x
Freemarker 20 27597.29 17.41x
Freemarker 30 28638.00 18.07x
Freemarker 40 29562.36 18.65x
Freemarker 50 30019.56 18.94x
Freemarker 60 31576.83 19.92x
Freemarker 1 34447.90 21.73x
Categories: Benchmarks, Open Source Tags:

MultiThreaded Freemarker Benchmarks

February 22nd, 2008 deevis No comments

I recently blogged about Freemarker performing 10-20x slower than pure Java analogues. I got to thinking about our webapp and the fact that I wasn’t quite testing the correct scenario. In a web environment there might be numerous threads all running Freemarker templates simultaneously. So, I had to put the pure Java analogues up against the simple Freemarker template in a multithreaded test. I ran tests for each with 10, 20, 40, 60 and 80 threads. The results are more impetus towards moving away from Freemarker for the most used and most basic cases ( input tags, labels, script includes ).

So how bad is it…drum roll…

Class[10] indicates the pure Java method running a 10 Thread test. FTL[20] indicates the Freemarker test running a 20 Thread test. And so on… Results are listed from best to worst with the avg and max results in nanoseconds.

Class[10]: 1000100 trials, 18046 avg, 67973245 max 1.0x
Class[80]: 1000800 trials, 28960 avg, 104784259 max 1.6x
Class[20]: 1000200 trials, 30387 avg, 106770756 max 1.7x
Class[60]: 1000200 trials, 35917 avg, 96320112 max 2.0x
Class[40]: 1000400 trials, 37220 avg, 162898182 max 2.1x
FTL[10]: 1000100 trials, 293093 avg, 104114975 max 16.2x
FTL[20]: 1000200 trials, 587117 avg, 158432568 max 32.5x
FTL[40]: 1000400 trials, 1204641 avg, 190966244 max 66.8x
FTL[60]: 1000200 trials, 1843190 avg, 284686324 max 102.1x
FTL[80]: 1000800 trials, 2498693 avg, 358187778 max 138.5x

The last piece of info on each line is how much slower that particular test was ( on average ) than the fastest. Notice how the pure Java tests don’t jump right into the crapper when under load. But the FTL tests get progressively worse as the thread counts are increased. Under an 80 thread load, not only is the pure Java way 138 times faster on average, but the FTL way has a max time of 358 milliseconds vs a 104 millisecond max for pure Java. This is a perceptible 250 millisecond difference when rendering a single button tag!
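For reference, here is a minimal sketch of how a test like this can be structured: start N threads, release them together, time each trial, and keep a running total and max. This is not the harness that produced the numbers above; the class name, the return layout and the sample workload are all made up for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

final class ThreadedTrial {
    /** Runs work on several threads and returns { trials, average nanos, max nanos }. */
    static long[] run(int threads, final int trialsPerThread, final Runnable work) {
        if (threads <= 0 || trialsPerThread <= 0) {
            throw new IllegalArgumentException("threads and trialsPerThread must be positive");
        }
        final AtomicLong totalNanos = new AtomicLong();
        final AtomicLong maxNanos = new AtomicLong();
        final CountDownLatch startGate = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    try {
                        startGate.await(); // hold every worker until all are ready
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    for (int i = 0; i < trialsPerThread; i++) {
                        long begin = System.nanoTime();
                        work.run();
                        long elapsed = System.nanoTime() - begin;
                        totalNanos.addAndGet(elapsed);
                        // Raise the shared max without locking.
                        long seen = maxNanos.get();
                        while (elapsed > seen && !maxNanos.compareAndSet(seen, elapsed)) {
                            seen = maxNanos.get();
                        }
                    }
                }
            });
            workers[t].start();
        }
        startGate.countDown();
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        long trials = (long) threads * trialsPerThread;
        return new long[] { trials, totalNanos.get() / trials, maxNanos.get() };
    }

    public static void main(String[] args) {
        long[] result = run(10, 10000, new Runnable() {
            public void run() {
                new StringBuilder("<button").append(" type=\"button\"").append(">").toString();
            }
        });
        System.out.println(result[0] + " trials, " + result[1] + " avg, " + result[2] + " max");
    }
}
```

The max column is very sensitive to garbage collection pauses and thread scheduling, so it is best read alongside the average rather than on its own.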

The case for easing up on the Freemarker templates is solidifying :)

Categories: Benchmarks, Open Source Tags:

Freemarker Benchmarks

February 19th, 2008 deevis No comments

All of us at iCentris have currently embarked upon a torrid love affair with Freemarker templates. I admit that I, too, have been very excited about having them around. A Freemarker template has quickly become the knee-jerk, de facto response to generating output – which is great for readability and maintainability. But lately, after repeated profiling sessions with all the top offenders coming from Freemarker packages, I’ve come to realize that we’re using them way too much. And here’s why…

When should a template engine be used? There seem to be two answers to this question which both carry equal weight with developers. They are:

  1. You want to have an easily editable version of your dynamic view. This will allow easy customization.
  2. You want the representation to be easily readable to future developers. This will keep it understandable and maintainable.

Now, this might seem like a minor point to pick at. But bear with me…

So, how much faster is pure java than freemarker templates? Well, this is going to depend largely on the complexity of the template. As an example, I’m going to pick a template of medium complexity that renders an HTML button tag in a consistent fashion. The output will be very simple and we’ll have a lot of HTML attributes that can be set within the template. Here’s the ftl file:

button.ftl

————————————–
[source:xml]
<@compress single_line=true>
<button <#if id?exists>
id="${id}"
</#if>

<#if name?exists>
name="${name}"
</#if>

<#if value?exists>
value="${value}"
</#if>

<#if accesskey?exists>
accesskey="${accesskey}"
</#if>

<#if classname?exists>
class="${classname}"
</#if>

<#if style?exists>
style="${style}"
</#if>

<#if title?exists>
title="${title}"
</#if>

<#if href?exists || onclick?exists >
onclick="
<#if href?exists>document.location.href='${href}';</#if>
<#if onclick?exists>${onclick}</#if>
"
</#if>

<#if ondblclick?exists>
ondblclick="${ondblclick}"
</#if>

<#if onmouseout?exists>
onmouseout="${onmouseout}"
</#if>

<#if type?exists>
type="${type}"
<#else>
type="button"
</#if>

<#if disabled?exists && disabled == 'true' >
disabled="disabled"
</#if>
>
<div>
<div>
<#if icon?exists>${icon}</#if>

<span>
${content}
</span>
</div>
</div>

</button>
</@compress>
[/source]
It outputs something along the lines of :
[source:xml]
<button id="buttonId" style="buttonStyle" onclick=" alert('hi'); " type="button" > <div> <div> <img src='pretty.jpg'> <span> This is required </span> </div> </div> </button>
[/source]

Now, is this a good place to use a template like the above? The answer is Yes. But, as with most things, only in moderation. Or perhaps the answer is actually a definite maybe?

So, what can the downside be?

Speed.

Having just benchmarked this I have an unfair advantage, but the raw Java implementation that spits out the same exact templating logic presented above in button.ftl runs roughly 15 times faster ( between 13 and 19 times in my runs ).

Running 1000 trials of button.ftl (FTL) vs ButtonTemplate.java (Class) gives me the following results ( in microsecond timing ):

FTL 419033 micros
Class 31828 micros
Class is 13 times faster!

With only 100 trials:

FTL 15888 micros
Class 834 micros
Class is 19 times faster!

So, back to the moderation. Nobody is going to notice if a couple of buttons are rendered in 1 microsecond instead of 15 microseconds, that’s for sure. But if your entire page has, over time, been rewritten to do absolutely everything using FTL, and the page ends up taking a second to load, then a 15x speed increase would get that page out in less than a tenth of a second.

I ran these tests with Freemarker 2.3.10 and then again with 2.3.12 and got the same results, so the optimization they added in 2.3.11 doesn’t seem to affect this type of simple usage ( using maps ).

Here is a screenshot showing the intense framework supporting the Freemarker template engine. Notice how the pure Java class ( highlighted in blue ) just goes about its business directly.

Freemarker Benchmark - 2.3.10 - Using FTL vs raw Java

So, back to the two reasons to use an FTL template.

  1. You want to have an easily editable version of your dynamic view.
  2. You want the representation to be easily readable to future developers.

Well, in regards to #1, I don’t think we really need to have an easily editable version of our <button> tag. I mean, really, a button’s a button’s a button. It might keep growing for a while, gathering more attributes and complexity over time, but I don’t think we’ll ever need to plug in a custom version for a different client.

As for #2, I don’t think the java code is really that much more difficult to read, and since we won’t be needing to change this (because it’s a button, for christ’s sake!) template that often, why take the speed hit here?

Here, btw, is the code for ButtonTemplate.java ( w/o Javabean boilerplate )
————————————————————————-
[source:Java]
public void render( Writer out ) throws IOException {
    out.write( "<button" );
    if ( id != null ) writeAttribute( out, "id", id );
    if ( name != null ) writeAttribute( out, "name", name );
    if ( value != null ) writeAttribute( out, "value", value );
    if ( accessKey != null ) writeAttribute( out, "accesskey", accessKey );
    if ( className != null ) writeAttribute( out, "class", className );
    if ( style != null ) writeAttribute( out, "style", style );
    if ( title != null ) writeAttribute( out, "title", title );

    if ( href != null || onClick != null ) {
        out.write( " onclick=\"" );
        if ( href != null ) {
            out.write( "document.location.href='" );
            out.write(href);
            out.write("';");
        }
        if ( onClick != null ) {
            out.write( onClick );
        }
        out.write( "\"" );
    }

    if ( onDblClick != null ) writeAttribute( out, "ondblclick", onDblClick );
    if ( onMouseOut != null ) writeAttribute( out, "onmouseout", onMouseOut );
    if ( type != null ) {
        writeAttribute( out, "type", type );
    } else {
        writeAttribute( out, "type", "button" );
    }
    if ( disabled ) {
        writeAttribute( out, "disabled", "disabled" );
    }

    out.write( " > <div> <div> " );

    if ( icon != null ) {
        out.write( icon );
        out.write( " " );
    }
    out.write( "<span> " );
    out.write( content );
    out.write( " </span> </div> </div> </button>" );
}

[/source]
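The render() method above leans on a writeAttribute() helper that isn’t shown. Here is a hypothetical version that would produce output in the shape of the sample HTML earlier in this post ( a leading space, then name="value" ). It is a guess at the helper rather than the original code, it does no escaping of the value, and the AttributeWriter class and its render() convenience method are made up so the sketch can run on its own.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

final class AttributeWriter {
    /** Writes a leading space followed by name="value". The value is not escaped. */
    static void writeAttribute(Writer out, String name, String value) throws IOException {
        out.write(" ");
        out.write(name);
        out.write("=\"");
        out.write(value);
        out.write("\"");
    }

    /** Convenience for trying it out: renders a single attribute to a String. */
    static String render(String name, String value) {
        StringWriter out = new StringWriter();
        try {
            writeAttribute(out, name, value);
        } catch (IOException e) {
            // StringWriter keeps everything in memory, so this is not expected to happen.
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("<button" + render("id", "buttonId") + render("type", "button") + " >");
    }
}
```

Anything user-supplied would need HTML escaping before being written this way, in the Java version and the FTL version alike.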
Now, I mentioned the torrid love affair we’re in at work. Well, we are using a whole lot more FTL to simply render HTML elements : inputs, labels, selects, buttons, checkboxes, creditCards, datePickers, dropDowns, addresses ( which use the previous elements ) and even style and javascript includes. So, in conclusion, since Freemarker is the artsy smartsy way to do things and it will be used wherever and whenever possible once it takes root – I implore you to only use it where/when it is truly needed. That is, for large views that are truly complex and for views that will be frequently customized. For the simple cases, it’s just not worth the hit.

Time to go refactoring…perhaps…

Afterthought: If only the Freemarker templates would compile into Java bytecode, then this would probably be a nonfactor. JSPs can do it. Is anyone up to the task?

Categories: Benchmarks, Open Source Tags:

C3P0 vs DBCP – The Straight Dope

November 12th, 2007 deevis 5 comments

I’ve found it very difficult to find any accurate comparisons of C3P0 and DBCP. Why should I choose C3P0 over DBCP? When might DBCP be better?

Well, here’s a fairly in-depth overview of both with pros and cons and similarities and differences included.

The first thing to note is that if you’re not in a multi-threaded environment, then DBCP is going to be faster than C3P0 and will also use significantly fewer connections than C3P0. For example, using the default settings for each and sizing the pool to contain 50 connections at most, will yield results like the following single-threaded test.

Single-threaded Benchmark – 50,000 calls to getConnection()

Pool  Trials  MaxPoolSize  Connections Used  Seconds  Settings
DBCP  50,000  50           1                 5.18
C3P0  50,000  50           50                6.72     *numHelperThreads=3
C3P0  50,000  50           39                6.6      numHelperThreads=4
C3P0  50,000  50           27                6.45     numHelperThreads=5
C3P0  50,000  50           30                6.63     numHelperThreads=6

Suffice it to say that DBCP is obviously better suited to single-threaded applications with high load. Needing 50 connections in the pool to do the work where only 1 is needed is certainly not desired. This is the case with C3P0’s default setting of "numHelperThreads=3". Regardless of the environment you have ( single-threaded, multi-threaded ), I would always change this setting if you foresee high load on the program. I would always use at least "numHelperThreads=5". As a bit of background/explanation, C3P0 doesn’t actually make a connection available in the pool when it is checked in. Instead the HelperThreads will detect these and do the work to get them back in the pool. This is great in high-load, multi-threaded environments as it avoids blocking issues. But it inherently requires a much larger number of connections to provide the same functionality ( especially if you keep "numHelperThreads=3" ). Also, notice that DBCP is 20% faster regardless of how we tweak numHelperThreads.
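To illustrate why deferred check-in eats connections, here is a toy model of the behaviour described above. It is emphatically not C3P0’s implementation: connections are just numbered ints, the "helper" is a method call rather than a thread, and every name is made up. It only shows how a single caller can end up touching the whole pool when returned connections are not made available right away.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class AsyncCheckinPool {
    private final Deque<Integer> idle = new ArrayDeque<Integer>();    // ready for checkout
    private final Deque<Integer> pending = new ArrayDeque<Integer>(); // returned, waiting for a helper
    private final int maxSize;
    private int created = 0;

    AsyncCheckinPool(int maxSize) {
        this.maxSize = maxSize;
    }

    int checkOut() {
        if (idle.isEmpty() && created < maxSize) {
            return ++created;            // nothing idle yet, so open a new connection
        }
        if (idle.isEmpty()) {
            helperPass();                // pool is full: wait for the helper ( run inline here )
        }
        if (idle.isEmpty()) {
            throw new IllegalStateException("all connections are checked out");
        }
        return idle.pop();
    }

    /** Check-in only queues the connection; it is not reusable until a helper pass runs. */
    void checkIn(int connection) {
        pending.push(connection);
    }

    /** Stands in for the helper threads: moves returned connections back to the idle list. */
    void helperPass() {
        while (!pending.isEmpty()) {
            idle.push(pending.pop());
        }
    }

    /** Runs a single-threaded checkout/check-in loop and reports how many connections were opened. */
    static int simulate(int maxSize, int calls, boolean helperKeepsUp) {
        AsyncCheckinPool pool = new AsyncCheckinPool(maxSize);
        for (int i = 0; i < calls; i++) {
            int connection = pool.checkOut();
            pool.checkIn(connection);
            if (helperKeepsUp) {
                pool.helperPass();
            }
        }
        return pool.created;
    }

    public static void main(String[] args) {
        System.out.println("helper lags:     " + simulate(50, 50000, false) + " connections");
        System.out.println("helper keeps up: " + simulate(50, 50000, true) + " connections");
    }
}
```

In the model, a helper that always keeps up gets by with 1 connection, and one that never runs until the pool is exhausted uses all 50. The real numbers in the table above fall in between as numHelperThreads goes up.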

Score 2 points for DBCP. 2-0.

Next what if we run the same test with a smaller connection pool size – say of only 5 connections, and numHelperThreads=3 for C3P0?

The results are a bit counterintuitive ( to me at least ). DBCP at least manages to stay the same – very fast and reasonable ( needing just 1 connection in its pool ). But C3P0, which I expected to take longer because it previously needed all 50 connections, actually runs faster with a smaller pool : 6.25 seconds, down from 6.7 seconds. The lesson here is that C3P0 is slowing itself down with the HelperThreads having to manage all the extra connections to get them back in the pool.

Pool  Trials  MaxPoolSize  Connections Used  Seconds  Settings
DBCP  50,000  5            1                 5.18
C3P0  50,000  5            5                 6.18     *numHelperThreads=3

Score 1 point each. DBCP for making sense and C3P0 for speeding up.

3-1, DBCP in the lead.

Now for the good stuff – the multi-threaded tests. I’ll be varying the number of Threads which will be running.

Multi-threaded tests

Pool  Threads  MaxPoolSize  Connections Used  Seconds  Settings
DBCP  25       5            5                 90
DBCP  25       10           10                107
DBCP  25       25           25                167
DBCP  50       50           50                198
DBCP  50       100          50                207
C3P0  25       5            5                 156      *numHelperThreads=3
C3P0  25       10           10                142      *numHelperThreads=3
C3P0  25       25           25                137      *numHelperThreads=3
C3P0  50       50           50                252      *numHelperThreads=3
C3P0  50       100          100               252      *numHelperThreads=3
C3P0  50       100          100               269      numHelperThreads=6

Score yet another to DBCP. It is faster in every configuration except 25 threads with a pool of 25, where C3P0 wins ( 137 vs 167 seconds ). So, what is it about C3P0 that has everybody ( myself included ) using it and singing its praises? Multiple threads yield multiple points again for DBCP – 2 points awarded.

5-1, C3P0 will need a miracle comeback at this point.

I know, usually Connections aren’t just checked out and run with a small SQL statement and immediately returned like the Benchmarks I’ve been running. Maybe I need to put some delay in there to more closely mimic real world performance. I’m going to throw a new setting in here which is a delay in milliseconds that each call to getConnection() will endure before finishing with the connection. I’ll give it a shot with an extra 100ms of sleep time after each SQL statement is run.

Multi-threaded – 100ms of sleep time added

Pool  Threads  MaxPoolSize  Connections Used  Seconds  Settings
DBCP  50       25           25                9338
DBCP  50       10           10                20918
DBCP  100      50           50                9248
DBCP  0        50           50                0
DBCP  0        100          50                0
C3P0  50       25           25                9295     numHelperThreads=6
C3P0  50       10           10                20088    numHelperThreads=6
C3P0  100      50           50                9412     numHelperThreads=6
C3P0  0        50           50                0        *numHelperThreads=3
C3P0  0        100          100               0        *numHelperThreads=3
C3P0  0        100          100               0        numHelperThreads=6
Categories: Benchmarks, Open Source Tags:

Coming soon – lots of benchmarks

October 27th, 2007 deevis No comments

Webservices: Apache Axis vs XFire

Application Servers: Tomcat, Jetty, Glassfish, JBoss

Frameworks: Spring MVC, Wicket, Seam, Struts

Categories: Benchmarks Tags: