matches will probably be slower, since every call compiles a java.util.regex.Pattern and creates a java.util.regex.Matcher in the background. Both equals and compareTo use a simple loop, and should therefore be faster.
That said, don't optimize prematurely, and go for what's natural. For checking if two strings are equal I would always use equals, never compareTo. Even if that one is faster (which may or may not be the case), the difference will be milliseconds at worst. I then prefer readability over micro-optimization.
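To make the three options concrete, here is a minimal sketch of what each call actually checks. The class and method names (CompareWays, byEquals and so on) are made up for illustration; only String.equals, String.compareTo, String.matches, Pattern.matches and Pattern.quote are real API. Note that matches treats its argument as a regular expression, so it is not a drop-in replacement for equals.

```java
import java.util.regex.Pattern;

public class CompareWays {
    // Plain value equality: the natural choice for "are these two strings equal?"
    static boolean byEquals(String a, String b) {
        return a.equals(b);
    }

    // compareTo is meant for ordering; it returns 0 when the strings are equal.
    static boolean byCompareTo(String a, String b) {
        return a.compareTo(b) == 0;
    }

    // matches() interprets its argument as a regex and compiles it on each call.
    static boolean byMatches(String a, String regex) {
        return a.matches(regex);
    }

    // Pattern.quote turns the argument into a literal pattern, so regex
    // metacharacters such as '.' lose their special meaning.
    static boolean byQuotedMatches(String a, String literal) {
        return Pattern.matches(Pattern.quote(literal), a);
    }

    public static void main(String[] args) {
        System.out.println(byEquals("abc", "a.c"));        // false
        System.out.println(byCompareTo("abc", "a.c"));     // false
        System.out.println(byMatches("abc", "a.c"));       // true: '.' matches any character
        System.out.println(byQuotedMatches("abc", "a.c")); // false
    }
}
```

So apart from any speed difference, matches can give a different answer than equals as soon as the second string contains regex metacharacters.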
Taking a pure guess, I would say that the regex is the slowest, because it has to compile the regex -- which is probably slower, by itself, than the comparison. As for the other two, I recommend writing test code that measures it.
Wow... Rob beat me by six seconds !!
Henry
I second that motion. For timing comparisons you should run each test for at least several thousand iterations. If possible you should get rid of JIT influences; I think Henry posted how to do that a few weeks ago.
Andraz Poje
Jan Cumps wrote:
Andraz Poje wrote:Is anybody interested to write a test code to measure comparison speed?
You?
Why not ;)
Result is:
1 0 0
How do I get a more precise result?
Rob Prime wrote: I think Henry posted how to do that a few weeks ago.
I probably did, but I don't mind doing it again...
1. Get rid of everything that is not relevant. Don't do the "if" or the "System.out", as it is the same in all three cases, so why bother measuring it?
2. Be careful with short strings, as it is likely that you are measuring the setup more than the actual comparison. Also, don't make the strings obviously unequal, as some of the options probably contain short-circuit code. Maybe measure two strings that start off the same but differ later.
3. Do it in a loop. Maybe do the operation a million times. And measure the million times as a single number.
4. Make sure you have tons of memory, and call System.gc() prior to taking the start time, just in case (to avoid taking a GC hit during the measurement).
5. Run it many many times -- meaning do the three tests in a loop over and over. This way, you can ignore the first few runs, in order to discount the JIT. This will also allow you to discount outliers caused by the GC too.
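The five points above could be put together roughly like this. It is only a sketch under some assumptions: the class name CompareBenchmark, the test strings, the round count and the sink field are my own choices, not code from this thread. The sink is there so the JIT cannot simply throw the loop away, and even so a JIT may still optimize more than you expect.

```java
public class CompareBenchmark {
    // Point 2: fairly long strings, same length, same prefix, different at the very end.
    static final String A = "the quick brown fox jumps over the lazy dog 1";
    static final String B = "the quick brown fox jumps over the lazy dog 2";

    // Written once after each loop so the measured work has a visible effect.
    static volatile int sink;

    static long timeMatches(int iterations) {
        boolean any = false;
        System.gc();                          // point 4: request a GC before starting the clock
        long start = System.nanoTime();       // clock is sampled outside the loop
        for (int i = 0; i < iterations; i++) {
            any |= A.matches(B);              // point 1: nothing but the call being measured
        }
        long elapsed = System.nanoTime() - start;
        sink = any ? 1 : 0;
        return elapsed;
    }

    static long timeEquals(int iterations) {
        boolean any = false;
        System.gc();
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            any |= A.equals(B);
        }
        long elapsed = System.nanoTime() - start;
        sink = any ? 1 : 0;
        return elapsed;
    }

    static long timeCompareTo(int iterations) {
        int acc = 0;
        System.gc();
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            acc += A.compareTo(B);
        }
        long elapsed = System.nanoTime() - start;
        sink = acc;
        return elapsed;
    }

    public static void main(String[] args) {
        int iterations = 1_000_000;           // point 3: a million calls measured as one number
        // Point 5: repeat the whole set; ignore the first few rounds (JIT warm-up).
        for (int round = 1; round <= 8; round++) {
            long m = timeMatches(iterations);
            long e = timeEquals(iterations);
            long c = timeCompareTo(iterations);
            System.out.printf("round %d: matches=%d ms, equals=%d ms, compareTo=%d ms%n",
                    round, m / 1_000_000, e / 1_000_000, c / 1_000_000);
        }
    }
}
```

Read the later rounds rather than the first ones, and treat a single odd round as a probable GC or scheduling outlier.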
Henry
Thanks....
Result:
279 9 13
No doubt, equals is the fastest.....Right?
Also, I recommend taking the start and end time out of the loop. There are always some inaccuracies when taking a clock sample, and doing it in the loop means that the inaccuracy is compounded a million times.
This, of course, means that you will now need three different loops -- instead of one big one.
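To see why the clock samples belong outside the loop, something like the following could be tried. It is a sketch with made-up names (ClockOverhead, timedInsideLoop, timedOutsideLoop) that times the same compareTo call both ways; the actual numbers depend on the machine and VM, so none are claimed here.

```java
public class ClockOverhead {
    // Written after each loop so the compared result is actually used.
    static volatile int sink;

    // Samples the clock twice per iteration, so the cost and jitter of
    // System.nanoTime() is added to the total on every pass.
    static long timedInsideLoop(String a, String b, int iterations) {
        long total = 0;
        int acc = 0;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            acc += a.compareTo(b);
            total += System.nanoTime() - start;
        }
        sink = acc;
        return total;
    }

    // Samples the clock once before and once after the whole loop.
    static long timedOutsideLoop(String a, String b, int iterations) {
        int acc = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            acc += a.compareTo(b);
        }
        long elapsed = System.nanoTime() - start;
        sink = acc;
        return elapsed;
    }

    public static void main(String[] args) {
        String a = "the quick brown fox jumps over the lazy dog 1";
        String b = "the quick brown fox jumps over the lazy dog 2";
        int iterations = 1_000_000;
        // Repeat a few times so the later rounds run JIT-compiled code.
        for (int round = 1; round <= 5; round++) {
            System.out.printf("round %d: inside=%d ns, outside=%d ns%n", round,
                    timedInsideLoop(a, b, iterations),
                    timedOutsideLoop(a, b, iterations));
        }
    }
}
```

If the "inside" figure comes out much larger, most of it is the clock being measured, not the comparison.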
Henry
I ran each test a couple of times. From this it would seem that == is several times faster than .equals(). I learned somewhere that strings should always be compared with .equals() in Java. I suspect the difference here comes from the way the compiler can optimize the loop, so this might not represent the performance you would find in real-world code.
This message was edited 1 time. Last update was at by Mike Thon
Comparing String objects with == might not do what you expect here. It checks whether the left and right sides of the == are the same object, not whether they hold the same string value.
Here is an overview of the valid comparison features of the Java language.
Regards, Jan
Istvan Kovacs
Mike Thon wrote:Why not use == for comparing strings?
As Jan has pointed out, you're comparing object references, not string values that way.
It may work in some cases, but is not reliable.
Why it may work:
- the compiler/VM reuses the same String object to represent the same literal value when it's used several times.
- the compiler performs String concatenation at compile time for expressions whose value it can figure out (constant expressions).
- Strings may be 'interned' (there is a 'canonical' representation for each String value that you can use to save memory - or leak memory, if you are not careful, see http://www.javamex.com/tutorials/memory/string_saving_memory.shtml)
Check out the following, and play around until you understand what's going on (or until you get utterly confused, in which case ask).
For fun, try to guess the output.
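The code originally posted here is not preserved, so the following is only a stand-in sketch of the three cases listed above. The class name StringIdentityDemo and the helpers sameObject and buildAtRuntime are made up for illustration. Cover the comments if you want to guess first.

```java
public class StringIdentityDemo {
    // Reference comparison: true only if both variables point at the same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    // Concatenating non-constant values happens at run time and creates a new String.
    static String buildAtRuntime(String prefix, String suffix) {
        return prefix + suffix;
    }

    public static void main(String[] args) {
        String literal = "hello";
        String sameLiteral = "hello";
        String folded = "hel" + "lo";               // constant expression, folded by the compiler
        String built = buildAtRuntime("hel", "lo"); // new object created at run time
        String copy = new String("hello");          // explicitly a new object

        System.out.println(sameObject(literal, sameLiteral));    // true: same literal, same object
        System.out.println(sameObject(literal, folded));         // true: compile-time concatenation
        System.out.println(sameObject(literal, built));          // false: different objects
        System.out.println(sameObject(literal, copy));           // false: different objects
        System.out.println(sameObject(literal, built.intern())); // true: canonical instance
        System.out.println(literal.equals(built));               // true: same value
        System.out.println(literal.equals(copy));                // true: same value
    }
}
```

The last two lines are the point: equals answers the question you are actually asking, whatever the objects happen to be.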
Istvan Kovacs
Andraz Poje wrote:
How do I get a more precise result?
Besides running many times and making sure you get the JIT compiler to actually compile your code before benchmarking, you may want to check out System.nanoTime().
R van Vliet
Istvan Kovacs wrote:
Andraz Poje wrote:
How do I get a more precise result?
Besides running many times and making sure you get the JIT compiler to actually compile your code before benchmarking, you may want to check out System.nanoTime().
Are there any VM implementations where System.nanoTime() isn't System.currentTimeMillis() * 1000? I have yet to run into any. Also, the granularity of System.currentTimeMillis() is so coarse and unpredictable that you need a LOT of iterations of naturally quick operations to get accurate timing information. You can avoid JIT-based inaccuracies by running your complete test several times within a single execution of your program, so you're basically doing:
equals test
regex test
compare test
equals test
regex test
compare test
equals test
regex test
compare test
etc.
Dismiss the first X as potential runs that were influenced by a lack of JIT compilation and off you go. Now...
All that said, this seems a rather theoretical exercise, because I sincerely doubt there are real-world programs where switching from compareTo() to equals() provides a significant performance improvement. Add to that the fact that equals() is the most obvious option anyway, and it should rarely be a problem.
By the way, the performance differences can be explained quite easily:
1) matches uses a regex, which needs to be compiled and then pattern-matched, as explained above.
2) equals generally performs better than compareTo because it has an early-out before it enters its inner loop that's almost always used, namely s1.length() != s2.length(). compareTo has to compute s1.charAt(i) - s2.charAt(i) up to the first differing character, regardless of string length. You will find different results if you always compare equal-length strings; more specifically, you will see that compareTo and equals perform similarly.
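The early-out in point 2 can be illustrated with simplified versions of both methods. This is a sketch of the idea only, not the actual JDK source; simpleEquals and simpleCompareTo are hypothetical names.

```java
public class EarlyOutSketch {
    // Simplified equals: different lengths are rejected before any character is read.
    static boolean simpleEquals(String s1, String s2) {
        if (s1.length() != s2.length()) {
            return false; // the early-out
        }
        for (int i = 0; i < s1.length(); i++) {
            if (s1.charAt(i) != s2.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    // Simplified compareTo: it has to walk the common prefix to find the first
    // differing character, because the result depends on that character.
    static int simpleCompareTo(String s1, String s2) {
        int limit = Math.min(s1.length(), s2.length());
        for (int i = 0; i < limit; i++) {
            int diff = s1.charAt(i) - s2.charAt(i);
            if (diff != 0) {
                return diff;
            }
        }
        // One string is a prefix of the other (or they are equal).
        return s1.length() - s2.length();
    }

    public static void main(String[] args) {
        String shortOne = "the quick brown fox";
        String longOne = "the quick brown fox jumps";
        System.out.println(simpleEquals(shortOne, longOne));    // false, decided by length alone
        System.out.println(simpleCompareTo(shortOne, longOne)); // negative, after scanning the shared prefix
    }
}
```

With equal-length inputs the length check no longer saves anything, and both loops do the same amount of work up to the first difference.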
Istvan Kovacs
R van Vliet wrote:Are there any VM implementations where System.nanoTime() isn't System.currentTimeMillis() * 1000? I have yet to run into any.
> java -version
java version "1.6.0_20"
Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
Java HotSpot(TM) Client VM (build 16.3-b01, mixed mode, sharing)
Running on 32-bit Windows XP.
Also tried on Linux (32-bit, uname -a: 2.6.31-21-generic #59-Ubuntu SMP Wed Mar 24 07:28:56 UTC 2010 i686 GNU/Linux), nanos was not a multiple of some power of 10.
BTW, the multiplier between milli and nano is 1,000,000, not 1,000.
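For the conversion, java.util.concurrent.TimeUnit saves you from counting zeros. A small sketch follows; the class name NanoMillis and the helper elapsedMillis are made up. Keep in mind that System.nanoTime() has an arbitrary origin and is only meaningful as a difference between two readings, so it would not be a multiple of the wall-clock value in any case.

```java
import java.util.concurrent.TimeUnit;

public class NanoMillis {
    // Converts the difference between two System.nanoTime() readings to whole milliseconds.
    static long elapsedMillis(long startNanos, long endNanos) {
        return TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
    }

    public static void main(String[] args) {
        System.out.println(TimeUnit.MILLISECONDS.toNanos(1)); // 1000000

        long start = System.nanoTime();
        long end = System.nanoTime();
        System.out.println("elapsed ns: " + (end - start));
        System.out.println("elapsed ms: " + elapsedMillis(start, end));
    }
}
```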