File APIs for Java Developers
Manipulate DOC, XLS, PPT, PDF and many others from your application.
http://aspose.com/file-tools
The moose likes Java in General and the fly likes Odd issue - any ideas? Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Java » Java in General
Bookmark "Odd issue - any ideas?" Watch "Odd issue - any ideas?" New topic
Author

Odd issue - any ideas?

Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7794
    
  21

I've run into a very odd performance issue with an enum I've just written. I've added the thread here because this is not a question about improving performance, but more about what people think the compiler (or Eclipse, which I'm using to test) might be doing.

The enum (called Bits) contains a bunch of bit-based utilities for each integer type, so obviously performance is quite important. I've therefore written a test module that runs each method a billion times in order to get an idea of how long they take.

One such method is called bit(index), which returns an int with a bit set at the supplied index, adjusted for the type on which it's called (assumed to be in the right-hand end of the int).
So, BYTE.bit(4) returns an int with bit 28 set and SHORT.bit(4) returns an int with bit 20 set. Obviously, INT.bit(4) returns an int with bit 4 set.

There is also a static final version of this method, called only(int), which basically replicates the function of INT.bit(), but doesn't have to do any index adjustment.

Hopefully you follow me so far. Now to my problem:

First, I would have assumed that only() would be slightly faster than bits(), but if I simply execute Bits.only() or Bits.INT.bit() in test code, the first takes, on average, about 9ns and the 2nd about 7ns.
BUT, if I include the following statement:
Bits iBits = Bits.INT;
the first now takes 1.1 ns, and the 2nd 1.5!

I don't even use the field, and yet it makes both methods run much faster, and also behave as I would expect.

Does anyone have any idea what might be going on here?

Winston

PS: I'm happy to include the code if people think it will help; but I'm not sure that it'll add anything to what I've described.


Isn't it funny how there's always time and money enough to do it WRONG?
Articles by Winston can be found here
Jesper de Jong
Java Cowboy
Saloon Keeper

Joined: Aug 16, 2005
Posts: 14150
    
  18

It's hard to say. Maybe it's in your code, or maybe it's in the code that is doing the measurement. The JVM and JIT do many sophisticated optimizations, which can lead to surprising results when doing (micro-)benchmarks of Java code. Can you post your code, and the code that you use to run the test? Maybe there's something that you're overlooking in there that someone else might spot.


Java Beginners FAQ - JavaRanch SCJP FAQ - The Java Tutorial - Java SE 7 API documentation
Scala Notes - My blog about Scala
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    
    6

I'm with Jesper on the "provide an SSCCE" part (but then, you knew I would be). However, one might speculate that, among that hoodoo magic optimization that Jesper mentioned, having the variable there results in Bint.INT being preloaded and cached, and later references to Bit.INT being recognized as that same value, whereas without it, for whatever reason, it's not noticed that all those Bits.INT reference the same thing, and so it's looked up each time.

Just a little Friday mornin' guessin'.
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7794
    
  21

Jesper de Jong wrote:Can you post your code, and the code that you use to run the test?

Ooof. OK, but I'll have to do a lot of pruning to give you an SSCCE for the Bits class.
Here's the code for the test module though (sorry for delay; phone):the time (System.nanoseconds()) is only taken when s.start() and s.stopMultiple() are executed, and the difference is divided by times. I've used the Stopwatch class for a lot of testing, so I'm pretty confident that (a) it works, and (b) it affects overall timings minimally.

With line 16 included, timings are:
Event: bit(15)
Number of timings: 1000000000
Total elapse time: 1.129430440s
Total active time: 1.129430440s
Avg. active time: 1.129ns

Event: Bits.bit(15)
Number of timings: 1000000000
Total elapse time: 1.514076280s
Total active time: 1.514076280s
Avg. active time: 1.514ns

Event: Bits.only(15)
Number of timings: 1000000000
Total elapse time: 1.140658400s
Total active time: 1.140658400s
Avg. active time: 1.141ns


With it commented out, timings are:
Event: bit(15)
Number of timings: 1000000000
Total elapse time: 1.101924640s
Total active time: 1.101924640s
Avg. active time: 1.102ns

Event: Bits.bit(15)
Number of timings: 1000000000
Total elapse time: 7.143473600s
Total active time: 7.143473600s
Avg. active time: 7.143ns

Event: Bits.only(15)
Number of timings: 1000000000
Total elapse time: 8.945839840s
Total active time: 8.945839840s
Avg. active time: 8.946ns


Hope it helps. I'll include the "pruned" version as soon as I can.

Winston
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7794
    
  21

Jeff Verdegan wrote:However, one might speculate that, among that hoodoo magic optimization that Jesper mentioned, having the variable there results in Bint.INT being preloaded and cached, and later references to Bit.INT being recognized as that same value, whereas without it, for whatever reason, it's not noticed that all those Bits.INT reference the same thing, and so it's looked up each time.

Just a little Friday mornin' guessin'.

Oddly enough, that was my thought too (obviously great minds think alike), but it doesn't explain why it would affect the static method, nor why that one should take longer to execute without it.

It's no great worry (Knuth et al - and 9ns is still pretty darn fast ), but it does puzzle me.

Winston
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    
    6

The test looks about right to me. I certainly don't see anything to suggest other than the mysterious hand-wavy "explanation" I previously gave. I'd be very surprised if posting the Bits enum changes that.

Maybe somebody smart will have a more concrete idea though.
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    
    6

Winston Gutkowski wrote:
Jeff Verdegan wrote:However, one might speculate that, among that hoodoo magic optimization that Jesper mentioned, having the variable there results in Bint.INT being preloaded and cached, and later references to Bit.INT being recognized as that same value, whereas without it, for whatever reason, it's not noticed that all those Bits.INT reference the same thing, and so it's looked up each time.

Just a little Friday mornin' guessin'.

Oddly enough, that was my thought too (obviously great minds think alike), but it doesn't explain why it would affect the static method, nor why that one should take longer to execute without it.


I could imagine that, with or without the magical line, the static method gets inlined (after 10k or so executions), so that ultimately, both the static and non-static tests are executing the same code. Then, if the "caching" comes in on top of that...
Jesper de Jong
Java Cowboy
Saloon Keeper

Joined: Aug 16, 2005
Posts: 14150
    
  18

You could try disassembling the bytecode with javap -c to see if there is any significant difference between the versions with and without the line.
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7794
    
  21

Jeff Verdegan wrote:I'm with Jesper on the "provide an SSCCE" part

OK, here's my "pruned down" Bits class (still quite a mouthful, but a LOT smaller than the original).

I also ran it against the same test code and it produces the same results, so it does seem to be some intrinsic optimization:Any opinions would be gratefully received.

Winston
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7794
    
  21

Jesper de Jong wrote:You could try disassembling the bytecode with javap -c to see if there is any significant difference between the versions with and without the line.

Good thinking. Any idea how to get Eclipse to run 'javap' for you?

Winston
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19693
    
  20

What's wrong with using the command line?

I can perhaps understand why the first block (Bits.bit) takes longer without the Bits iBits = Bits.INT; line - the loading of class Bits is moved from before the time measurements to within them. That doesn't explain (at least to me) why the second block (Bits.only) takes longer as well. The class is already loaded.


SCJP 1.4 - SCJP 6 - SCWCD 5 - OCEEJBD 6
How To Ask Questions How To Answer Questions
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7794
    
  21

Rob Spoor wrote:What's wrong with using the command line?

Yes of course; it's not like a compile. I'll give it a bash.

That doesn't explain (at least to me) why the second block (Bits.only) takes longer as well. The class is already loaded.

Seems we're thinking along the same lines. I'll report back if javap reveals anything startling; although my bytecode instruction knowledge is very sketchy.

Winston
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: Odd issue - any ideas?