Math.sqrt() 1000x slower on Solaris than Win?

Bill Compton
Ranch Hand

Joined: Aug 26, 2000
Posts: 186
We're tracking down a performance problem and it appears that Math.sqrt() is about one thousand times slower on Solaris than Windows. Test code and JVM versions are below. Anyone heard of this or know a good solution?

JVM on Windows is:
$ java -version
java version "1.5.0_10"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_10-b03)
Java HotSpot(TM) Client VM (build 1.5.0_10-b03, mixed mode, sharing)

JVM on Solaris is:
$ java -version
java version "1.4.2_08"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_08-b03)
Java HotSpot(TM) Client VM (build 1.4.2_08-b03, mixed mode)

Test code:
[ February 08, 2007: Message edited by: Bill Compton ]
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Interesting. Doesn't sound familiar. I don't have a Solaris box; a quick test on Windows indicates this isn't a difference between JDK 1.4 and JDK 1.5 - at least not on the Windows side. The sqrt is ultimately implemented in native code though, so implementations could be very different. The bug database didn't seem to have any good match to what you describe - though it may well be worth looking through more carefully than I did.

I'm not familiar with Solaris architecture, but on Windows the sqrt would probably be evaluated by a math coprocessor (if available), right? So this could be some hardware difference between the two machines. It might be useful to run some other math benchmarks on the two machines. Is the Solaris box always a thousand times slower doing math? Or is it just for sqrt()?

It might also be nice to hear if these results hold up on other Solaris boxes, maybe with other JDKs. Does anyone here have one handy? Please try it and let us know how sqrt() performs.

One possible workaround is to replace Math.sqrt(a) with Math.exp(Math.log(a)/2). This would normally be slower and less accurate, but it sure shouldn't be a thousand times slower, so maybe it will help. Or if exp() and log() are similarly slow on the Solaris box, but basic math operators are not, you could also try your own implementation of Newton's method - I think that's (still?) the standard way to implement a sqrt() internally. Hmmm, maybe not - Wolfram lists a few alternates; I'm not sure offhand which is fastest or most accurate. But if the problem is just with the sqrt() function, then any of those methods might be worth a try.
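
For illustration only, here is a minimal sketch of the two fallbacks mentioned above: the exp/log identity and a plain Newton iteration. The class and method names are made up for this example, the code is not taken from any JDK, and it has not been timed on the Solaris machine in question, so whether it actually helps there is an open question.

```java
/** Illustrative sketch only: two software fallbacks for Math.sqrt(). Names are hypothetical. */
final class SqrtWorkarounds {

    /** Uses sqrt(a) = exp(log(a) / 2); typically slower and less accurate than Math.sqrt(). */
    static double sqrtViaExpLog(double a) {
        return Math.exp(Math.log(a) / 2.0);
    }

    /** Newton's method on f(x) = x*x - a, using only addition, multiplication and division. */
    static double sqrtViaNewton(double a) {
        if (Double.isNaN(a) || a < 0.0) {
            return Double.NaN;
        }
        if (a == 0.0 || Double.isInfinite(a)) {
            return a;
        }
        // Start at or above the true root so the iterates decrease toward it.
        double x = (a >= 1.0) ? a : 1.0;
        while (true) {
            double next = 0.5 * (x + a / x);
            if (next >= x) {
                return x; // no further progress at double precision
            }
            x = next;
        }
    }

    public static void main(String[] args) {
        double a = 1024.0;
        System.out.println("Math.sqrt: " + Math.sqrt(a));
        System.out.println("exp/log:   " + sqrtViaExpLog(a));
        System.out.println("Newton:    " + sqrtViaNewton(a));
    }
}
```

The Newton version starts from a deliberately crude guess, so it needs many iterations for very large or very small inputs; a better initial estimate would cut that down.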
[ February 08, 2007: Message edited by: Jim Yingst ]

"I'm not back." - Bill Harding, Twister
Keith Lynn
Ranch Hand

Joined: Feb 07, 2005
Posts: 2367
I have a Solaris box running Solaris 7, and JDK 1.5.0_06. On my Windows box I have JDK 1.5.0_02.

I modified your code slightly to get an estimate on the amount of time it takes.



On the Windows box, it took approximately 2156 milliseconds, and on the Solaris box, it took approximately 11654 milliseconds. The machine itself is rather old and slow, and so some of the time that it took may have been due to that instead of the Math.sqrt.
Tim Cao
Ranch Hand

Joined: Jul 26, 2004
Posts: 37
Any chance a JVM optimization kicked in? The variable a is loop-invariant and thus b is too. Maybe the Windows JVM does this and the Solaris one doesn't.


Originally posted by Bill Compton:
for(int i=0;i<iterations;i++)
{
    double b;
    b = Math.sqrt(a);
    if ( b == 1.23455 )
    {
        System.out.println(b);
    }
}
[ February 08, 2007: Message edited by: Bill Compton ]
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Interesting idea, Tim. To do that, HotSpot would also have to analyze the sqrt() code and determine that it had no side effects - and considering it's native code, it would be fairly difficult to implement something like that, I think. In any event, modifying the code to generate a different value of a on each iteration does not seem to have an appreciable effect on the running time on Windows. So it appears that this optimization is not occurring on Windows.
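
A rough sketch of that kind of check, assuming a hypothetical SqrtTiming class (this is not the original test code from the thread): the argument changes on every iteration and the results are summed and printed, so the call cannot simply be hoisted out of the loop or thrown away. It uses System.currentTimeMillis() so that it should also compile on a 1.4 JVM.

```java
/** Illustrative timing sketch; not the original test program from this thread. */
final class SqrtTiming {

    /** Sums sqrt(1) through sqrt(iterations) so that every call uses a different argument. */
    static double sumOfRoots(int iterations) {
        double sum = 0.0;
        for (int i = 1; i <= iterations; i++) {
            sum += Math.sqrt(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        int iterations = 10000000; // arbitrary count chosen for this sketch

        // Untimed warm-up pass so the JIT has a chance to compile the loop first.
        sumOfRoots(iterations);

        long start = System.currentTimeMillis();
        double sum = sumOfRoots(iterations);
        long elapsed = System.currentTimeMillis() - start;

        // Printing the sum keeps the computed values observable.
        System.out.println("sum = " + sum + ", elapsed ms = " + elapsed);
    }
}
```

Running the same class on both machines, ideally with the same JDK version, would give a more comparable pair of numbers than a loop over a constant.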
Stu Thompson
Hooplehead
Ranch Hand

Joined: Jun 14, 2006
Posts: 136
Hi,

Exactly what CPU are you running Solaris on?

I found this thread very fascinating, went out on 'the google', and found this post: http://forum.java.sun.com/thread.jspa?messageID=9384302&tstart=0#9384302

In a nutshell, the UltraSPARC T1 processor's FPU does *not* include a sqrt instruction. From the link: "computing a sqrt would take 25000 cycles...When executing a sqrt the programs traps into the kernel to emulate the instruction."

25000 cycles seems like a lot, and it is on the same couple-of-orders-of-magnitude scale as your problem. Maybe this is the root (haha) of the problem?

Stu


"This is not to say that design is unnecessary. But after a certain point, design is just speculation." --Philip Chu
Bill Compton
Ranch Hand

Joined: Aug 26, 2000
Posts: 186
Hmmm... uname -a reports:
... sun4v sparc SUNW,Sun-Fire-T200
Does that tell us what processor it is?
Stefan Wagner
Ranch Hand

Joined: Jun 02, 2003
Posts: 1923

I don't know how similar Solaris is to Linux.
On Linux I do a
to get more information on my cpu.

The program needs 1.2 s for a=1024 on my 2.0 GHz Centrino with Linux on the first invocation - 0.7 s when started multiple times.
But I don't have a parallel solaris-installation to compare.

But you could compare it to an ad hoc C/C++ translation:

compile:

On my system: first invocation: 2.0 s, multiple invocations: 0.9 s.
I don't like the speed of native code, but I would like to have those ...
cout << "Statements";
[ April 11, 2007: Message edited by: Stefan Wagner ]

http://home.arcor.de/hirnstrom/bewerbung
Stu Thompson
Hooplehead
Ranch Hand

Joined: Jun 14, 2006
Posts: 136
Originally posted by Bill Compton:
Hmmm... uname -a reports:
... sun4v sparc SUNW,Sun-Fire-T200
Does that tell us what processor it is?


I think this is the problem!

Wikipedia has a decent breakdown of Sun Fire servers...and from The Register, it seems that T200 was the initial name for what is now the T2000.

The Sun Fire T2000 has an UltraSPARC T1 CPU. (This chip is also known as Niagara and is notable for having multiple cores running multiple threads.)

If you google 'niagara fpu' you will see many results regarding floating point performance issues.
[ April 11, 2007: Message edited by: Stu Thompson ]
Kevin Mangold
Greenhorn

Joined: Jan 13, 2005
Posts: 18
sqrt() is much more processor intensive than raising a number to a power. Try to square all your data so you can get rid of the need for sqrt(). Many 3D game engines do this to help increase performance.

Brief example, if you are trying to find the distance between two points:
distance = Math.sqrt( Math.pow(x2 - x1, 2) + Math.pow(y2 - y1, 2) );
Get rid of the sqrt() by working with the squared distance instead, for example by comparing it against a squared threshold.

Unless, of course, you need it to be the square root.
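
A small sketch of that idea, with hypothetical names chosen for this example: when the distance is only compared against a threshold, both sides can be squared and the square root never has to be computed.

```java
/** Illustrative sketch: comparing squared distances avoids calling Math.sqrt(). */
final class SquaredDistance {

    /** Returns the squared Euclidean distance between (x1, y1) and (x2, y2). */
    static double distanceSquared(double x1, double y1, double x2, double y2) {
        double dx = x2 - x1;
        double dy = y2 - y1;
        return dx * dx + dy * dy;
    }

    /** True if the two points are at most radius apart; assumes radius is non-negative. */
    static boolean withinRadius(double x1, double y1, double x2, double y2, double radius) {
        return distanceSquared(x1, y1, x2, y2) <= radius * radius;
    }

    public static void main(String[] args) {
        System.out.println(distanceSquared(0.0, 0.0, 3.0, 4.0));
        System.out.println(withinRadius(0.0, 0.0, 3.0, 4.0, 5.0));
    }
}
```

This only works when the ordering or a threshold test is all that matters; any code that needs the actual distance value still has to take the root.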
 