IO Performance problem

Bill Compton
Ranch Hand

Joined: Aug 26, 2000
Posts: 186
I have a Java program doing some I/O that is about 2x slower than the corresponding C code, and I would like help optimizing the Java version. The C version runs in an average of 64 seconds and the Java version in about 127, not counting the time for the JVM to load and start. The task is to read many (~6000) small text files from a given directory. The salient part of the Java code is:

The text files look like:
AREA_NAME_GOES_HERE
40.67281150817,22.93920917511
40.23754310607,22.93920917511
40.23754310607,22.50393657684
40.67281150817,22.50393657684
END
In case it matters, I'm using Java version "Classic VM (build JDK-1.2-V, native threads)" and Borland's freebie C++ compiler version 5.5 on Win 2000 with plenty of memory (224). To get consistent runtimes, I'm testing each just after a reboot to ensure they're not penalized by other stuff running or benefiting from files cached in memory. Suggestions...?
paul wheaton
Trailboss

Joined: Dec 14, 1998
Posts: 20639

I would start by making it multi threaded. Maybe ten threads running at once. There is a lot of time sucked up with opening a file.
Next, rather than using the StringTokenizer stuff, I would use the much faster String.indexOf(',');
Next, string comparison stuff is pretty slow. For Java as well as C. I hate while loops that contain a lot of stuff. And I hate seeing code duplicated outside of the loop for initialization.
Closing your buffered reader will close all the other file stuff.
try this:
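The code posted at this point is not preserved in this copy of the thread. As a rough sketch of the suggestions above (not the poster's actual code), splitting each coordinate line with `String.indexOf(',')` and closing only the outer `BufferedReader` might look like the following. The class and method names are hypothetical, and the try-with-resources syntax assumes a much newer Java than the JDK 1.2 discussed here.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not the code originally posted here.
class AreaParser {

    /** Splits an "x,y" line at the comma with indexOf instead of StringTokenizer. */
    static double[] parsePoint(String line) {
        int comma = line.indexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("No comma in line: " + line);
        }
        return new double[] {
            Double.parseDouble(line.substring(0, comma)),
            Double.parseDouble(line.substring(comma + 1))
        };
    }

    /** Reads one area: a name line, then coordinate lines up to END. */
    static List<double[]> readArea(BufferedReader in) throws IOException {
        List<double[]> points = new ArrayList<>();
        in.readLine(); // skip the area-name line
        String line;
        while ((line = in.readLine()) != null && !line.equals("END")) {
            points.add(parsePoint(line));
        }
        return points;
    }

    /** Opens a file; closing the BufferedReader also closes the FileReader. */
    static List<double[]> readFile(String path) throws IOException {
        try (BufferedReader in = new BufferedReader(new FileReader(path))) {
            return readArea(in);
        }
    }
}
```

The loop condition does the read and the END check in one place, so nothing has to be duplicated before the loop for initialization.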



Bill Compton
Ranch Hand

Joined: Aug 26, 2000
Posts: 186
Thanks for the suggestions and code, Sheriff. The leaner string / parsing stuff whittled a little off the time -- down to an average of 124 from 127 seconds. I'll have a go at multi-threading. Should be interesting; haven't done threads in Java yet -- good chance to explore that area.
paul wheaton
Trailboss

Joined: Dec 14, 1998
Posts: 20639

Heavy I/O is the primary reason we have threads in Java. Check out the books listed at the top of the threads forum.
What O/S and VM are you using? I wonder if the VM you are using is clunky.

Bill Compton
Ranch Hand

Joined: Aug 26, 2000
Posts: 186
I'm using Java version "Classic VM (build JDK-1.2-V, native threads)" and Borland's freebie C++ compiler version 5.5 on Win 2000 with plenty of memory (224).
The first (crude) whack at a multi-threaded loader has shaved some more off the time. It's down to ~93 seconds from ~124 using 10 threads. Now that I understand the basics, I'm going to refactor my initial implementation to clean up the architecture and hopefully further reduce the time. I'll report back on that in a day or so.
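For readers who want to try the same idea, here is a minimal sketch of a loader that reads a directory with a fixed number of worker threads. It is not the loader described in this post: `ParallelLoader` and `loadAll` are hypothetical names, and it relies on `java.util.concurrent` and `java.nio.file`, which did not exist in JDK 1.2 and would have to be replaced there with hand-started `Thread` objects.

```java
import java.io.File;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; assumes a modern JDK, not the JDK 1.2 used in the thread.
class ParallelLoader {

    /** Reads every regular file in dir using a pool of the given size. */
    static List<List<String>> loadAll(File dir, int threads)
            throws InterruptedException, ExecutionException {
        File[] files = dir.listFiles();
        if (files == null) {
            throw new IllegalArgumentException("Not a directory: " + dir);
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per file so opens and reads can overlap.
            List<Future<List<String>>> pending = new ArrayList<>();
            for (File f : files) {
                if (f.isFile()) {
                    Callable<List<String>> task = () -> Files.readAllLines(f.toPath());
                    pending.add(pool.submit(task));
                }
            }
            // Collect results in submission order.
            List<List<String>> results = new ArrayList<>();
            for (Future<List<String>> future : pending) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling `ParallelLoader.loadAll(new File("areas"), 10)` would mirror the 10-thread setup mentioned above, where `areas` is a placeholder directory name. The best thread count depends on the disk and OS, so it is worth measuring a few values.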
paul wheaton
Trailboss

Joined: Dec 14, 1998
Posts: 20639

So is your VM the Sun VM?
I suspect that the VM could do a bit more optimizing with I/O stuff.
I suppose that increasing the buffer size won't make much difference since the file sizes are already so small?
Bill Compton
Ranch Hand

Joined: Aug 26, 2000
Posts: 186
Uh, dunno. How can I tell? java -version just says:
java version "1.2"
Classic VM (build JDK-1.2-V, native threads)
Can you recommend where I can get a better VM?
Yeah, bigger buffer definitely seems unlikely to help. Most files are just 6 lines long.
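One way to check the VM from inside a program, as a small sketch: the standard system properties `java.version`, `java.vm.name` and `java.vm.vendor` report the runtime version, the VM implementation and who supplied it. The class name `VmInfo` is made up for this example.

```java
// Hypothetical helper that reports which VM the program is running on.
class VmInfo {

    /** Joins the version, VM name and VM vendor properties into one line. */
    static String describe() {
        return System.getProperty("java.version") + " / "
             + System.getProperty("java.vm.name") + " / "
             + System.getProperty("java.vm.vendor");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The exact strings differ between vendors and releases, so the output is best read by eye rather than parsed.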
paul wheaton
Trailboss

Joined: Dec 14, 1998
Posts: 20639

Mine says:
java version "1.2.2"
Classic VM (build JDK-1.2.2-W, native threads, symcjit)
And I know that this is the Sun VM. The "jit" part on mine should give a huge amount of optimization.
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Also try jdk 1.3, which uses HotSpot by default, which usually speeds things up nicely.


"I'm not back." - Bill Harding, Twister
Bill Compton
Ranch Hand

Joined: Aug 26, 2000
Posts: 186
I loaded jdk 1.3 and that made a big difference. Between this change and the others (multiple threads, simpler string stuff) the Java code is down to the same runtime as the (unoptimized) C version. This is probably good enough for our purposes. Thanks for all the helpful suggestions!
Jack Shirazi
Author
Ranch Hand

Joined: Oct 26, 2000
Posts: 96
I would guess that the C code doesn't use two-byte characters, nor String objects. For this kind of task, converting bytes into chars and creating multiple String objects both impose significant overheads. Once again (see the 'speed of Integer' thread), Java provides you with the ability to get maximum speed, but to do so your code ends up looking very similar to the C code. It depends on whether the speed is more important than using good object-oriented coding.
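To make the trade-off concrete, here is a hedged sketch of the C-like style: it parses the coordinates straight from the file's bytes, with no char conversion and no String objects. `ByteCoordParser` is a hypothetical name, and the sketch assumes ASCII input in the format shown earlier, plain decimals with no exponent, and that any line starting with `E` is the END marker.

```java
import java.util.Arrays;

// Hypothetical sketch of byte-level parsing; assumes ASCII input like the sample file.
class ByteCoordParser {

    /** Parses a plain decimal such as 40.67281150817 from buf[start, end). */
    static double parseDecimal(byte[] buf, int start, int end) {
        if (start >= end) throw new NumberFormatException("empty number");
        boolean negative = buf[start] == '-';
        long mantissa = 0;
        double scale = 1.0;
        boolean seenPoint = false;
        for (int i = negative ? start + 1 : start; i < end; i++) {
            byte b = buf[i];
            if (b == '.' && !seenPoint) {
                seenPoint = true;
            } else if (b >= '0' && b <= '9') {
                mantissa = mantissa * 10 + (b - '0');
                if (seenPoint) scale *= 10.0; // one power of ten per fraction digit
            } else {
                throw new NumberFormatException("unexpected byte at index " + i);
            }
        }
        double value = mantissa / scale;
        return negative ? -value : value;
    }

    /** Returns x0, y0, x1, y1, ... for the lines between the name line and END. */
    static double[] parseArea(byte[] buf) {
        double[] out = new double[16];
        int n = 0;
        int pos = nextLine(buf, 0); // skip the area-name line
        while (pos < buf.length && buf[pos] != 'E') {
            int eol = lineEnd(buf, pos);
            int comma = pos;
            while (comma < eol && buf[comma] != ',') comma++;
            if (comma == eol) throw new NumberFormatException("no comma in line");
            if (n + 2 > out.length) out = Arrays.copyOf(out, out.length * 2);
            out[n++] = parseDecimal(buf, pos, comma);
            out[n++] = parseDecimal(buf, comma + 1, eol);
            pos = nextLine(buf, pos);
        }
        return Arrays.copyOf(out, n);
    }

    /** Index of the first CR or LF at or after pos, or buf.length. */
    private static int lineEnd(byte[] buf, int pos) {
        while (pos < buf.length && buf[pos] != '\r' && buf[pos] != '\n') pos++;
        return pos;
    }

    /** Index just past the next LF, or buf.length if there is none. */
    private static int nextLine(byte[] buf, int pos) {
        while (pos < buf.length && buf[pos] != '\n') pos++;
        return Math.min(pos + 1, buf.length);
    }
}
```

The whole file would be read into one `byte[]` first and then handed to `parseArea`. Whether this is worth the loss of readability is exactly the judgment call described above.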
 