
FileWriter Being Slow

 
pradeep selvaraj
Ranch Hand
Posts: 62
Hi All,

I have this piece of code.




The PL.txt file is about 110 MB, and this program takes about 190 seconds to finish the task.

We have tried this on many of our own systems, and depending on the processor speed we get relatively consistent times.

But on our client's machines (which are much faster than our test machines) the same program takes about 2 hours. There is no error or anything; it's just too slow.

This happens only when we append to files. A normal file copy takes about the same time there as it does here.

My question is: could appending with FileWriter depend on something in the machine, like the hard disk speed or something else, that causes the program to run so differently on different machines?

Thanks for your help





 
Bill Cruise
Ranch Hand
Posts: 148
Pradeep,

The speed of the disk will come into play with your current implementation. The reason is that you're creating a new BufferedWriter for every line of the input file, which is very inefficient. Create the BufferedWriter before the while loop, and close it after the while loop is complete. This allows the VM to use the buffer to decide when to write to the physical disk.

Reply back and let us know how it works.
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
It's writing to a different file every time through the loop. Part of the input line becomes part of the output filename. So you can't move the file open and close out of the loop. This is just a big job for a disk system!
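
To make the cost concrete, here is a minimal sketch of the pattern described above, not the code from the original post. The class name and the `fileNameFor` rule (text before the first comma plus ".txt") are assumptions made for illustration; the real program derives the name from the line in its own way.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

final class AppendPerLineSketch {

    // Hypothetical rule: the text before the first comma names the output file.
    static String fileNameFor(String line) {
        int comma = line.indexOf(',');
        return (comma < 0 ? line : line.substring(0, comma)) + ".txt";
    }

    // Reads every line and appends it to the file chosen by fileNameFor.
    static void split(BufferedReader in, File outDir) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            File target = new File(outDir, fileNameFor(line));
            // Open in append mode, write one line, close: repeated for every input line.
            try (BufferedWriter out = new BufferedWriter(new FileWriter(target, true))) {
                out.write(line);
                out.newLine();
            }
        }
    }
}
```

With one open and one close per input line, the run time is dominated by how quickly the OS and disk handle those operations rather than by the amount of data written.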

On the chance that you're getting a lot of latency in open and close operations, I'd try splitting this out to several threads. Try a variety of thread counts. You might max out the CPU and cut down the overall time.

Are you comfortable with the thread pooling introduced in Java 5? The code would look something like:

MyFileWriter would be a Runnable that has the code to open, write and close a file. This could wind up putting all the lines in the queue very quickly and letting the file updates run in the background, if that's a good scenario. It should be a relatively quick experiment. Let us know if you try it.
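
A minimal sketch of that experiment, under some assumptions: `MyFileWriter` is the Runnable described above, while the outer class name, the `appendAll` helper and the comma-based `fileNameFor` rule are made up for illustration. One caveat: if several lines go to the same file, tasks may run in any order, so the line order inside a file is not guaranteed.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class PooledAppendSketch {

    // Hypothetical rule: the text before the first comma names the output file.
    static String fileNameFor(String line) {
        int comma = line.indexOf(',');
        return (comma < 0 ? line : line.substring(0, comma)) + ".txt";
    }

    /** Opens one file in append mode, writes one line, and closes it. */
    static final class MyFileWriter implements Runnable {
        private final File target;
        private final String line;

        MyFileWriter(File target, String line) {
            this.target = target;
            this.line = line;
        }

        @Override
        public void run() {
            try (BufferedWriter out = new BufferedWriter(new FileWriter(target, true))) {
                out.write(line);
                out.newLine();
            } catch (IOException e) {
                // Rethrown unchecked so the failure reaches the worker thread's
                // uncaught-exception handler instead of being silently dropped.
                throw new UncheckedIOException(e);
            }
        }
    }

    // Queues one task per line and waits for the pool to drain.
    // Returns false if the wait timed out before all tasks finished.
    static boolean appendAll(List<String> lines, File outDir, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String line : lines) {
            pool.execute(new MyFileWriter(new File(outDir, fileNameFor(line)), line));
        }
        pool.shutdown(); // no new tasks; already queued tasks still run
        return pool.awaitTermination(1, TimeUnit.HOURS);
    }
}
```

Trying `appendAll` with 2, 4, 8 or more threads on the slow machine would show fairly quickly whether open/close latency is the bottleneck.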

Oh, and another thought ... can you replace the zillion-file design with something more efficient? Or are you stuck with it forever?
[ June 05, 2007: Message edited by: Stan James ]
 
Bill Cruise
Ranch Hand
Posts: 148
Stan, you're right. I missed the fact that they were all different files in the tiny code font. Your solution should work a lot better given that constraint.
 
pradeep selvaraj
Ranch Hand
Posts: 62
Thanks guys...
Yeah, the problem was in the opening and closing. I am OK with threads, but I tried this option and got amazing results.

It takes about 15 seconds now, down from 190, on our test machine, and there is not much difference in speed at the client site either.
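
The idea is to open each output file only once and keep its writer in a map until the whole input has been read. A minimal sketch of that approach, not the exact code from this post: the class name, the `closeAll` helper and the comma-based `fileNameFor` rule are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

final class CachedWritersSketch {

    // Hypothetical rule: the text before the first comma names the output file.
    static String fileNameFor(String line) {
        int comma = line.indexOf(',');
        return (comma < 0 ? line : line.substring(0, comma)) + ".txt";
    }

    // Appends every input line to its file, opening each file at most once.
    static void split(BufferedReader in, File outDir) throws IOException {
        Map<String, BufferedWriter> writers = new HashMap<>();
        try {
            String line;
            while ((line = in.readLine()) != null) {
                String name = fileNameFor(line);
                BufferedWriter out = writers.get(name);
                if (out == null) {
                    // First line for this file: open it and remember the writer.
                    out = new BufferedWriter(new FileWriter(new File(outDir, name), true));
                    writers.put(name, out);
                }
                out.write(line);
                out.newLine();
            }
        } finally {
            closeAll(writers);
        }
    }

    // Closes (and thereby flushes) every cached writer; rethrows the first failure.
    static void closeAll(Map<String, BufferedWriter> writers) throws IOException {
        IOException first = null;
        for (BufferedWriter out : writers.values()) {
            try {
                out.close();
            } catch (IOException e) {
                if (first == null) {
                    first = e;
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }
}
```

Every distinct output file stays open until the end, so the number of distinct file names has to stay below whatever open-file limit the OS imposes.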




I guess the disk speed at my client's site is very bad; otherwise my first implementation would have run in the same time there as well.

Problem solved.
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
Ah! You know your data better than we do. You have multiple lines per file, so caching writers makes sense. Now I might worry about how many files are open at the same time. There are limits that vary widely by OS. If that's a problem you could try to get all lines into memory - maybe a List per file stored in a Map keyed by name - and write them out one file at a time. And then I could talk about threads again. Heh heh.

BTW: With your current cache of writers you can get the values from the map and iterate them. I bet you could eliminate the "cust" ArrayList. See Map.values()
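
If the open-file limit does become a problem, the in-memory alternative mentioned above (a List per file stored in a Map keyed by name) might look roughly like this. It is a sketch under assumptions: the class name and the comma-based `fileNameFor` rule are made up for illustration, and it assumes the whole input fits in the heap.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupThenWriteSketch {

    // Hypothetical rule: the text before the first comma names the output file.
    static String fileNameFor(String line) {
        int comma = line.indexOf(',');
        return (comma < 0 ? line : line.substring(0, comma)) + ".txt";
    }

    static void split(BufferedReader in, File outDir) throws IOException {
        // Pass 1: collect the lines for each output file in memory.
        Map<String, List<String>> linesByFile = new LinkedHashMap<>();
        String line;
        while ((line = in.readLine()) != null) {
            linesByFile.computeIfAbsent(fileNameFor(line), k -> new ArrayList<>()).add(line);
        }

        // Pass 2: write one file at a time, so only one file is open at any moment.
        for (Map.Entry<String, List<String>> entry : linesByFile.entrySet()) {
            File target = new File(outDir, entry.getKey());
            try (BufferedWriter out = new BufferedWriter(new FileWriter(target, true))) {
                for (String l : entry.getValue()) {
                    out.write(l);
                    out.newLine();
                }
            }
        }
    }
}
```

This trades memory for file handles: each file is opened exactly once, but a 110 MB input may need a larger heap than the default.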
[ June 06, 2007: Message edited by: Stan James ]
 