The speed of the disk will come into play with your current implementation, because you're creating a new BufferedWriter for every line of the input file. That's very inefficient. Create the BufferedWriter before the while loop and close it after the loop completes. That lets the VM use the buffer to decide when to write to the physical disk.
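A minimal sketch of that suggestion, assuming a single output file (the class name and file names here are made up for illustration):

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class SingleWriterCopy {
    public static void main(String[] args) throws IOException {
        // make a tiny input file so the sketch is runnable on its own
        BufferedWriter seed = new BufferedWriter(new FileWriter("input.txt"));
        seed.write("line one");
        seed.newLine();
        seed.write("line two");
        seed.newLine();
        seed.close();

        BufferedReader in = new BufferedReader(new FileReader("input.txt"));
        // one writer for the whole run: the buffer decides when to touch the disk
        BufferedWriter out = new BufferedWriter(new FileWriter("output.txt"));
        String line;
        while ((line = in.readLine()) != null) {
            out.write(line);
            out.newLine();
        }
        out.close(); // flushes any remaining buffered data
        in.close();
    }
}
```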
It's writing to a different file every time through the loop. Part of the input line becomes part of the output filename. So you can't move the file open and close out of the loop. This is just a big job for a disk system!
On the chance that you're seeing a lot of latency in the open and close operations, I'd try splitting this out across several threads. Try a variety of thread counts. You might max out the CPU and cut down the overall time.
Are you comfortable with the thread pooling introduced in Java 5? The code would look something like this:
MyFileWriter would be a Runnable that has the code to open, write, and close a file. This could wind up putting all the lines in the queue very quickly and letting the file updates run in the background, if that suits your scenario. It should be a relatively quick experiment. Let us know if you try it.
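The original snippet didn't survive, but a sketch of that Java 5 thread-pool approach might look like this (MyFileWriter is named in the post; the pool size, file names, and line contents are assumptions):

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class MyFileWriter implements Runnable {
    private final String fileName;
    private final String line;

    public MyFileWriter(String fileName, String line) {
        this.fileName = fileName;
        this.line = line;
    }

    public void run() {
        try {
            // open in append mode, write one line, and close the target file
            BufferedWriter out = new BufferedWriter(new FileWriter(fileName, true));
            try {
                out.write(line);
                out.newLine();
            } finally {
                out.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // queue the writes quickly and let a small pool drain them in the background
        ExecutorService pool = Executors.newFixedThreadPool(4);
        pool.execute(new MyFileWriter("cust1.txt", "first line"));
        pool.execute(new MyFileWriter("cust2.txt", "second line"));
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```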
Oh, and another thought ... can you replace the zillion-file design with something more efficient? Or are you stuck with it forever? [ June 05, 2007: Message edited by: Stan James ]
A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Stan, you're right. I missed the fact that they were all different files in the tiny code font. Your solution should work a lot better given that constraint.
Thanks guys... Yeah, the problem was in the opening and closing. I'm fine with threads, but I tried this option and got amazing results.
It now takes about 15 seconds, down from 190, on our test machine, and there isn't much difference in speed at the client site either.
I guess the disk speed at my client's site is very bad... otherwise my first implementation would have run in the same time there as well.
Ah! You know your data better than we do. You have multiple lines per file, so caching writers makes sense. Now I might worry about how many files are open at the same time. There are limits that vary widely by OS. If that's a problem you could try to get all lines into memory - maybe a List per file stored in a Map keyed by name - and write them out one file at a time. And then I could talk about threads again. Heh heh.
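A minimal sketch of that in-memory approach, a List of lines per file stored in a Map keyed by file name (the "fileName|line" record format here is made up to stand in for the real input):

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupedWriter {
    public static void main(String[] args) throws IOException {
        // hypothetical input records: "fileName|line" pairs
        String[] records = { "a.txt|one", "b.txt|two", "a.txt|three" };

        // gather all lines in memory: one List per target file, keyed by name
        Map<String, List<String>> byFile = new HashMap<String, List<String>>();
        for (String record : records) {
            String[] parts = record.split("\\|");
            List<String> lines = byFile.get(parts[0]);
            if (lines == null) {
                lines = new ArrayList<String>();
                byFile.put(parts[0], lines);
            }
            lines.add(parts[1]);
        }

        // then open, write, and close each file exactly once
        for (Map.Entry<String, List<String>> e : byFile.entrySet()) {
            BufferedWriter out = new BufferedWriter(new FileWriter(e.getKey()));
            for (String line : e.getValue()) {
                out.write(line);
                out.newLine();
            }
            out.close();
        }
    }
}
```

This trades memory for open/close calls, so it only fits if the whole input fits in the heap.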
BTW: With your current cache of writers you can get the values from the map and iterate them. I bet you could eliminate the "cust" ArrayList. See Map.values() [ June 06, 2007: Message edited by: Stan James ]
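A quick sketch of what that looks like, assuming the cache is a Map from file name to BufferedWriter (the variable and file names are made up):

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class WriterCache {
    public static void main(String[] args) throws IOException {
        Map<String, BufferedWriter> writers = new HashMap<String, BufferedWriter>();
        writers.put("x.txt", new BufferedWriter(new FileWriter("x.txt")));
        writers.put("y.txt", new BufferedWriter(new FileWriter("y.txt")));
        writers.get("x.txt").write("hello");

        // no separate ArrayList needed: Map.values() hands back every cached writer
        for (BufferedWriter w : writers.values()) {
            w.close();
        }
    }
}
```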