
Sorting a large CSV file

Tom Roffe
Greenhorn

Joined: Feb 10, 2004
Posts: 4
Hello all,
Just a quick one. This is my first post to the forum and I like what I'm seeing ;-). Anyway, I have a bit of a problem. I have a 180 MB CSV file and I need to sort it in name and post code (ZIP) order. I have a small app to sort it, and it works fine on small files, but when I try to sort the beast of a CSV it gives me

I also have a smaller CSV, around 20 MB, and that gives the same error. Does anyone have any suggestions? If so, they would be greatly appreciated.
CODE BELOW
Thanks in advance.
Tom Roffe
Dmitry Melnik
Ranch Hand

Joined: Dec 18, 2003
Posts: 328
Hi Tom. Here are a few things to try:
1. Run the virtual machine with the parameter "-Xmx1500m". It sets the maximum heap size for the VM; the default is too small for your task.
2. Optimize the memory usage of your task. For instance, instead of inserting the raw strings read from the input file into the list, insert arrays of strings (the results of split), and update your comparator and your result file output code accordingly.
3. You might want to sort your data by inserting helper objects that represent your input strings into a sorted collection (like TreeSet). You will need to take care of duplicate keys, though.
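Suggestion 2 could look roughly like the sketch below. Tom's code is not shown in the thread, so everything here is an assumption made for illustration: the class name CsvRowSorter, the column positions (name in column 0, post code in column 1), and the plain comma split, which does not handle quoted fields that contain commas. It also uses a newer Java API (Comparator.comparing, List.sort) than was available when this thread was written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class CsvRowSorter {
    // Hypothetical column layout: adjust to match the real file.
    static final int NAME_COL = 0;
    static final int POSTCODE_COL = 1;

    // Compares two already-split rows by name, then by post code.
    static final Comparator<String[]> BY_NAME_THEN_POSTCODE =
            Comparator.comparing((String[] row) -> row[NAME_COL])
                      .thenComparing(row -> row[POSTCODE_COL]);

    // Splits every line once up front, so the comparator never has to
    // split a line again while the sort is running.
    static List<String[]> sortLines(List<String> lines) {
        List<String[]> rows = new ArrayList<>(lines.size());
        for (String line : lines) {
            // Naive split: assumes no quoted commas inside fields.
            rows.add(line.split(",", -1));
        }
        rows.sort(BY_NAME_THEN_POSTCODE);
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Smith,ZZ1", "Jones,AB2", "Smith,AA1");
        for (String[] row : sortLines(lines)) {
            // Re-join the fields when writing the result back out.
            System.out.println(String.join(",", row));
        }
    }
}
```

A list keeps rows whose name and post code are identical. A TreeSet built with the same comparator (suggestion 3) would silently drop them, which is the duplicate key issue mentioned above.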
Tom Roffe
Greenhorn

Joined: Feb 10, 2004
Posts: 4
BIG thanks, Dmitry Melnik, worked a treat.
One other problem has just presented its ugly self. The input file has some entries that are in CAPS and others that aren't. When the program sorts the list, the capitalized entries are sorted at the top of the list and the non-caps entries are sorted but at the end of the file.
Question: how can I make the sort process ignore the case of the file entries?
Dmitry Melnik
Ranch Hand

Joined: Dec 18, 2003
Posts: 328
Before you start sorting, convert the strings you compare to the same case (with toLowerCase() or toUpperCase()).
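As an alternative to converting the data itself, the comparison alone can ignore case, so the entries keep their original capitalization in the output. A minimal sketch, assuming the values to sort are plain strings; the class and method names are made up for illustration, while String.CASE_INSENSITIVE_ORDER is a standard comparator in java.lang.String:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CaseInsensitiveSort {
    // Returns a sorted copy; the strings themselves are left unchanged,
    // only the comparison ignores upper/lower case.
    static List<String> sortIgnoringCase(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(String.CASE_INSENSITIVE_ORDER);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("adams", "BROWN", "Jones", "SMITH");
        System.out.println(sortIgnoringCase(names));
    }
}
```

With the default String ordering the same input would place "adams" last, because uppercase letters sort before lowercase ones.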
 