I need to read a very large comma-separated file (.txt or .csv, depending on the requirement) and want to be able to skip records that have already been processed. For example, if for some reason the file is only processed halfway, I need to be able to set the counter to the position from which processing should resume the next time.
If I read the file using SQL (in one example I saw, strSQL = "select * from " + filepath was executed and the result set was traversed), I would be able to move the record counter easily using the ResultSet methods.
I wanted to know if anyone has used this approach and whether it is an efficient way of reading a large comma-separated file.
If anyone has a better idea or can point out any flaws in the above approach, that would be nice.
I have not used a CSV/JDBC driver, but I would think that there is a significant overhead. Working directly on the file level might be more performant. There are ready-made helper classes like the Ostermiller CSV class which help with the reading of the file.
Sounds like you want random access to the file (e.g., skip first half) and my inclination would be to use a RandomAccessFile. Memory mapped file facilities in Java are excellent.
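To make the RandomAccessFile idea concrete, here is a minimal sketch, assuming the position to resume from is stored as a byte offset (not a record number) between runs. The class ResumableCsvReader, its Batch holder and the readFrom method are hypothetical names made up for this illustration, and the sketch assumes the file is UTF-8 with one record per line.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from any library: reads up to maxRecords lines
// starting at a saved byte offset and reports where to resume next time.
class ResumableCsvReader {

    /** Lines read in one pass plus the byte offset at which to resume. */
    static final class Batch {
        final List<String> lines = new ArrayList<>();
        long nextOffset;
    }

    static Batch readFrom(String path, long startOffset, int maxRecords) throws IOException {
        Batch batch = new Batch();
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            raf.seek(startOffset); // jump past the records already processed
            String raw;
            while (batch.lines.size() < maxRecords && (raw = raf.readLine()) != null) {
                // readLine() maps each byte to one char, so re-decode the bytes
                // as UTF-8 (assumption: the file is UTF-8 encoded).
                byte[] bytes = raw.getBytes(StandardCharsets.ISO_8859_1);
                batch.lines.add(new String(bytes, StandardCharsets.UTF_8));
            }
            batch.nextOffset = raf.getFilePointer(); // persist this value between runs
        }
        return batch;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ResumableCsvReader <file> <startOffset>
        Batch batch = readFrom(args[0], Long.parseLong(args[1]), 1000);
        System.out.println("Read " + batch.lines.size()
                + " records, resume at byte " + batch.nextOffset);
    }
}
```

One caveat: RandomAccessFile.readLine() is unbuffered and reads a byte at a time, so for a really large file it may be worth seeking first and then reading through a buffered stream or a memory-mapped buffer, at the cost of tracking the offset yourself.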
I have done some trial and error on this. I tested the Ostermiller utility, a while loop with StringTokenizer, and a while loop with my own token separator.
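For anyone comparing the same options, here is a rough sketch of what a hand-rolled separator loop might look like next to the StringTokenizer version. This is not the code from the test above; FieldSplitDemo and both method names are invented for illustration, and neither method handles quoted fields that contain commas. One difference worth knowing: StringTokenizer treats consecutive delimiters as a single delimiter, so empty fields silently disappear.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Illustrative only: two simple ways to split one CSV line.
class FieldSplitDemo {

    // Hand-rolled splitter: keeps empty fields, ignores quoting rules.
    static List<String> splitKeepingEmpty(String line, char sep) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        int idx;
        while ((idx = line.indexOf(sep, start)) >= 0) {
            fields.add(line.substring(start, idx));
            start = idx + 1;
        }
        fields.add(line.substring(start)); // last field (may be empty)
        return fields;
    }

    // StringTokenizer version: adjacent separators collapse, so empty fields are lost.
    static List<String> splitWithTokenizer(String line, String sep) {
        List<String> fields = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, sep);
        while (st.hasMoreTokens()) {
            fields.add(st.nextToken());
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "1001,,Smith";
        System.out.println(splitKeepingEmpty(line, ',').size());  // 3 fields, middle one empty
        System.out.println(splitWithTokenizer(line, ",").size()); // 2 fields
    }
}
```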
I checked the time and the free memory.
Time was measured by taking the system time before and after processing and then taking the difference. Free memory was measured by calling Runtime.getRuntime().freeMemory() before and after processing.
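A measurement along those lines could be sketched as below. MemoryProbe and its methods are hypothetical names for this illustration. The sketch uses totalMemory() minus freeMemory() instead of freeMemory() alone, because the JVM can grow the heap during the run, which changes the free figure even when the code under test holds on to nothing. The result is still only a rough indication: System.gc() is just a hint, and garbage that has not been collected yet counts as used.

```java
// Illustrative only: rough timing and heap-usage measurement around a task.
class MemoryProbe {

    // Heap currently in use: what the JVM has claimed minus what is still free in it.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Runs the task and returns { elapsedMillis, usedHeapDeltaBytes }.
    static long[] measure(Runnable task) {
        System.gc(); // only a hint; the JVM may ignore it
        long memBefore = usedHeapBytes();
        long t0 = System.nanoTime();
        task.run();
        long elapsedMs = (System.nanoTime() - t0) / 1_000_000;
        long memAfter = usedHeapBytes();
        return new long[] { elapsedMs, memAfter - memBefore };
    }

    public static void main(String[] args) {
        // Stand-in workload; replace with the file-processing code being compared.
        long[] result = measure(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i).append(',');
            }
        });
        System.out.println("Elapsed ms: " + result[0] + ", heap delta bytes: " + result[1]);
    }
}
```

Running each variant several times and comparing the peak figures rather than a single before/after difference tends to give a steadier picture.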
My comma-separated files are going to be huge, so I don't want logic that takes up a lot of memory. Is this free-memory figure a proper measure of whether a lot of memory is being used? Is it foolproof in all scenarios?
I feel the utility actually requires a lot of memory. A lot of memory gets allocated, and that results in a bigger difference when the free-memory diff is calculated.
Does anyone have good experience with the utility from a performance point of view?