
Memory Leak processing CSV with openCSV

 
Marc Cracco
Ranch Hand
I'm using openCSV to read through a file that has 500K rows. For each row it must query a table to see whether a record exists and, if it does, create a new record in another table.
I then write the result to a second CSV file, also using openCSV.

I seem to have a memory leak: within reading 100K records my app grows by a gig. I'm not sure if anyone has worked with openCSV and could maybe catch something obvious. I think I've cleared my two method calls, getAccountPartyIdByEmail() and addPartyToPartyGroup(), of being the culprits.

Profiling this app can suck and I hoped to be spared doing so....


 
Hebert Coelho
Ranch Hand
If you put this writer.flush(); in a finally, it may help.
 
Marc Cracco
Ranch Hand
Hebert Coelho wrote: If you put this writer.flush(); in a finally, it may help.


I can add it, but it never exits the loop: it runs out of heap before it gets past 20% of the file. It crashes while in the loop that starts on line 28.

EDIT: Also, the writer is being flushed after each iteration, so unless an exception is thrown (at which point the result file is useless anyway) I don't think I'd need to do it more.
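
For readers following the flush-in-finally suggestion, here is a minimal sketch of the same idea using try-with-resources. It is not the code from this thread: ResultWriterSketch and writeResults are hypothetical names, and a plain java.io.Writer stands in for openCSV's CSVWriter so the example needs only the JDK. It addresses whether the result file ends up complete, and is unlikely to change how the heap grows inside the loop.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.List;

// Hypothetical sketch: the class and method names are illustrative, not from the thread.
// A plain java.io.Writer stands in for openCSV's CSVWriter.
class ResultWriterSketch {

    // Writes one line per row and returns the number of rows written.
    static int writeResults(Writer target, List<String> rows) throws IOException {
        int written = 0;
        // try-with-resources closes the writer on normal exit and when an exception
        // is thrown; BufferedWriter.close() flushes whatever is still buffered.
        try (BufferedWriter out = new BufferedWriter(target)) {
            for (String row : rows) {
                out.write(row);
                out.newLine();
                written++;
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        StringWriter sink = new StringWriter();
        int n = writeResults(sink, List.of("a@example.com,added", "b@example.com,skipped"));
        System.out.println(n + " rows written");
    }
}
```

With this shape there is no need for an explicit flush() in a finally block, since closing the writer takes care of it.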
 
Paul Clapham
Sheriff
Marc Cracco wrote: I think I've cleared my two method calls from being the culprits. getAccountPartyIdByEmail() and addPartyToPartyGroup()


Well, I was going to point the finger at them. But obviously that's because I know nothing about them. Which is exactly your approach to the openCSV code, isn't it? So if you don't want to fire up the profiler, I think your alternative is to dig into the openCSV code and see what you can find. Me, I'd choose the profiler.

Edit: you have done some debugging, haven't you? It isn't the case that some glitch is parsing the whole file into a giant array the first time you hit line 28, or something like that?
 
Marc Cracco
Ranch Hand
Paul Clapham wrote:
Edit: you have done some debugging, haven't you? It isn't the case that some glitch is parsing the whole file into a giant array the first time you hit line 28, or something like that?


Yeah, I've stepped through the code at different iterations of the loop. You can see the footprint of the app grow as it iterates. It doesn't chew it all up in one go; the growth is incremental.

EDIT: footprint at start is 400MB; 50K lines in, 1GB; 100K lines in, 2GB and an OutOfMemoryError.... Heap death....
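
Figures like the ones above can also be collected from inside the loop rather than by watching the process footprint. The following is a rough sketch only: HeapProbe, usedHeapBytes and shouldLog are hypothetical names, and the byte arrays merely simulate whatever the real loop keeps alive. The reading is approximate because it includes garbage that has not been collected yet, but a steady climb every N rows points at something being retained per row.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names are illustrative and not taken from the thread.
class HeapProbe {

    // Approximate heap in use: memory the JVM has reserved minus the free part of it.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // True on every 'interval'-th row, so the logging itself stays cheap.
    static boolean shouldLog(long rowNumber, long interval) {
        return rowNumber > 0 && rowNumber % interval == 0;
    }

    public static void main(String[] args) {
        // Stands in for whatever the real per-row work holds on to.
        List<byte[]> retained = new ArrayList<>();
        for (long row = 1; row <= 50_000; row++) {
            retained.add(new byte[256]); // simulated per-row retention
            if (shouldLog(row, 10_000)) {
                System.out.printf("row %d: ~%d MB used%n", row, usedHeapBytes() / (1024 * 1024));
            }
        }
    }
}
```

Commenting out one per-row call at a time (the lookup, the insert, the CSV write) and rerunning with the same logging is a cheap way to narrow down which step is holding on to memory before reaching for a profiler.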
 