Pulling millions of unique values from Oracle?

 
Greenhorn
Posts: 5
I'm rewriting an app that pulls a dozen data items for about six million rows in an Oracle database. About half of these are strings, and a majority are unique (physical addresses and 10-digit phone numbers).
I've been profiling my memory usage, and the garbage collector gets me back down to a fairly constant size every time it runs, but it gets behind at the threshold, which means it allocates a little extra space before clearing the memory, and the heap gets bigger every cycle.
Is there a way of pulling them (or streaming them) directly into a StringBuffer, to prevent the string pool from getting enormous from all the unique values? I've been trying getAsciiStream and getCharacterStream but not having much luck.
Alternatively, is there a way to tune the GC so that the heap isn't reallocated before the collection runs?
Has anyone run into a problem like this before? Am I barking up the wrong tree?
Thanks in advance,
Bill
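For readers following the streaming question above, here is a minimal sketch of draining a character stream into a single reusable buffer. It assumes the column is read with `ResultSet.getCharacterStream`, which returns a `java.io.Reader`; a `StringReader` stands in for the database here so the example runs on its own. The class name, the `readInto` helper, and the sample rows are hypothetical, and `StringBuilder` is used in place of the `StringBuffer` mentioned in the post because no synchronization is needed in a single-threaded loop.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class StreamColumnSketch {

    // Hypothetical helper: empties the buffer, then copies everything
    // the reader produces into it, and closes the reader afterwards.
    static void readInto(Reader in, StringBuilder buffer) {
        buffer.setLength(0);               // reuse the same buffer for every row
        char[] chunk = new char[256];      // scratch space for one read() call
        try (Reader r = in) {
            int n;
            while ((n = r.read(chunk)) != -1) {
                buffer.append(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringBuilder buffer = new StringBuilder(64);
        // Assumption: in the real loop each Reader would come from
        // rs.getCharacterStream(...) on the current row.
        String[] fakeRows = {"12 Main St", "555-010-0199"};
        for (String row : fakeRows) {
            readInto(new StringReader(row), buffer);
            System.out.println(buffer.length() + " chars: " + buffer);
        }
    }
}
```

The buffer is only a win if the row can be processed straight from it; calling `toString()` on every row creates a new `String` each time, which brings back the per-row allocation.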
 
Ranch Hand
Posts: 156
Let me see if I understand what you're doing. You're running a database query that returns millions of rows, looping through the result set, and doing something with each row in turn (printing a mailing label or something). The key point is that once you finish an iteration of the loop, you're done with that row. So you'd expect the memory usage to have a sawtooth pattern, dropping down to the same constant low point each time garbage collection completes. But instead each low point is a little higher than the previous low point. Is that correct?
 
Bill Gathen
Greenhorn
Posts: 5
The bottom of the sawtooth is constant. It's not a textbook memory leak, where something is not being dereferenced and lives until the program dies.
It's the *top* of the sawtooth that I'm concerned with. If the initial allocated heap is 2 meg, it seems to let the objects pile up until just below 2 meg, then run gc. The used memory drops drastically, then starts building back up.
The core problem seems to be that between the time the gc decides to run again and the time it actually starts freeing memory, the main thread has added a couple more objects (running past 2 meg), so the JVM has to allocate more space for the heap. Now the allocated heap space is bigger, so it goes longer before gc'ing the next time, with the same lag problem increasing the size yet again. Repeat x,000 times and the heap has gotten very large.
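One way to watch the pattern described above is to log used versus allocated heap from inside the loop. This is only a sketch: the class name, the helper methods, and the fake per-row work are hypothetical stand-ins for the real result-set loop. It relies on the standard `Runtime.totalMemory()` and `Runtime.freeMemory()` calls.

```java
class HeapSawtoothSketch {

    // Bytes currently occupied by objects, including garbage not yet collected.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Bytes the JVM has reserved for the heap right now; this is the
    // figure that would creep upward if the heap keeps being enlarged.
    static long allocatedBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    public static void main(String[] args) {
        StringBuilder sink = new StringBuilder();
        for (int row = 0; row < 1_000_000; row++) {
            // Stand-in for per-row work: one short-lived, unique string.
            String value = "addr-" + row;
            sink.setLength(0);
            sink.append(value);
            if (row % 100_000 == 0) {
                System.out.printf("row=%d used=%d KB allocated=%d KB%n",
                        row, usedBytes() / 1024, allocatedBytes() / 1024);
            }
        }
    }
}
```

If the allocated figure climbs while the post-collection used figure stays flat, that matches the description above. On HotSpot JVMs the `-Xms` and `-Xmx` options set the initial and maximum heap size, so starting the JVM with both set to the same value is one way to stop the heap from being resized; whether that suits this workload is an assumption to test rather than a given.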
 