OutOfMemoryError when using ResultSet

 
tormod eriksen
Ranch Hand
Posts: 52
My problem:
I have an input text file of 1000 records (1000 lines), each consisting of a birthdate, a personal id number and a geographic location number.
I have to produce an output text file by joining the input records against a database table (Sybase ASE 12.5, accessed through the JConnect5 JDBC driver) of 5 000 000+ rows (most or all of the input records are found in this table), and for each input record add four more records with the same geographic location number and year of birth. The output should therefore consist of about 5000 records (lines).
I use JBuilder5 Enterprise.
Something that I have tried:
Use java.sql.*
Get a resultset from the database table
Statement statement = conn.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
ResultSet rs;
rs = statement.executeQuery("select datepart(year, birthdate), datepart(month, birthdate), datepart(day, birthdate), personalid, locationid from <table> order by locationid, birthdate");
This operation takes about nine minutes.
After a while of traversing the ResultSet an OutOfMemoryError is thrown.
I have outputted some memory values, by using Runtime methods totalMemory() and freeMemory(), during various stages:
Memory used after allocated rs: 1800520
Free memory after allocated rs: 718160
Memory used after processing two of the inputdata: 14874960
Memory used after processing four of the inputdata: 16731768
Memory used after processing six of the inputdata: 57160416
Then before eight inputdata have been processed:
Exception occurred during event dispatching:
java.lang.OutOfMemoryError
<<no stack trace available>>
I would very much appreciate tips and help towards any possible solution to this!
Any explanation of the workings of ResultSet that relates to this problem.
Ways to work around the problem.
Any other tips or solutions.
 
Kareem Qureshi
Ranch Hand
Posts: 102
Hi,
Try increasing the heap size to 128 MB. This can be done on the command line: java -ms128m -mx128m <classname>. Please let me know if you are still facing the problem.
Thanks
Kareem
 
tormod eriksen
Ranch Hand
Posts: 52
Thanks for your suggestion!
I've now tried setting the memory options; I used -Xms300m -Xmx300m.
This makes it process more rows, but still only enough to produce output for 66 of the 1000 input records.
Any other suggestions?
 
Kareem Qureshi
Ranch Hand
Posts: 102
Hi,
You say that the query takes nine minutes to execute; there could be some flaw in the SQL itself, but it is not clear from your SQL. There could be a cartesian product in the result. When you run the query outside of JDBC, how many rows does it return? Can you elaborate on your SQL statement? It is also not clear why you are setting the scroll-insensitive and concur-read-only options.
I just re-read your topic: it says you are expecting 1000 rows and only 66 are returned. Are you doing other SQL operations before this one?
Try indexing your table and use the indexes for retrieval, since a lot of rows are involved.
Please let me know about it.
thanks
Kareem
 
Kareem Qureshi
Ranch Hand
Posts: 102
Also, I didn't see any where clause. How many tables are involved?
Kareem
 
tormod eriksen
Ranch Hand
Posts: 52
Hi Kareem,
In my query
select datepart(year, birthdate), datepart(month, birthdate), datepart(day, birthdate), personalid, locationid from <table> order by locationid, birthdate
I select every row (I have no where clause) from a table of 5 000 000 rows, over a LAN. One table is involved.
I set the ResultSet.TYPE_SCROLL_INSENSITIVE and ResultSet.CONCUR_READ_ONLY properties because I go forward and backward in the ResultSet to find matches to the inputs, and I only read from it.
First I find the input in the ResultSet, then I look for matches after this record to write to the output file. After I am done with one such group, I look for the next input record, going forward or backward depending on the locationid and birthdate values of the input record and the current row in the ResultSet.
I am only doing this SQL operation in this method.
It seems the memory used increases for each row that is accessed through next(), and that a certain number of rows are fetched initially. Is it possible to free memory for rows I have already accessed in the ResultSet and won't access again?
I have tried to use where clauses in the query to decrease the number of rows fetched, and instead run quite a few smaller queries, but even a single such query took a lot of time as well.
Example of such a query:
select datepart(year, birthdate), datepart(month, birthdate), datepart(day, birthdate), personalid, locationid from <table> where locationid = <input_locationid> and datepart(year, birthdate) = <input_birthyear> order by locationid, birthdate
tormod
 
Ranch Hand
Posts: 1879
Usually the memory problem is outside of the JDBC code. Since the ResultSet does not retrieve all 5 000 000 rows into memory (only the amount specified by the fetch size), it is unlikely to be the cause of your memory error (but check your JDBC driver's known-bug list anyway). The likely cause is what you do with the 5 million rows you retrieve. If you store them all in a Vector, or keep creating objects in every loop iteration and holding on to them, that is usually the culprit. Make sure you clean up after yourself in every loop.
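For example, something along these lines keeps memory flat. This is only a sketch: it assumes an open Connection conn as in your snippet, keeps the <table> placeholder from your query, guesses at the column types, and uses a forward-only ResultSet, which only applies if you can do your matching in a single pass over the rows. The fetch size of 500 is just an illustration.
Statement stmt = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                      ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(500); // hint to the driver: fetch 500 rows per round trip
ResultSet rs = stmt.executeQuery(
    "select datepart(year, birthdate), datepart(month, birthdate), "
    + "datepart(day, birthdate), personalid, locationid "
    + "from <table> order by locationid, birthdate");
while (rs.next()) {
    // read each column into locals and write the output line right away;
    // do NOT add the row (or objects built from it) to a Vector or any
    // other collection, or the heap grows with every row
    int year = rs.getInt(1);
    int month = rs.getInt(2);
    int day = rs.getInt(3);
    String personalId = rs.getString(4);  // column types assumed here
    String locationId = rs.getString(5);
    // ... write year, month, day, personalId, locationId to the output file
}
rs.close();
stmt.close();
Even if you really do need to move backward, the same idea applies inside your matching code: drop references to rows as soon as you have written their output.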
Jamie
 
tormod eriksen
Ranch Hand
Posts: 52
Thanks for the comments!
I also thought it might be something else, but
I have tried to just walk through all the rows inside a loop:
boolean b = true;
System.out.println("before");
while(b){
rs.next();
}
System.out.println("after");
and the problem still applies while inside the above loop.
I also see in the driver documentation that when the ResultSet is scrollable, the memory used increases gradually as more rows are accessed. I didn't think this applied when I did some or all of the following:
- called setFetchSize(<small enough number>)
- called setMaxRows(<small enough number>)
- retrieved a SybCursorResultSet from executeQuery.
But I still get the same problem.
 
Uday Kumar
Greenhorn
Posts: 24
I didn't see any code to set b to false inside the loop. Did you miss it in your application, or did you just not post it here?
[ April 11, 2002: Message edited by: Uday Kumar ]
 
tormod eriksen
Ranch Hand
Posts: 52
I changed my code to this, with the same result:
System.out.println("before");
while(rs.next()){
}
System.out.println("after");
System.exit(0);
tormod
 
Paulo Salgado
Ranch Hand
Posts: 98
Tormod, I may have misunderstood your problem but here goes an idea.
Rewrite your code to do the following:
- create a PreparedStatement with a query to retrieve a ResultSet only for a single record in your input file
- open the input file
- read first record
- set the values from the input record in the placeholders of the PreparedStatement query
- execute it. You'll receive a ResultSet containing only the rows for that input record.
- for each row in the ResultSet do your processing and generate an output record
- read next record in the input file
- repeat previous 4 steps until the program reaches the end of the input file
The key to this idea is the query you are going to use to retrieve the rows for a particular input record, i.e., the predicate you are going to use for that. You probably are using one already when scrolling the ResultSet, but I couldn't tell from your explanations.
If the query for each record takes more time than you expect then try to add an index to the table based on the query predicate as Kareem Qureshi suggested in a previous post.
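A rough sketch of that loop follows. It assumes an open Connection conn, the usual java.sql.* and java.io.* imports, the <table> placeholder from earlier in the thread, numeric ids, and an input file name of "input.txt"; parseLocationId and parseBirthYear are hypothetical helpers for whatever your real record format is.
PreparedStatement ps = conn.prepareStatement(
    "select datepart(year, birthdate), datepart(month, birthdate), "
    + "datepart(day, birthdate), personalid, locationid "
    + "from <table> "
    + "where locationid = ? and datepart(year, birthdate) = ? "
    + "order by birthdate");

BufferedReader in = new BufferedReader(new FileReader("input.txt")); // file name assumed
String line;
while ((line = in.readLine()) != null) {
    int locationId = parseLocationId(line); // hypothetical helper for your record format
    int birthYear  = parseBirthYear(line);  // hypothetical helper for your record format

    ps.setInt(1, locationId); // assuming locationid is numeric
    ps.setInt(2, birthYear);

    ResultSet rs = ps.executeQuery(); // only the rows matching this input record
    while (rs.next()) {
        // build and write the output records for this input record here
    }
    rs.close();
}
in.close();
ps.close();
Because the statement is prepared once outside the loop, the server parses it only once, and each execution brings back just the handful of rows you actually need.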
Hope I understood the problem correctly and didn't say anything stupid.
Best regards.
 
tormod eriksen
Ranch Hand
Posts: 52
Hi.
Been away for some days now..
Thanks for your input!
I have tried something similar to what you suggest, but not with a prepared statement. I tried to get smaller result sets, to avoid the memory problems. The problem I ran into, though, was time. It's probably partly an index thing, but since the application I am working on is not meant to work against only a single table, the tables' indexes are not something I fully control. What's good is that this table is more than likely the one with the largest amount of data the app will stumble into.
I will try to test time used with a prepared statement as you suggest.
Thanks.
 
Paulo Salgado
Ranch Hand
Posts: 98
Hi Tormod.
Just adding something to your response. I may be wrong here so anybody please correct me where necessary.
Whether you use a small ResultSet or you scroll through the whole table, indexes will be necessary to improve the performance of your application. Whenever you need to find something and it is not the key for the table, an index for your search will make the DB life much easier.
You haven't felt the performance problem with the full ResultSet probably because you haven't got it working so far. I mean, after you solve your memory problem your next one will be performance.
So, if you manage to solve performance now with a small ResultSet, you don't need to solve the memory problem.
Best regards.
 
tormod eriksen
Ranch Hand
Posts: 52
There are nine nonclustered indexes on the table, and both of the columns I need indexes on are among them.
Can someone estimate the expected timeframe to get a result set like this:
select birthdate, personal_id, geographic_id from <table> where datepart(year, birthdate) = <birthyear> and geographic_id = <geo_id>
from a table with 5 million rows of personal data for persons born since 1900, with about 600 geographic ids?
 