JavaRanch » Java Forums » Databases » Object Relational Mapping

hibernate in a batch program suggestion

manish ahuja
Ranch Hand

Joined: Oct 23, 2003
Posts: 312
Hi All,

I have to run a batch program that processes 100k records.
The Java program fetches the 100k records from the database and processes each one. Once a record has been processed, the program updates it in the database with the processing status.
We use the Hibernate O/R mapping framework for all database interactions.

I am trying to figure out the best approach to tackle this. The first approach is to load all 100k records via a Hibernate Criteria list(), carry out the Java processing, and then update each record in the database with its status. The problem is that this runs into an out-of-memory error, because each record is fairly large and comprises several data columns.
The second approach I tried is to load only the columns needed for processing (using Hibernate Projections) for the 100k records. This solves the out-of-memory problem from the first approach, but there is a downside. Since I am only fetching selected columns, I end up with a partial Hibernate entity object that is not good enough for persisting back to the database. So after processing each record I have to reload the full Hibernate entity in order to carry out the update. This works, but it is a significant performance overhead considering there will be 100k database round trips.

Please suggest what would be a good approach for handling the aforementioned problem.

Thanks,
M
Cameron Wallace McKenzie
author and cow tipper
Saloon Keeper

Joined: Aug 26, 2006
Posts: 4968
    

I'm not even sure Hibernate is the right tool here. Batch processing can typically be done more effectively and efficiently with native database tools, ETL programs, and other options. Does it have to be an ORM tool?

-Cameron McKenzie
Christian Dillinger
Ranch Hand

Joined: Jul 20, 2009
Posts: 189
Do you really need all rows in memory at the same time? Can't you divide the data into logically connected parts that are independent of each other? How much memory do you have? How many fields does your table have, and how big are they? I think holding 100k rows shouldn't be a problem.
If you don't need everything in memory, you can use scrollable result sets. And don't forget to flush() and clear() the session from time to time (every 100 to 1000 rows).
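
A minimal sketch of that flush-and-clear pattern, written against plain Java so it runs without Hibernate on the classpath. `UnitOfWork` and `BatchScroller` are hypothetical names used only for illustration: `UnitOfWork` stands in for the `flush()` and `clear()` calls on a Hibernate `Session`, and the `Iterator` stands in for a forward-only `ScrollableResults` cursor. The batch size is an assumption you would tune.

```java
import java.util.Iterator;
import java.util.function.Consumer;

/** Hypothetical stand-in for the two org.hibernate.Session calls used here. */
interface UnitOfWork {
    void flush(); // push pending updates to the database
    void clear(); // evict already-processed entities from the session cache
}

final class BatchScroller {
    /**
     * Processes rows one at a time and flushes/clears every batchSize rows,
     * so the session never holds more than one batch of entities.
     * Returns the number of rows processed.
     */
    static <T> int processAll(Iterator<T> rows, UnitOfWork session,
                              int batchSize, Consumer<T> work) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int count = 0;
        while (rows.hasNext()) {
            work.accept(rows.next()); // business processing + status update
            if (++count % batchSize == 0) {
                session.flush();
                session.clear();
            }
        }
        // Flush the final partial batch, if there is one.
        if (count % batchSize != 0) {
            session.flush();
            session.clear();
        }
        return count;
    }
}
```

With real Hibernate the loop body would read each entity from the scrollable cursor instead of an `Iterator`; the counting and the periodic flush()/clear() stay the same.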
manish ahuja
Ranch Hand

Joined: Oct 23, 2003
Posts: 312
@Cameron
The batch functionality is an extension of our current on-demand Java application. The Java app contains the business processing needed for the records in the batch operation, so invoking the Java app from the batch is unavoidable.

@Christian
No, there is no need to have all 100k records in memory at the same time.
Another way around the memory problem in the first approach is to split the batch into smaller chunks, say 10k records at a time, and load and process each chunk in turn. This works well, but because each run limits the result set to 10k rows [criteria.setMaxResults(10000)], the batch utility has to be executed 10 or more times. In that case, how do I determine how many iterations the loop needs? Also, can you elaborate on the scrollable result set suggestion?
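
One possible way to size the chunk loop without knowing the row count up front is to keep fetching pages until one comes back short or empty. The sketch below is plain Java under stated assumptions: `PageSource` and `ChunkedBatch` are hypothetical names, and with Hibernate Criteria the `fetch` call could be backed by `criteria.setFirstResult(offset).setMaxResults(limit).list()`.

```java
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical page source. With Hibernate Criteria this could wrap
 * criteria.setFirstResult(offset).setMaxResults(limit).list().
 */
interface PageSource<T> {
    List<T> fetch(int offset, int limit);
}

final class ChunkedBatch {
    /**
     * Fetches and processes pages of pageSize rows until a page comes back
     * short or empty. Returns the number of non-empty pages processed.
     */
    static <T> int run(PageSource<T> source, int pageSize, Consumer<T> work) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        int pages = 0;
        int offset = 0;
        while (true) {
            List<T> page = source.fetch(offset, pageSize);
            if (page.isEmpty()) {
                break; // nothing left to process
            }
            pages++;
            for (T row : page) {
                work.accept(row);
            }
            if (page.size() < pageSize) {
                break; // short page: this was the last one
            }
            offset += pageSize;
        }
        return pages;
    }
}
```

This assumes the query returns rows in a stable order (for example, ordered by primary key). If the query filters on the same status column that processing updates, advancing the offset would skip rows, so in that case each page would have to be fetched from offset 0 instead.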



Benjamin Winterberg
Ranch Hand

Joined: Sep 19, 2007
Posts: 36
Take a look at this: http://docs.jboss.org/hibernate/stable/core/reference/en/html/batch.html

