Java batch job heap memory space problem

 
Ranch Hand
Posts: 312
Hi All,

I am currently running into a Java heap space problem. I suspect the cause is the sheer volume of data we are trying to process in our Java batch utility program.
The batch program fetches a list of data objects from the underlying datastore using the Hibernate O/R mapping framework and holds them in a List object.
Once it has fetched this collection, it processes the objects one at a time; the business logic operates on each retrieved data object.
The problem is the number of objects retrieved, which is huge: over 200k. No matter how much I increase the heap size, it always falls short.

The order in which the data objects are processed is not important, and we can even process them in parallel, as there is no interdependency between any two data objects.

I would like to hear your thoughts on the best strategy for handling this scenario.

Regards,
 
Rancher
Posts: 600
Manish:

Is it possible to change the way you get your data from the data store? Instead of grabbing all 200K objects at once and then processing them, can you change it to grab 200, or maybe 2,000, process them, and then grab another batch?
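
To make the idea concrete, here is a minimal sketch of a fetch-and-process loop. The `PageFetcher` interface and the `BatchedProcessing` class are hypothetical names invented for this example, and the in-memory list only stands in for your datastore. With Hibernate, the fetcher could probably be backed by a query using `setFirstResult` and `setMaxResults`, with `Session.clear()` called between batches so processed entities can be garbage collected. The sketch assumes the query has a stable ordering and that processing does not change which rows the query matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BatchedProcessing {

    /** Hypothetical abstraction over the datastore: returns up to 'limit' items starting at 'offset'. */
    @FunctionalInterface
    interface PageFetcher<T> {
        List<T> fetch(int offset, int limit);
    }

    /**
     * Fetches one page at a time and processes it before asking for the next,
     * so only 'batchSize' objects need to be held in memory at once.
     * Returns the total number of items processed.
     */
    static <T> int processInBatches(PageFetcher<T> fetcher, int batchSize, Consumer<T> processor) {
        int offset = 0;
        int processed = 0;
        while (true) {
            List<T> page = fetcher.fetch(offset, batchSize);
            if (page.isEmpty()) {
                break; // nothing left to fetch
            }
            for (T item : page) {
                processor.accept(item);
                processed++;
            }
            if (page.size() < batchSize) {
                break; // a short page means this was the last one
            }
            offset += page.size();
        }
        return processed;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the datastore (assumption for the demo only).
        List<Integer> store = new ArrayList<>();
        for (int i = 0; i < 1050; i++) {
            store.add(i);
        }
        PageFetcher<Integer> fetcher = (offset, limit) ->
                offset >= store.size()
                        ? new ArrayList<>()
                        : new ArrayList<>(store.subList(offset, Math.min(offset + limit, store.size())));

        int count = processInBatches(fetcher, 200, item -> { /* business logic goes here */ });
        System.out.println("Processed " + count + " items");
    }
}
```

The batch size (200 here) is something you would tune: larger batches mean fewer round trips to the datastore, smaller ones mean less heap in use at any moment.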

John.
 
manish ahuja
Ranch Hand
Posts: 312
Hi John,

Are you suggesting that I grab the first 200 data objects, finish processing them, then grab the next 200, and so on?
There is nothing like this in the current implementation, and it would mean adding logic to monitor the status of every sub-batch (200 data objects) and then fire a fresh request to fetch the subsequent data objects.

Are there any best practices for partitioning or parallel processing that can be leveraged here?

Thanks,

 
John de Michele
Rancher
Posts: 600
Manish:

Yes, that sounds right. You could probably also partition the processing across different threads (~1 per CPU) instead of processing serially, which would probably give you a performance boost.
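
As a rough sketch of what that could look like, the example below hands each item of a sub-batch to a fixed-size thread pool and waits for the whole sub-batch to finish before moving on. The `ParallelBatch` class and `processBatch` method are hypothetical names, and the summing "business logic" is only a placeholder. `ExecutorService.invokeAll` blocks until every task is done, which covers the "monitor the status of every sub-batch" concern without extra bookkeeping. One caveat if the workers touch Hibernate: a `Session` is not thread-safe, so each worker would need its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

class ParallelBatch {

    /**
     * Processes every item of one sub-batch on the given pool and returns
     * only after all of them have completed. If any item fails, the failure
     * is rethrown as an ExecutionException.
     */
    static <T> void processBatch(ExecutorService pool, List<T> batch, Consumer<T> processor)
            throws InterruptedException, ExecutionException {
        List<Callable<Void>> tasks = new ArrayList<>();
        for (T item : batch) {
            tasks.add(() -> {
                processor.accept(item);
                return null;
            });
        }
        // invokeAll blocks until all tasks in this sub-batch have finished.
        for (Future<Void> result : pool.invokeAll(tasks)) {
            result.get(); // surfaces any exception thrown by a task
        }
    }

    public static void main(String[] args) throws Exception {
        // Roughly one thread per CPU, as suggested above.
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            AtomicLong sum = new AtomicLong(); // placeholder for real business logic
            for (int start = 0; start < 1000; start += 200) {
                // Stand-in for "fetch the next 200 objects from the datastore".
                List<Integer> batch = new ArrayList<>();
                for (int i = start; i < start + 200; i++) {
                    batch.add(i);
                }
                processBatch(pool, batch, item -> sum.addAndGet(item));
            }
            System.out.println("Sum of processed items: " + sum.get());
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the objects are independent of one another, nothing in the processor needs to coordinate with other workers; the only shared state in this sketch is the `AtomicLong` used to show that all items were handled.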

John.
 