
running java app as a batch job

 
manish ahuja
Ranch Hand
Posts: 312
Hi there,

Currently I have a pure Java application (core Java) which interacts with a backend database through an O/R mapping layer (Hibernate). It is invoked synchronously.
Now I want to run this application as a batch, i.e. the input will come from a file: instead of one record per invocation by an external client, I will have a file which may contain 1000 records. I have to pick one record at a time from the file and invoke the application.
I would like to know your ideas on this, and any best practices related to it.
One of the issues to tackle is not allowing more than one batch to run at the same time. We may also have to change the file-based approach mentioned above to, say, a database call that fetches the records to be fed into the Java app.
Can we use some of the batch utilities in the Java space, like Spring Batch or Quartz?

Thanks in Advance
 
Jeanne Boyarsky
author & internet detective
Marshal
Posts: 34401
346
Eclipse IDE Java VI Editor
Manish,
You could write a wrapper that calls your app for each line of the file. This is likely to be significantly slower than doing the batching within the program.

Quartz is a scheduler. You'd still need something to divvy up the work.
 
Rene Larsen
Ranch Hand
Posts: 1179
Eclipse IDE Mac OS X
We may also have to change the file-based approach mentioned above to, say, a database call that fetches the records to be fed into the Java app.

You could also use a JMS queue or topic, for which your application has a receiver.
Each time a new message arrives on the queue/topic, your application will automatically get the message.

Can we use some of the batch utilities in the Java space, like Spring Batch or Quartz?

You could also use a Windows service (Windows) or a cron job (UNIX/Linux/Mac, etc.) to run your application.
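
If the job is started by cron or a service, two runs can overlap, which is the "no more than one batch at the same time" concern from the original post. One common way to guard against that is a lock file. The following is only a minimal sketch: the class name `SingleInstanceGuard`, the method `runExclusively`, and the lock file name are hypothetical, and it relies on `FileChannel.tryLock()`, whose behaviour depends on the operating system and file system.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: runs a job only if no other run holds the lock file.
class SingleInstanceGuard {

    /** Returns true if the job ran, false if another run already held the lock. */
    static boolean runExclusively(Path lockFile, Runnable job) {
        try (FileChannel channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            FileLock lock;
            try {
                lock = channel.tryLock(); // non-blocking attempt
            } catch (OverlappingFileLockException e) {
                return false; // this JVM already holds the lock
            }
            if (lock == null) {
                return false; // another process holds the lock
            }
            try {
                job.run();
                return true;
            } finally {
                lock.release();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed location for the lock file; pick one that suits your setup.
        Path lockFile = Paths.get(System.getProperty("java.io.tmpdir"), "my-batch.lock");
        boolean ran = runExclusively(lockFile, () -> System.out.println("running batch"));
        if (!ran) {
            System.out.println("another batch is already running, exiting");
        }
    }
}
```

A second run started while the first is still working gets `false` back and can simply exit.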
 
manish ahuja
Ranch Hand
Posts: 312
Hi Jeanne,

-------------
You could write a wrapper that calls your app for each line of the file. This is likely to be significantly slower than doing the batching within the program.
--------------
Can you explain what you mean by "than doing the batching within the program"?

-Manish
 
Jeanne Boyarsky
author & internet detective
Marshal
Posts: 34401
346
Eclipse IDE Java VI Editor
Originally posted by manish ahuja:
Can you explain what you mean by "than doing the batching within the program"?

Sure. If your Java program reads the input file itself, it can read X rows at a time (let's say 100) and do a single batch update to the database. This cuts down on the number of database round trips.
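
To make the idea concrete, here is a rough sketch of the chunking part only. The class `BatchChunker` and the method `processInBatches` are made-up names, and the handler just prints; in a real application the handler is where one batched database write would go (for example JDBC's `addBatch()`/`executeBatch()`, or flushing and clearing the Hibernate session once per chunk).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: groups input records into fixed-size batches.
class BatchChunker {

    /**
     * Hands the records to batchHandler in groups of at most batchSize
     * and returns the number of groups handed over.
     */
    static int processInBatches(Iterable<String> records, int batchSize,
                                Consumer<List<String>> batchHandler) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> batch = new ArrayList<>(batchSize);
        int batches = 0;
        for (String record : records) {
            batch.add(record);
            if (batch.size() == batchSize) {
                batchHandler.accept(new ArrayList<>(batch)); // hand over a copy
                batch.clear();
                batches++;
            }
        }
        if (!batch.isEmpty()) { // flush the final, partial batch
            batchHandler.accept(new ArrayList<>(batch));
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> records = new ArrayList<>();
        for (int i = 1; i <= 250; i++) {
            records.add("record-" + i); // stand-in for lines read from the file
        }
        // The handler is a placeholder for one batched database write.
        int batches = processInBatches(records, 100,
                batch -> System.out.println("one round trip for " + batch.size() + " rows"));
        System.out.println(batches + " batches");
    }
}
```

With this shape, the number of round trips grows with the number of batches rather than with the number of records.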
 
manish ahuja
Ranch Hand
Posts: 312
Thanks Jeanne,

My original intent with the question was whether I can use something like sub-batches within the parent batch job, so as to speed up the whole run.
Say I get an input of 100 records (file read or database read). I would typically iterate over every input record in a for or while loop and run the application for each record.
This would take really long as the volume of input increases. Each input record is independent of the others. In this regard, I was wondering whether there is any way I could start multiple such jobs in parallel and reduce the batch run time. I am not sure multi-threading can greatly help in this situation.

Manish
 
Jeanne Boyarsky
author & internet detective
Marshal
Posts: 34401
346
Eclipse IDE Java VI Editor
Manish,
Now I'm confused. Your original post says "Like some of the issues to tackle is not allowing more than 1 batch to run at the same time.", so I thought you didn't want to parallelize.

If the jobs are independent, there's no reason you couldn't kick off multiple ones at the same time.
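
As a sketch of what kicking off multiple jobs could look like inside one JVM, the standard `ExecutorService` can run one task per record on a fixed pool of threads. The names `ParallelBatch` and `runAll` are hypothetical, and the per-record job here is a trivial stand-in for the real application call. Note that a Hibernate `Session` is not thread-safe, so each task would need its own session.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper: runs an independent job for every record in parallel.
class ParallelBatch {

    /** Applies job to each record on a fixed thread pool; results keep input order. */
    static <T, R> List<R> runAll(List<T> records, int threads, Function<T, R> job) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<R>> tasks = new ArrayList<>();
            for (T record : records) {
                tasks.add(() -> job.apply(record));
            }
            List<R> results = new ArrayList<>();
            // invokeAll blocks until every task has finished.
            for (Future<R> done : pool.invokeAll(tasks)) {
                results.add(done.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the batch", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a record failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("a", "bb", "ccc");
        // String::length stands in for "invoke the application for this record".
        List<Integer> results = runAll(records, 2, String::length);
        System.out.println(results);
    }
}
```

The pool size is a tuning knob: more threads than the database can serve concurrently is unlikely to help.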

Is it safe to assume you've tuned the original job? If not, tuning often gains more than threading does, because each update then needs fewer resources.
 