I'm building a multi-threaded Java application which, in simple terms, reads data from a database, does some processing, and writes it to a file. We're planning to distribute it across the three UNIX servers we have. I'd just like some input on which Java frameworks would be best for this.
Question: Are you absolutely sure that you need to distribute the processing workload?
Out of those three tasks (reading from a DB, doing some processing, and writing to a file), I wouldn't immediately assume that the processing part is going to be the bottleneck in the whole operation. Reading from a DB is slow and writing to a file is really slow. What you need to do here is get it working on a single machine and then profile the operation to find out, for a fact, where the bottleneck is. Only then can you know for sure what you can do to speed things up.
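As a rough first pass before reaching for a real profiler, the three stages can be timed separately on one machine. This is only a sketch: `readRows`, `process`, and `writeRows` are hypothetical in-memory stand-ins for the JDBC query, the processing step, and the file output, and would need to be replaced with the real code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

public class StageTimer {
    /** Runs one stage, adds its elapsed wall-clock time (nanoseconds) to timings, returns its result. */
    public static <T> T time(String stage, Map<String, Long> timings, Supplier<T> work) {
        long start = System.nanoTime();
        T result = work.get();
        timings.merge(stage, System.nanoTime() - start, Long::sum);
        return result;
    }

    // Hypothetical stand-in for the database read (would be a JDBC query).
    public static List<String> readRows() {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            rows.add("row-" + i);
        }
        return rows;
    }

    // Hypothetical stand-in for the processing step.
    public static List<String> process(List<String> rows) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            out.add(row.toUpperCase());
        }
        return out;
    }

    // Hypothetical stand-in for the file write; returns the number of rows "written".
    public static int writeRows(List<String> rows) {
        StringBuilder sink = new StringBuilder();
        for (String row : rows) {
            sink.append(row).append('\n');
        }
        return rows.size();
    }

    public static void main(String[] args) {
        Map<String, Long> timings = new LinkedHashMap<>();
        List<String> rows = time("read", timings, StageTimer::readRows);
        List<String> out = time("process", timings, () -> process(rows));
        time("write", timings, () -> writeRows(out));
        // Print each stage's share so the slowest one stands out.
        timings.forEach((stage, nanos) ->
                System.out.printf("%-8s %d ms%n", stage, nanos / 1_000_000));
    }
}
```

If "read" or "write" dominates the output, adding more processing machines is unlikely to help much.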
Question: Are you sure Java is the right tool for the job?
If the processing part is just some text manipulation then I might recommend you use a scripting language to do that task. Languages like Groovy or Python might be better suited. Particularly Groovy if you're used to writing Java as the syntax is pretty similar.
Tim Driven Development
Joined: Jun 17, 2014
Hi Tim, Thanks for the response.
We currently run this process in Perl, but the amount of data to be processed is growing (currently in the millions), so we thought that instead of reading from the database sequentially, we could read in parallel and process the data.
Our idea in distributing this process is to balance the load evenly across our servers, as some of them are underutilized.
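To make the "read in parallel" idea concrete, here is a minimal single-machine sketch that splits a key range into chunks and hands each chunk to a thread pool. It assumes the table has a numeric key that can be range-partitioned, which the thread does not confirm; `readAndProcess` is a hypothetical placeholder for a query such as `WHERE id BETWEEN ? AND ?` followed by the processing step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PartitionedRead {
    /** Splits the inclusive key range [minId, maxId] into at most `parts` contiguous chunks. */
    public static List<long[]> partition(long minId, long maxId, int parts) {
        List<long[]> ranges = new ArrayList<>();
        long total = maxId - minId + 1;
        long chunk = (total + parts - 1) / parts; // ceiling division
        for (long lo = minId; lo <= maxId; lo += chunk) {
            ranges.add(new long[] {lo, Math.min(lo + chunk - 1, maxId)});
        }
        return ranges;
    }

    /** Hypothetical placeholder: would query rows in [lo, hi] and process them. Returns rows handled. */
    public static long readAndProcess(long lo, long hi) {
        return hi - lo + 1;
    }

    /** Runs one task per chunk on a fixed-size pool and returns the total number of rows handled. */
    public static long runAll(long minId, long maxId, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (long[] range : partition(minId, maxId, workers)) {
                Callable<Long> task = () -> readAndProcess(range[0], range[1]);
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // blocks until that chunk is finished
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("rows handled: " + runAll(1, 1_000_000, 3));
    }
}
```

The same partitioning could in principle be used to give each of the three servers its own key range, but whether that helps depends on whether the database or the file output is the real limit.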
Assuming you've done all your profiling and have concluded that the processing part is what's slowing you down, then I'm afraid this is where my usefulness ends. I'm not that familiar with writing distributed Java apps and haven't used any frameworks for it either.
Joined: Jun 17, 2014
Thanks, Tim. Does anybody else have any thoughts on this? Please let me know.
Since Sun's motto was "the network is the computer," Java has plenty of resources for network computing. Personally, I found the JavaSpaces concept rather attractive. Search for "JavaSpaces open source" or "GigaSpaces".
Another approach might use JMS - Java Message Service.
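With a message service, the usual shape is a work queue: one producer posts batches of work and several consumers pull from the queue as they become free. The sketch below is not JMS itself; it is an in-process analogue using `java.util.concurrent.BlockingQueue` so that it runs without a broker. The class name, message format, and stop sentinel are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class WorkQueueSketch {
    // Sentinel message telling a worker to exit (hypothetical convention).
    private static final String STOP = "__STOP__";

    /** Posts `messages` work items to a shared queue, drains it with `workers` threads, returns items processed. */
    public static int run(int messages, int workers) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        AtomicInteger processed = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();

        for (int w = 0; w < workers; w++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        String msg = queue.take(); // blocks until work is available
                        if (msg.equals(STOP)) {
                            return;
                        }
                        processed.incrementAndGet(); // real processing would go here
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            threads.add(worker);
        }

        // Producer side: post the work, then one stop message per worker.
        for (int i = 0; i < messages; i++) {
            queue.put("batch-" + i);
        }
        for (int w = 0; w < workers; w++) {
            queue.put(STOP);
        }
        for (Thread worker : threads) {
            worker.join();
        }
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("processed: " + run(100, 3));
    }
}
```

In a real JMS setup the queue would live in a broker and the workers would be separate processes on the three servers, which is what lets an idle server pick up more of the load.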
Joined: Mar 22, 2005
"I found the JavaSpaces concept rather attractive."
These days, the Jini and JavaSpaces projects are carried on as part of Apache River.