
Best Framework for Java Grid Computing

 
Ahamed Shamshudeen
Greenhorn
Posts: 4
Hi All,

I'm building a multi-threaded Java application that, in simple terms, reads data from a database, does some processing, and writes the results to a file. We're planning to distribute it across the three UNIX servers we have. I'd like some input on which Java frameworks would be best suited for this.

Thanks in advance.
 
Tim Cooke
Sheriff
Pie
Posts: 2885
121
Clojure IntelliJ IDE Java
Hello Ahamed, welcome to The Ranch!

Question: Are you absolutely sure that you need to distribute the processing workload?

Out of those three tasks, reading from a DB, doing some processing, and writing to a file, I wouldn't immediately consider that the processing part is going to be the bottleneck in the whole operation. Reading from a DB is slow and writing to a file is really slow. What you need to do here is to get it working on a single machine and then profile the performance of the operation to find out, for a fact, where the bottleneck is. Only then can you know for sure what you can do to speed things up.
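Before reaching for a full profiler, a rough first pass at "where is the time going" can be as simple as timing each stage separately. The following is only a minimal sketch: `StageTimer` and the stage names are hypothetical, and the lambdas in `main` are placeholders standing in for the real DB read, processing, and file write.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper (not from any library): accumulates wall-clock time per named stage.
class StageTimer {
    private final Map<String, Long> nanosByStage = new LinkedHashMap<>();

    // Runs the stage, adds its elapsed time to that stage's running total, returns its result.
    <T> T time(String stage, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            nanosByStage.merge(stage, System.nanoTime() - start, Long::sum);
        }
    }

    // Name of the stage with the largest accumulated time, or null if nothing was timed.
    String slowestStage() {
        String slowest = null;
        long max = -1;
        for (Map.Entry<String, Long> e : nanosByStage.entrySet()) {
            if (e.getValue() > max) {
                max = e.getValue();
                slowest = e.getKey();
            }
        }
        return slowest;
    }

    Map<String, Long> totals() {
        return nanosByStage;
    }

    public static void main(String[] args) {
        StageTimer timer = new StageTimer();
        // Placeholder stages; swap in the real read, process, and write calls.
        String rows = timer.time("read", () -> "a,b,c");
        String processed = timer.time("process", () -> rows.toUpperCase());
        timer.time("write", () -> { System.out.println(processed); return null; });
        timer.totals().forEach((stage, nanos) ->
                System.out.printf("%s: %d us%n", stage, nanos / 1_000));
        System.out.println("slowest: " + timer.slowestStage());
    }
}
```

Numbers from a timer like this are only a hint (no JIT warm-up, no accounting for I/O caching), so treat them as a pointer to where a real profiler should look next.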

Question: Are you sure Java is the right tool for the job?

If the processing part is just some text manipulation then I might recommend you use a scripting language to do that task. Languages like Groovy or Python might be better suited. Particularly Groovy if you're used to writing Java as the syntax is pretty similar.
 
Ahamed Shamshudeen
Greenhorn
Posts: 4
Hi Tim, Thanks for the response.

We currently run this process in Perl, but the amount of data to be processed is growing (currently in the millions), so we thought that instead of reading from the database sequentially, we could read in parallel and process the data.

Our idea in distributing this process is to balance the load evenly across our servers, as some of them are underutilized.
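One common way to read in parallel is to split the key range into contiguous chunks and give each worker (thread or server) its own chunk. The sketch below assumes the rows have a numeric id with a known minimum and maximum, which the thread does not state; `RangePartitioner` is a hypothetical name, and each chunk would feed something like a `WHERE id BETWEEN ? AND ?` query.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: divides an id range among workers.
class RangePartitioner {
    // Splits the inclusive range [minId, maxId] into at most `parts` contiguous
    // chunks whose sizes differ by at most one. Each chunk is {from, to}, inclusive.
    static List<long[]> split(long minId, long maxId, int parts) {
        if (parts < 1 || maxId < minId) {
            throw new IllegalArgumentException("bad range or part count");
        }
        long total = maxId - minId + 1;
        long base = total / parts;
        long extra = total % parts; // the first `extra` chunks get one more id
        List<long[]> chunks = new ArrayList<>();
        long from = minId;
        for (int i = 0; i < parts && from <= maxId; i++) {
            long size = base + (i < extra ? 1 : 0);
            chunks.add(new long[] {from, from + size - 1});
            from += size;
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Three chunks, e.g. one per server.
        for (long[] c : split(1, 10, 3)) {
            System.out.println(c[0] + ".." + c[1]);
        }
        // prints 1..4, 5..7, 8..10 (one per line)
    }
}
```

Even chunks of ids only balance the load if the ids are roughly dense and the per-row work is similar; with gaps or skew, smaller chunks handed out on demand tend to even things out better.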
 
Tim Cooke
Sheriff
Pie
Posts: 2885
121
Clojure IntelliJ IDE Java
Assuming you've done all your profiling and have concluded that the processing part is what's slowing you down, then I'm afraid this is where my usefulness ends. I'm not that familiar with writing distributed Java apps and haven't used any frameworks for it either.
 
Ahamed Shamshudeen
Greenhorn
Posts: 4
Thanks Tim. Does anybody else have any thoughts on this? Please let me know.

Thanks
 
Ulf Dittmer
Rancher
Pie
Posts: 42967
73
Assuming that you don't want to go the Hadoop route, I think JPPF might be a good choice.
 
Ahamed Shamshudeen
Greenhorn
Posts: 4
Ulf Dittmer wrote:Assuming that you don't want to go the Hadoop route, I think JPPF might be a good choice.


Thanks Ulf. I thought that Hadoop was only used for Big Data. If Hadoop can get the job done effectively, could you please provide pointers to some resources on it?
 
Ulf Dittmer
Rancher
Pie
Posts: 42967
73
You mean aside from the Hadoop web site and a web search for "hadoop tutorial/introduction"? No idea.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13055
6
Since Sun's motto was "the network is the computer," Java has plenty of resources for network computing. Personally, I found the JavaSpaces concept rather attractive. Search for "Javaspaces open source" or "Gigaspaces".

Another approach might use JMS - Java Message Service.

Bill
 
Ulf Dittmer
Rancher
Pie
Posts: 42967
73
William Brogden wrote:I found the JavaSpaces concept rather attractive.

These days, the Jini and JavaSpaces projects are carried on as part of Apache River.
 
Praful Thakare
Ranch Hand
Posts: 642
Maybe http://hazelcast.com/ can help.
 
Emma Jones
Greenhorn
Posts: 2
I tried TayzGrid lately. It is new but very efficient. http://www.tayzgrid.com/
 
Simon Roberts
Author
Ranch Hand
Posts: 158
7
Java Linux Netbeans IDE
Java's built-in RMI does this well out of the box. If you want more features, there's a bunch of stuff for dynamic service discovery, distributed event handling, and even transactions (and the JavaSpaces mentioned earlier in this thread) in what is now the Apache River project, which used to be a Sun project called Jini. Both RMI and Jini are what I like to describe as "real distributed object-oriented". By that I mean you can pass polymorphic arguments across calls, including objects of classes never previously seen by the recipient (and yes, a security manager must be in place so that the newly introduced code can be prevented from running amok). You can have instances of distributed, remotely accessible objects created dynamically (not just "this is the server"), and basically _everything_ that is normal for OO in a single process space.

HTH
Simon
 
guillermo urdaneta
Ranch Hand
Posts: 39
You should try Apache Storm for parallel processing. The other option is to write a good Java app that does a lot of multi-threading using the Executor interface, but if I were you, I would try Apache Storm.
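For the plain-Java option, a minimal sketch of the Executor approach using the JDK's `ExecutorService` could look like this. `ExecutorSketch`, `process`, and `processAll` are hypothetical names, and the upper-casing merely stands in for whatever the real processing step is.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ExecutorSketch {
    // Placeholder for the real per-record processing step.
    static String process(String record) {
        return record.trim().toUpperCase();
    }

    // Processes every record on a fixed-size thread pool and returns the
    // results in the same order as the input.
    static List<String> processAll(List<String> records, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String r : records) {
                tasks.add(() -> process(r));
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until all tasks finish and keeps the task order.
            for (Future<String> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while processing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(processAll(Arrays.asList(" a", "b ", "c"), 3));
        // prints [A, B, C]
    }
}
```

One task per record keeps the sketch short; with millions of records it would likely be cheaper to submit batches, and to write results to the file from a single thread once each batch completes.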
 