Best Framework for Java Grid Computing

Ahamed Shamshudeen
Greenhorn

Joined: Jun 17, 2014
Posts: 4
Hi All,

I'm building a multi-threaded Java application that, in simple terms, reads data from the database, does some processing, and writes the results to a file. We're planning to distribute it across the three UNIX servers we have. I'd like some input on which Java frameworks would be best for this.

Thanks in advance.
Tim Cooke
Sheriff

Joined: Mar 28, 2008
Posts: 2242

Hello Ahamed, welcome to The Ranch!

Question: Are you absolutely sure that you need to distribute the processing workload?

Out of those three tasks, reading from a DB, doing some processing, and writing to a file, I wouldn't immediately consider that the processing part is going to be the bottleneck in the whole operation. Reading from a DB is slow and writing to a file is really slow. What you need to do here is to get it working on a single machine and then profile the performance of the operation to find out, for a fact, where the bottleneck is. Only then can you know for sure what you can do to speed things up.
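As a rough illustration of the "find out where the bottleneck is" step, here is a minimal sketch that times each stage of a read, process, write job on one machine. The class name PhaseTimer and the three stand-in stages are hypothetical, not part of the original poster's application, and wall-clock timing like this is only a first approximation; a real profiler would give a more reliable picture.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper: records how long each named stage takes.
class PhaseTimer {
    private final Map<String, Long> nanosByStage = new LinkedHashMap<>();

    // Runs one stage, adds its wall-clock duration to the total for that name,
    // and returns the stage's result.
    <T> T time(String stage, Supplier<T> work) {
        long start = System.nanoTime();
        T result = work.get();
        nanosByStage.merge(stage, System.nanoTime() - start, Long::sum);
        return result;
    }

    // Name of the stage with the largest accumulated time, or null if none ran.
    String slowest() {
        String worst = null;
        long max = -1;
        for (Map.Entry<String, Long> e : nanosByStage.entrySet()) {
            if (e.getValue() > max) {
                max = e.getValue();
                worst = e.getKey();
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();

        // Stand-in for the database read (assumption: replace with real JDBC code).
        List<String> rows = timer.time("read", () -> {
            List<String> out = new ArrayList<>();
            for (int i = 0; i < 100_000; i++) out.add("row-" + i);
            return out;
        });

        // Stand-in for the processing step.
        List<String> processed = timer.time("process", () -> {
            List<String> out = new ArrayList<>(rows.size());
            for (String r : rows) out.add(r.toUpperCase());
            return out;
        });

        // Writes the processed rows to a temporary file.
        timer.time("write", () -> {
            try {
                Path file = Files.createTempFile("out", ".txt");
                Files.write(file, processed);
                return file;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });

        timer.nanosByStage.forEach((stage, ns) ->
                System.out.printf("%-8s %d ms%n", stage, ns / 1_000_000));
        System.out.println("slowest: " + timer.slowest());
    }
}
```

If the slowest stage turns out to be the read or the write, distributing the processing across more machines is unlikely to help much.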

Question: Are you sure Java is the right tool for the job?

If the processing part is just some text manipulation then I might recommend you use a scripting language to do that task. Languages like Groovy or Python might be better suited. Particularly Groovy if you're used to writing Java as the syntax is pretty similar.


Tim Driven Development
Ahamed Shamshudeen
Greenhorn

Joined: Jun 17, 2014
Posts: 4
Hi Tim, Thanks for the response.

We currently run this process in Perl, but the amount of data to be processed is growing (currently in the millions), so we thought that instead of reading from the database sequentially, we could read in parallel and process the data.

Our idea in distributing this process is to balance the load evenly across our servers, as some of them are underutilized.
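One simple way to read in parallel is to give each server its own contiguous slice of the key range. The sketch below assumes the table has a numeric key column (called id here purely for illustration; the thread does not describe the schema) and splits the range into near-equal parts, one per server. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: divides a key range among workers.
class RangePartitioner {

    // Splits the inclusive range [minId, maxId] into at most `parts` contiguous,
    // non-overlapping ranges whose sizes differ by at most one.
    static List<long[]> split(long minId, long maxId, int parts) {
        if (parts <= 0 || maxId < minId) {
            throw new IllegalArgumentException("need parts > 0 and maxId >= minId");
        }
        long total = maxId - minId + 1;
        long base = total / parts;   // minimum size of each range
        long extra = total % parts;  // the first `extra` ranges get one more key
        List<long[]> ranges = new ArrayList<>();
        long start = minId;
        for (int i = 0; i < parts; i++) {
            long size = base + (i < extra ? 1 : 0);
            if (size == 0) break;    // fewer keys than parts
            ranges.add(new long[] {start, start + size - 1});
            start += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Three servers, as in the original question.
        List<long[]> ranges = split(1, 10, 3);
        for (int i = 0; i < ranges.size(); i++) {
            long[] r = ranges.get(i);
            System.out.println("server " + (i + 1) + ": id between " + r[0] + " and " + r[1]);
        }
    }
}
```

Each server would then run the same job restricted to its own range, so no coordination is needed beyond agreeing on the split. This only balances well if the keys are spread fairly evenly.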
Tim Cooke
Sheriff

Joined: Mar 28, 2008
Posts: 2242

Assuming you've done all your profiling and have concluded that the processing part is slowing you down, then I'm afraid this is where my usefulness ends. I'm not that familiar with writing distributed Java apps and haven't used any frameworks for it either.
Ahamed Shamshudeen
Greenhorn

Joined: Jun 17, 2014
Posts: 4
Thanks Tim. Does anybody have any thoughts on this? Please let me know.

Thanks
Ulf Dittmer
Rancher

Joined: Mar 22, 2005
Posts: 42958
Assuming that you don't want to go the Hadoop route, I think JPPF might be a good choice.
Ahamed Shamshudeen
Greenhorn

Joined: Jun 17, 2014
Posts: 4
Ulf Dittmer wrote: Assuming that you don't want to go the Hadoop route, I think JPPF might be a good choice.


Thanks Ulf. I thought that Hadoop was only used for Big Data. If Hadoop can get the job done effectively, could you please provide pointers to some resources on it?
Ulf Dittmer
Rancher

Joined: Mar 22, 2005
Posts: 42958
You mean aside from the Hadoop web site and a web search for "hadoop tutorial/introduction"? No idea.
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 13018
Since Sun's motto was "the network is the computer," Java has plenty of resources for network computing - personally I found the JavaSpaces concept rather attractive. Search for "Javaspaces open source" or "Gigaspaces".

Another approach might use JMS - Java Message Service.

Bill
Ulf Dittmer
Rancher

Joined: Mar 22, 2005
Posts: 42958
I found the JavaSpaces concept rather attractive.

These days, the Jini and JavaSpaces projects are carried on as part of Apache River.
Praful Thakare
Ranch Hand

Joined: Feb 10, 2001
Posts: 641
Maybe http://hazelcast.com/ can help.


All desirable things in life are either illegal, banned, expensive or married to someone else !!!
Emma Jones
Greenhorn

Joined: Oct 02, 2013
Posts: 2
I tried TayzGrid lately. It is new but very efficient. http://www.tayzgrid.com/
Simon Roberts
Author
Ranch Hand

Joined: Oct 24, 2000
Posts: 78

Java's built-in RMI does this well out of the box. If you want to add more features, there's a bunch of stuff for dynamic service discovery, distributed event handling, and even transactions (and indeed JavaSpaces, mentioned earlier in this thread) in what is now the Apache River project. It used to be a Sun project called Jini. Both RMI and Jini are what I like to describe as "real distributed object-oriented". By that I mean you can pass polymorphic arguments across calls, passing objects of classes never previously seen by the recipient (and yes, the security manager must be in place, so the newly introduced code can be prevented from running amok). You can have instances of distributed/remotely accessible objects created dynamically (not just "this is the server") and basically _everything_ that is normal for OO in a single process space.
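For readers who have not used RMI, here is a minimal sketch of the basic mechanism: a remote interface, an implementation, and a call made through the exported stub. The names RecordProcessor and UpperCaseProcessor are hypothetical. To stay self-contained, the sketch exports and calls the object inside a single JVM and pins the RMI hostname to loopback; a real deployment would publish the stub in an RMI registry so that other hosts could look it up. It does not show dynamic class loading or anything from Jini/River.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

class RmiSketch {

    // Hypothetical remote interface. Remote methods must declare RemoteException.
    public interface RecordProcessor extends Remote {
        String process(String record) throws RemoteException;
    }

    // Implementation that runs in the JVM that exported it.
    public static class UpperCaseProcessor implements RecordProcessor {
        @Override
        public String process(String record) {
            return record.toUpperCase();
        }
    }

    // Exports an implementation on an anonymous port, invokes it through the
    // stub (so the call goes through the RMI transport), then unexports it.
    static String callThroughStub(String record) throws RemoteException {
        // Assumption for this demo only: keep all traffic on loopback.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        RecordProcessor impl = new UpperCaseProcessor();
        RecordProcessor stub =
                (RecordProcessor) UnicastRemoteObject.exportObject(impl, 0);
        try {
            return stub.process(record);
        } finally {
            // Unexport so the JVM can exit once the call is done.
            UnicastRemoteObject.unexportObject(impl, true);
        }
    }

    public static void main(String[] args) throws RemoteException {
        System.out.println(callThroughStub("row-1"));
    }
}
```

The caller only sees the RecordProcessor interface, which is what lets the same calling code work whether the implementation lives in the same process or on another server.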

HTH
Simon

author of: [OCA Java SE7 (Video) Course] and [Java Programming Basics (Video)]
guillermo urdaneta
Ranch Hand

Joined: Jun 09, 2007
Posts: 39
You should try Apache Storm for parallel processing. The other option is to write a good Java app that makes heavy use of multithreading through the Executor interface, but if I were you I would try Apache Storm.
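As a sketch of the plain-Java option, the following uses an ExecutorService from java.util.concurrent to process records on a fixed-size thread pool. The process method is a hypothetical stand-in for the real per-record work, and the class name is illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ExecutorSketch {

    // Hypothetical per-record work; replace with the real processing.
    static String process(String record) {
        return record.trim().toUpperCase();
    }

    // Submits one task per record to a fixed pool and collects the results
    // in the same order as the input.
    static List<String> processAll(List<String> records, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String record : records) {
                Callable<String> task = () -> process(record);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks until that task finishes
            }
            return results;
        } finally {
            pool.shutdown(); // stop accepting tasks; lets worker threads exit
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(processAll(List.of(" alpha ", "beta", " gamma"), 3));
    }
}
```

For millions of rows, submitting batches of records per task rather than one record per task would likely cut the scheduling overhead.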
 