
MapReduce vs Distributed task Queue

Zaharie Sergiu
Greenhorn

Joined: Jul 24, 2012
Posts: 1
Hello all,

Can someone explain the difference between these two concepts? When is it better to build a distributed system with MapReduce (Hadoop), and when with a distributed task queue (Celery)?
With respect to performance, load balancing, big data, scalability, reliability, availability, and efficiency, what are the advantages and drawbacks of each?

I am currently in the research phase of a project that consists of building a web-based distributed system. I have an existing text-mining application that I need to decompose so that I can integrate it with one of these two frameworks and make it distributed.


Thank you!
Mark Spritzler
ranger
Sheriff

Joined: Feb 05, 2001
Posts: 17250
    

OK, here is an answer that isn't really direct.

The answer:

It depends.

It depends on what task or process you are running. Some tasks that you want to run in a distributed fashion fit well into MapReduce, and some do not. The typical Hadoop example of reading many files and counting words is a great fit for MapReduce. Getting results for a search like Google's is another great example of MapReduce. Handling events via messaging and processing the data does not need MapReduce; distributed tasks would be the better choice there.
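To make the word-count case concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps. It is plain Python rather than Hadoop code, and the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration. A real Hadoop job would run the map and reduce steps on many machines and do the grouping for you.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(word, counts):
    """Reduce: combine all values for one key into a single result."""
    return word, sum(counts)


def word_count(documents):
    # Each document is mapped independently, so this step could be
    # spread over many workers without any coordination.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    # Each key is reduced independently of the others as well.
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


docs = ["the moose likes Hadoop", "the fly likes MapReduce"]
counts = word_count(docs)
print(counts)
```

The point of the shape is that the whole job is one batch over a data set: every record goes through the same map function, and results are only meaningful after all values for a key have been brought together.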

It depends on the particular use case.
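For contrast, the event-handling case mentioned above might look roughly like this. This is only a sketch using Python's standard queue and threading modules as a stand-in for a distributed task queue; it is not Celery's API, and handle_event, run_workers, and the event format are hypothetical. In a real setup the queue would be a message broker and the workers would be separate processes or machines.

```python
import queue
import threading


def handle_event(event):
    """Hypothetical handler: process one event independently of all others."""
    return event["id"], event["text"].upper()


def worker(tasks, results, lock):
    """Pull events off the queue until a None sentinel arrives."""
    while True:
        event = tasks.get()
        if event is None:  # sentinel: no more work for this worker
            tasks.task_done()
            break
        key, value = handle_event(event)
        with lock:
            results[key] = value
        tasks.task_done()


def run_workers(events, num_workers=3):
    tasks = queue.Queue()
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(tasks, results, lock))
        for _ in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for event in events:
        tasks.put(event)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    tasks.join()
    for thread in threads:
        thread.join()
    return results


events = [{"id": i, "text": f"event {i}"} for i in range(5)]
results = run_workers(events)
print(results)
```

Here there is no grouping or reduce step at all: each event is a self-contained unit of work that any free worker can pick up as it arrives, which is why a task queue fits this kind of job better than MapReduce.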

Mark


 
 