JavaRanch » Java Forums » Databases » Hadoop

Hadoop and distributed transactional cache

Vladimir Shcherbina

Joined: Nov 15, 2005
Posts: 22

How is Hadoop related to distributed transactional cache implementations (like IBM ObjectGrid, etc.)?
Also, is Java the only language of choice, or could something like Erlang be more suitable for this purpose?
David Newton

Joined: Sep 29, 2008
Posts: 12617

Of course Java isn't the only choice, but it's *a* choice. I don't think Erlang is necessarily "more" suitable, especially with the current focus on distributed solutions in a host of other languages, including Java, and other JVM solutions.
Tibi Kiss
Ranch Hand

Joined: Jun 11, 2009
Posts: 47
Hadoop also has its own cache implementation (the DistributedCache), which is meant to bring big data to the right place at the right time, but it is not a transactional cache.

Distributed transactional caches are used in transactional systems to communicate state between parts of a process running on different nodes but sharing the same transactional scope.
The MapReduce framework, by contrast, is unsuitable (even counterproductive) for implementing algorithms where you always have to share common state data in a transactional manner. For example, a Monte Carlo based algorithm usually needs such shared state, which is why the MapReduce framework is not quite adequate for that category of algorithms. If you bolt a distributed transactional cache onto your MapReduce framework, then of course it will do the job with Monte Carlo algorithms, but scalability will suffer, as it does in any distributed transactional system.
The MapReduce framework is best suited for algorithms that do not require shared state between computational nodes; that is why the scalability of the MapReduce programming model is so good. Luckily, there are many areas of large-scale computation where a MapReduce framework, as implemented in Hadoop, is a natural fit.
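To illustrate the shared-state-free style that MapReduce favors, here is a minimal plain-Java sketch (not Hadoop code; the class and method names are made up for illustration). It estimates pi by Monte Carlo sampling, where each "map task" draws its points independently and the "reduce" step is just a commutative sum, so no transactional coordination between nodes is needed:

```java
import java.util.Random;
import java.util.stream.IntStream;

// Hypothetical sketch in MapReduce style: each map task is fully
// independent (only its seed differs), and the reduce phase is a
// plain sum of per-task counts -- no shared state between tasks.
public class MonteCarloPi {

    // One independent "map task": count random points that fall
    // inside the unit quarter-circle.
    static long mapTask(long seed, int samples) {
        Random rng = new Random(seed);
        long hits = 0;
        for (int i = 0; i < samples; i++) {
            double x = rng.nextDouble();
            double y = rng.nextDouble();
            if (x * x + y * y <= 1.0) hits++;
        }
        return hits;
    }

    public static void main(String[] args) {
        int tasks = 8;
        int samplesPerTask = 100_000;

        // "Map" phase: tasks run independently (here via a parallel
        // stream; on Hadoop each would be a mapper on its own node).
        long totalHits = IntStream.range(0, tasks)
                .parallel()
                .mapToLong(t -> mapTask(42L + t, samplesPerTask))
                .sum();

        // "Reduce" phase: a single commutative sum.
        double pi = 4.0 * totalHits / ((long) tasks * samplesPerTask);
        System.out.println("pi ~= " + pi);
    }
}
```

An iterative method that must read and update a single shared estimate on every step (as in many Markov chain Monte Carlo variants) does not decompose this way, and that is exactly where a transactional cache would be needed and where scalability would suffer.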