Is Hadoop ready for the Enterprise?

Carlos Morillo
Ranch Hand

Joined: Jun 06, 2009
Posts: 221

Hadoop has been around as an open source project for barely 7 years.

Would you recommend any specific Hadoop distribution to a customer (let's say an investment bank on Wall Street) that needs to run a mission-critical application?

Does that distribution support NameNode HA, JobTracker HA, volumes, snapshots, mirrors, and any other features important for disaster recovery?

In my view, ease of use and ease of ingesting data into the Hadoop cluster filesystem are critical features to have.

Garry Turkington

Joined: Apr 23, 2013
Posts: 15
I really tend to avoid recommending a particular distribution, as I think they all have a place. But if you take your list of ideal requirements, then given the current state of the underlying Apache projects, it's clear that MapR is probably the best fit.

Let's be honest: prior to Hadoop 2.0, HA (particularly for the NameNode) has always been compromised to a degree. The system is near bulletproof when most things fail, but have your NN go down and you are in trouble. Hadoop 2.0 improves that greatly, and it'll be pretty cool to have all the major distributions offering out-of-the-box HA for both NN and JT.

But I'd also caution that DR is absolutely about more than the choice of distribution, and I think you touch on that. Whatever setup you choose, fate will always find a failure scenario that causes some sort of operational crisis. Lightning strikes are particularly good at highlighting these. And if you do need things like complete cross-site redundancy, I suspect you'll end up building so much plumbing to make it all work that the choice of distribution and its particular features becomes less relevant.

I think it's true to say that this sort of high-end cross-site DR is another area in which Hadoop will continue to mature. But I'd also say, given previous experience trying to implement other technologies that supposedly do have that level of DR, that these things are never as simple as the vendor says, and this sort of thing is just fundamentally hard.
