(Mods if you think this should be in the Hadoop forum, or both, please move)
I wouldn't say it was hard per se, but Hadoop has a lot of funky concepts.
For preparing, "24 hadoop interview questions" on the net is a great starting point. I then went through Tom White's Hadoop: The Definitive Guide, 3rd Ed., chapters 2-8 completely, typing out full notes and reading them constantly. For the other parts of the ecosystem, you need a high-level overview with the basics of how they work and what they can do that Hadoop can't.
On my laptop, I played with local mode in Eclipse plus pseudo-distributed mode with jar files. Tried lots of variations with a simple wordcount program:
Zero reducers, identity reducer, identity mapper and reducer, combiners added
Logging and counter techniques
Made a chart of the 5-6 ways to set a property value and which took priority
List of the default conditions for things you don't specify (Hadoop can run given just IO paths).
Made a list of HDFS commands and kept playing around with them
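For anyone who wants to see what the wordcount variations and counters above actually change before setting up a cluster, here is a rough pure-Python simulation. It is only a sketch, not Hadoop code: `run_job`, the function names, and the counter names are all made up for illustration, and each inner list in `splits` stands in for one input split. The one Hadoop behavior it deliberately mirrors is that with zero reducers the map output is written as-is, with no sort and no combiner.

```python
from collections import Counter, defaultdict


def wc_map(line):
    """Mapper: emit (word, 1) for every whitespace-separated token."""
    for word in line.split():
        yield word.lower(), 1


def sum_reduce(key, values):
    """Summing reducer. Safe to reuse as a combiner because addition
    is associative and commutative."""
    yield key, sum(values)


def identity_reduce(key, values):
    """Identity reducer: pass every (key, value) pair through unchanged."""
    for value in values:
        yield key, value


def group_by_key(pairs):
    """Stand-in for shuffle/sort: group values by key, keys in sorted order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def run_job(splits, mapper, reducer=None, combiner=None):
    """Run a simulated job; returns (output_pairs, counters).

    The counter names are hypothetical, chosen only for this sketch.
    """
    counters = Counter()
    map_output = []
    for split in splits:
        pairs = []
        for line in split:
            counters["map_input_records"] += 1
            pairs.extend(mapper(line))
        counters["map_output_records"] += len(pairs)
        # The combiner only runs when there is a reduce phase.
        if reducer is not None and combiner is not None:
            pairs = [out for k, vs in group_by_key(pairs) for out in combiner(k, vs)]
        map_output.extend(pairs)

    if reducer is None:
        # Zero reducers: map output is the job output, unsorted.
        return map_output, counters

    counters["reduce_input_records"] = len(map_output)
    result = [out for k, vs in group_by_key(map_output) for out in reducer(k, vs)]
    return result, counters


splits = [["a b a"], ["b a"]]  # two tiny input splits

map_only, map_only_counters = run_job(splits, wc_map)
identity, _ = run_job(splits, wc_map, reducer=identity_reduce)
plain, plain_counters = run_job(splits, wc_map, reducer=sum_reduce)
combined, combined_counters = run_job(
    splits, wc_map, reducer=sum_reduce, combiner=sum_reduce
)
```

Comparing `plain_counters` with `combined_counters` shows the point of a combiner: the final counts are identical, but fewer records reach the reduce phase.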
Studied mock questions on the net, some had wrong answers, so be careful. Was going to purchase from HadoopExam.com, but they required submitting my MS Product Key, and I wasn't willing to do that. A shame, they advertised 200 questions for $45.
Took me a couple months of hardcore studying until it all became clear. The book took the most time, reading about 270 pages twice and doing notes.
In the future, I plan to also do the Cloudera certs for HBase, Hadoop admin, and Data Scientist. Will also take the Coursera Scala class, which comes with a signed certificate. And maybe check into Apache Cassandra (no cert yet from what I could tell).
I am in the process of taking the Coursera class called "Introduction to Data Science". We got to practice Hadoop on Amazon's platform.
You would probably like this class if it comes around again.
Thanks for the heads up, I sure wish there were more hours in a day...
I have also cleared the Hadoop Developer and Administrator exams on the first attempt. The Cloudera exam should not be taken lightly, both because of its cost and because of the level of the questions. Below is what I followed to prepare for the exam. However, I want to clarify that www.HadoopExam.com does not ask for any Microsoft (MS) keys for the simulator. Once you install the trial version of their simulator on your computer, the software generates a unique key, which they use to identify the machine and prevent theft of the software, and I don't see anything wrong with that.
I passed both CCAH & CCDH and here is what I did in that order.
1) www.HadoopExam.com Training Videos and Certification Simulator
Their videos are well designed for a core understanding of the Hadoop architecture. The lectures are precise yet comprehensive, which cuts learning time significantly and gives a definite edge in the certification exam and job interviews. The trainer did a great job of delivering the essential material in a concise and effective manner that gives the learner a very good foundation in the Hadoop framework.
I'm in the middle of the class now. It's pretty interesting; the teacher approaches things from a different angle and likes to discuss side issues. The first programming assignment was a simple map-reduce using a Python framework, which is always a sweet language to play with.
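To give a feel for what a small Python map-reduce assignment can look like, here is a minimal sketch. It is not the course's framework: `MiniMapReduce` and its methods are hypothetical names invented for this example, and the inverted-index task is just an illustration, not the actual assignment.

```python
from collections import defaultdict


class MiniMapReduce:
    """Hypothetical in-memory stand-in for a teaching map-reduce framework."""

    def __init__(self):
        self.intermediate = defaultdict(list)  # key -> list of mapped values
        self.results = []

    def emit_intermediate(self, key, value):
        """Called by the mapper to hand a (key, value) pair to the shuffle."""
        self.intermediate[key].append(value)

    def emit(self, value):
        """Called by the reducer to record one output value."""
        self.results.append(value)

    def execute(self, records, mapper, reducer):
        """Run the mapper over every record, then the reducer once per key."""
        for record in records:
            mapper(self, record)
        for key in sorted(self.intermediate):
            reducer(self, key, self.intermediate[key])
        return self.results


def index_mapper(mr, record):
    """Emit (word, doc_id) once for each distinct word in a document."""
    doc_id, text = record
    for word in set(text.lower().split()):
        mr.emit_intermediate(word, doc_id)


def index_reducer(mr, word, doc_ids):
    """Collect the sorted list of documents that contain the word."""
    mr.emit((word, sorted(doc_ids)))


docs = [("d1", "Hadoop stores data"), ("d2", "Hadoop processes data")]
index = MiniMapReduce().execute(docs, index_mapper, index_reducer)
```

The nice part of this style is that the mapper and reducer are plain functions, so they can be tried on a handful of records in memory before anything runs on a real cluster.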
Sorry about the bum steer on the product key, their website could be a little clearer though.