Mohamed, as an alternative to Hadoop, take a look at the HPCC Systems platform. Designed by data scientists, it provides a single architecture, a consistent data-centric programming language (ECL), and two data processing clusters. Its built-in analytics libraries for machine learning and its BI integration provide a complete, integrated solution from data ingestion and processing through to data delivery. Because it is an all-in-one platform, there is only one thing to support, and supporting it takes significantly fewer resources. In contrast, the complexity of the Hadoop ecosystem requires a huge investment in technology and resources, both up front and on an ongoing basis.

The inherent parallelism and dataflow nature of ECL means I don't have to worry about parallelizing my jobs, which I did have to do in my experience with Hadoop MapReduce. In fact, ECL is somewhat similar to SQL in that both are declarative data programming languages, so if you are a good SQL developer, ECL should be a breeze to understand and use. More at http://hpccsystems.com.

HPCC also has a connector for Hadoop data. In fact, a webhdfs implementation (webhdfs being the web-based API provided by Hadoop) was recently released. Specific info at http://hpccsystems.com/h2h
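
For anyone who hasn't used webhdfs before, here is a rough Python sketch of how its REST URLs are put together. This is plain WebHDFS on the Hadoop side, not code from the HPCC connector. The helper name `webhdfs_url`, the host, the port, and the file path are all made up for illustration; check the port against your own NameNode configuration.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, **params):
    """Build a WebHDFS REST URL: http://host:port/webhdfs/v1/<path>?op=<OP>.

    Hypothetical helper for illustration; it only formats the URL and
    does not contact a cluster.
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    # The operation name goes in the 'op' query parameter, e.g. OPEN or LISTSTATUS.
    query = urlencode({"op": op.upper(), **params})
    # quote() percent-encodes unsafe characters but leaves '/' separators intact.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Hypothetical NameNode host, port, and file path.
print(webhdfs_url("namenode.example.com", 50070, "/data/input.csv", "open"))
```

An HTTP GET on a URL like that (with `op=OPEN`) is how a client reads a file over webhdfs, so any tool that can speak HTTP can pull data out of HDFS without the native Hadoop client libraries.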
Thanks Azana. It looks like HPCC is worth a look ...