Hi guys: for those of you who use Hadoop, do you manage your data directly, or do you just dump it all in HBase?
- In general, most people I hear of using Hadoop are doing so to store millions or billions of records for map/reduce.
- The Hadoop m/r API can read from and write to HBase tables directly.
- It seems like storing the results of m/r jobs in programmatically created folders, rather than in a single, machine-managed map (like HBase), is error-prone: mistakes could creep in during the later read and data-cleaning stages.
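
To make the last point a bit more concrete, here is a rough sketch of the two layouts I have in mind. It uses no Hadoop or HBase classes at all: a plain `TreeMap` stands in for a table that is kept sorted by row key, and the folder convention, the key format, and the names `folderFor`, `rowKeyFor` and `scanPrefix` are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustration only: a TreeMap stands in for a table sorted by row key.
// Nothing here touches Hadoop or HBase.
class OutputLayoutSketch {

    // Hypothetical folder convention for one job run. Every reader has to
    // rebuild (or parse) this string exactly the way the writer built it.
    static String folderFor(String job, String date, int run) {
        return "/results/" + job + "/" + date + "/run-" + run;
    }

    // Hypothetical row key. Zero-padding the run number keeps runs in
    // numeric order when keys are compared as strings.
    static String rowKeyFor(String job, String date, int run) {
        return job + "|" + date + "|" + String.format("%04d", run);
    }

    // Return, in key order, every value whose key starts with the prefix.
    static List<String> scanPrefix(TreeMap<String, String> table, String prefix) {
        // '\uffff' sorts after any character used in the keys above.
        SortedMap<String, String> range = table.subMap(prefix, prefix + '\uffff');
        return new ArrayList<>(range.values());
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        table.put(rowKeyFor("wordcount", "2015-01-02", 10), "result-c");
        table.put(rowKeyFor("wordcount", "2015-01-02", 2), "result-b");
        table.put(rowKeyFor("wordcount", "2015-01-01", 1), "result-a");
        table.put(rowKeyFor("pagerank", "2015-01-01", 1), "result-x");

        // Folder layout: one path per run, located by naming convention.
        System.out.println(folderFor("wordcount", "2015-01-02", 2));

        // Keyed layout: one range lookup finds every wordcount run.
        System.out.println(scanPrefix(table, "wordcount|"));
    }
}
```

With folders, the cleanup stage has to list directories and parse names like `run-10`; with a single sorted map, the same question becomes one range lookup on the key prefix. Whether that is worth the overhead of running HBase is exactly what I'm asking.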