Process a file using Hadoop Map Reduce without proper End of Line.

Satyaprakash Joshii
Ranch Hand

Joined: Jun 18, 2012
Posts: 140
The Hadoop Mapper reads each line as a value. I processed a file containing username, comments, etc., where each record (username, comments, etc.) was on its own line. I processed it successfully to extract the comments and do the required manipulation.

Now I have to process a file in which the records are not each on a separate line, i.e. the line breaks are irregular. Can you advise me how to process this file? Since the Hadoop Mapper reads each line as a value, how do I process the file when there is no proper end of line?

Thanks.
Srinivas Mupparapu
Greenhorn

Joined: Feb 12, 2004
Posts: 14

MapReduce uses a record reader behind the scenes, which by default reads one line at a time. You can override this behaviour with a custom record reader and take control of what constitutes a record. Look into the org.apache.hadoop.mapred.RecordReader interface. There are several implementations of this interface available out of the box.
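
To illustrate the idea of deciding for yourself what constitutes a record, here is a minimal sketch in plain Java with no Hadoop dependency. It assumes the records in the file are separated by some known delimiter (the "|" used here is purely hypothetical) and that line breaks inside a record are noise. The class and method names are made up for illustration; in a real job, logic along these lines would sit inside a custom RecordReader implementation rather than a standalone class.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: splits a character stream into
// records on a delimiter instead of on line breaks.
class DelimitedRecords {

    // Reads characters until the delimiter is seen and returns one string per
    // record, ignoring where the physical line breaks fall.
    // Assumption: the delimiter itself contains no line-break characters.
    static List<String> split(Reader in, String delimiter) throws IOException {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            // Assumption: a line break inside a record is noise, so it becomes a space.
            current.append(c == '\n' || c == '\r' ? ' ' : (char) c);
            if (endsWith(current, delimiter)) {
                current.setLength(current.length() - delimiter.length());
                addIfNotBlank(records, current);
                current.setLength(0);
            }
        }
        // The last record may not be followed by a delimiter.
        addIfNotBlank(records, current);
        return records;
    }

    private static boolean endsWith(StringBuilder sb, String suffix) {
        int start = sb.length() - suffix.length();
        return start >= 0 && sb.indexOf(suffix, start) == start;
    }

    private static void addIfNotBlank(List<String> out, StringBuilder sb) {
        String record = sb.toString().trim();
        if (!record.isEmpty()) {
            out.add(record);
        }
    }

    public static void main(String[] args) throws IOException {
        // Three records; two of them contain stray line breaks.
        String input = "alice,nice post|bob,I\ndisagree|carol,\nthanks";
        for (String record : split(new StringReader(input), "|")) {
            System.out.println(record);
        }
    }
}
```

If your records really are separated by a fixed delimiter, it may also be worth checking whether your Hadoop version honours the textinputformat.record.delimiter configuration property, which could let the stock TextInputFormat do this without any custom code. If the records have no reliable delimiter at all, the splitting rule above would need to be replaced with whatever does identify a record boundary in your data.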
 