Hadoop: from unzipping to processing XML files

Rahul Mahindrakar
Ranch Hand

Joined: Jul 28, 2000
Posts: 1861

I have tar files which contain text files with XML like

I am just starting out with Hadoop and need some high-level guidance on how I should go about doing this.

1) How do I scp the files over to a location where I can provide them to Hadoop? Is there some component or framework for this?
2) How do I untar the files once they are received? I have googled and there seem to be some components, but does anyone here have prior experience with this?
3) How do I convert multi-line text + XML into a single line for me to process, like
4) How do I then process this line? Should I process it as text or as XML? I guess for beginners text is OK.
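
To make point 3 concrete, here is a minimal sketch in plain Java of what collapsing a multi-line text + XML record into one line could look like. It matters because Hadoop's default TextInputFormat treats each line of a file as one record. The class name LineFlattener and the method flatten are hypothetical, not part of Hadoop, and the sketch assumes the whole string passed in is exactly one record, which may not match the real files.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not a Hadoop API.
// Assumption: the input string holds exactly one logical record.
class LineFlattener {

    // Collapses a multi-line record into a single line:
    // splits on any line break, trims each line, drops empty lines,
    // and joins the remaining pieces with a single space.
    static String flatten(String record) {
        List<String> parts = new ArrayList<>();
        for (String line : record.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                parts.add(trimmed);
            }
        }
        return String.join(" ", parts);
    }

    public static void main(String[] args) {
        String record = "header text\n<doc>\r\n  <id>1</id>\n\n</doc>\n";
        System.out.println(flatten(record));
    }
}
```

Joining with a space changes whitespace inside XML text content, so this only suits data where that whitespace is not significant.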

I just need some ideas.

Rahul M.
