
Hadoop unzipping to processing XML files

Rahul Mahindrakar
Ranch Hand

Joined: Jul 28, 2000
Posts: 1836

I have tar files that contain text files with XML in them.

I am just starting out with Hadoop and need some high-level guidance on how I should go about doing this:

1) How do I scp the files over to a place where I can provide them to Hadoop? Is there some component or framework for this?
2) How do I untar a file once it is received? I have googled and there seem to be some components, but does anyone here have prior experience with them?
3) How do I convert multi-line text + XML into a single line that I can process?
4) How do I then process this line? Should I process it as text or as XML? I guess for beginners text is OK.
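
To make steps 2 and 3 concrete, here is a minimal sketch of one possible approach, not a recommendation of a specific Hadoop component. It assumes Python's standard library, UTF-8 text, and one record per file inside the archive; the function names (`iter_text_members`, `flatten_record`, `build_sample_tar`) and the sample content are hypothetical and only there for illustration.

```python
import io
import re
import tarfile


def iter_text_members(tar_bytes):
    """Yield (name, text) for each regular file in an in-memory tar archive.

    Assumption: every member is UTF-8 text. Mode "r:*" lets tarfile detect
    plain, gzip, or bzip2 archives on its own.
    """
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, and other special entries
            handle = archive.extractfile(member)
            yield member.name, handle.read().decode("utf-8")


def flatten_record(text):
    """Collapse a multi-line text + XML record into a single line.

    Every run of whitespace (including newlines) becomes one space, so
    whitespace inside element content is normalized as well.
    """
    return re.sub(r"\s+", " ", text).strip()


def build_sample_tar():
    """Build a tiny tar archive in memory (hypothetical sample data)."""
    payload = b"header line\n<doc>\n  <id>1</id>\n</doc>\n"
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        info = tarfile.TarInfo(name="sample.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


if __name__ == "__main__":
    for name, text in iter_text_members(build_sample_tar()):
        # One output line per record, ready to be written to a file.
        print(flatten_record(text))
```

If the flattened lines are written to a plain file, that file could then be copied into HDFS with `hdfs dfs -put`. Since Hadoop's default text input is line-oriented, treating each line as text (step 4) would work without any XML-specific input format.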

I just need some ideas.

Rahul M.
