posted 11 years ago
Hi all,
Here is the solution: use the Nutch API to extract the data. Under the crawl/segments folder, each segment holds the content, parsed text, parsed data, etc.
Sample code to read data from the Hadoop file system using the Nutch 1.6 API
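As a minimal sketch of the first step only (not the poster's original code), the snippet below locates the data files inside one segment. It assumes the segment sits on the local filesystem, as in Nutch's local mode, and follows the usual `<segment>/<part>/part-NNNNN/data` layout; the class name `SegmentDataLocator` and the default segment path are hypothetical. It uses only the Java standard library, so for a segment on HDFS the same walk would have to go through Hadoop's FileSystem API instead.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: lists the data files of one Nutch segment directory.
final class SegmentDataLocator {
    // Subdirectories mentioned in the post: raw content, parsed text, parsed data.
    static final String[] PARTS = {"content", "parse_text", "parse_data"};

    // Returns every existing <segment>/<part>/part-NNNNN/data file, sorted by path.
    static List<Path> findDataFiles(Path segment) throws IOException {
        List<Path> found = new ArrayList<>();
        for (String part : PARTS) {
            Path dir = segment.resolve(part);
            if (!Files.isDirectory(dir)) {
                continue; // this part was not written for the segment
            }
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "part-*")) {
                for (Path partDir : stream) {
                    Path data = partDir.resolve("data");
                    if (Files.isRegularFile(data)) {
                        found.add(data);
                    }
                }
            }
        }
        Collections.sort(found);
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default; pass the real segment directory as the first argument.
        Path segment = Paths.get(args.length > 0 ? args[0] : "crawl/segments/20130101000000");
        for (Path p : findDataFiles(segment)) {
            System.out.println(p);
        }
    }
}
```

Each path found this way is a Hadoop SequenceFile keyed by URL (`Text`). It can then be opened with Hadoop's `SequenceFile.Reader`, with Nutch's `Content`, `ParseText`, or `ParseData` as the value class depending on the subdirectory. To just dump a segment to text, `bin/nutch readseg -dump` may be enough.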
Every search starts with beginner's luck, and every search ends with the victor being severely tested.