
Parsing Webpage and Links

 
maverickml venkatesh
Greenhorn
Posts: 8
Hi,

Sorry if I have posted this in the wrong forum.

I have a webpage with links in a table. All I need to do is navigate to each of these links and gather information from those pages.

As of now, I can think of writing a web crawler in Java (again, this has to be supported by the site).

Are there any other solutions, such as collecting the webpage response and parsing the HTML?

Please suggest.
 
Tim Moores
Bartender
Posts: 2687
The HttpUnit library (or jWebUnit, which builds on top of it) is perfect for that.
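For illustration only, here is a minimal sketch of the "collect the response and parse the HTML" idea from the question. It uses just the JDK, not HttpUnit or jWebUnit, and a regular expression as a rough stand-in for a real HTML parser. The class name LinkExtractor, the method extractLinks, and the example.com URLs are hypothetical, and the page source is assumed to be already available as a string.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HttpUnit or jWebUnit.
class LinkExtractor {
    // Matches a quoted href attribute inside an <a> tag. A regex like this
    // misses unusual markup (for example unquoted attribute values).
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns every link target found in the HTML, resolved against the
    // URI of the page the HTML came from.
    static List<URI> extractLinks(String html, URI base) {
        List<URI> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String href = m.group(2).trim();
            // Skip empty targets, in-page anchors, and script pseudo-links.
            if (href.isEmpty() || href.startsWith("#")
                    || href.startsWith("javascript:")) {
                continue;
            }
            // Note: resolve() throws IllegalArgumentException if href is
            // not a syntactically valid URI (for example, contains spaces).
            links.add(base.resolve(href));
        }
        return links;
    }

    public static void main(String[] args) {
        // Stand-in for the real page: a table whose cells contain links.
        String html = "<table><tr>"
                + "<td><a href=\"/items/1\">One</a></td>"
                + "<td><a href='detail.html?id=2'>Two</a></td>"
                + "</tr></table>";
        URI base = URI.create("http://example.com/list/index.html");
        for (URI link : extractLinks(html, base)) {
            System.out.println(link);
        }
    }
}
```

Each resolved URI could then be fetched in turn and its page examined the same way. A library such as HttpUnit would take over both the fetching and the link extraction, so the regex above is only there to keep the sketch dependency-free.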
 