Java code for extracting data from a HTML Table from a web page

 
Nandu Vajjala
Greenhorn
Posts: 8
Hi
We need some pointers on how to extract data from an HTML table on a web page using a Java program.
For instance: http://www.fsa.gov.uk/ukla/hcaList.do

The page at the above link contains a table in the following format:

Company name          | Country of Incorporation | Home member state
3I INFRASTRUCTURE PLC | CHANNEL ISLANDS          | UNITED KINGDOM
888 HOLDINGS PLC      | GIBRALTAR                | UNITED KINGDOM

We need to extract the data and write it out as a CSV file.

Thanks
Anand Vardhan


 
Amit Vinod Dali
Ranch Hand
Posts: 42
You can use an HTML parser for this requirement; see:
Open Source HTML Parsers in Java
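
As a rough illustration of the parser approach, here is a minimal sketch that uses the HTML parser bundled with the JDK (javax.swing.text.html) rather than any of the open source libraries on that list. The class name HtmlTableToCsv and the inline sample HTML are made up for the example; the sample rows are the ones quoted in the question. It assumes a single, simple table with no nested tables, and the bundled parser targets old HTML (3.2), so it may struggle with modern pages.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class, not taken from any library.
class HtmlTableToCsv {

    /** Collects the text of every td/th cell, one list per tr. */
    static List<List<String>> extractRows(String html) throws IOException {
        List<List<String>> rows = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            private List<String> currentRow;   // non-null while inside a tr
            private StringBuilder currentCell; // non-null while inside a td/th

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.TR) {
                    currentRow = new ArrayList<>();
                } else if ((tag == HTML.Tag.TD || tag == HTML.Tag.TH) && currentRow != null) {
                    currentCell = new StringBuilder();
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                if (currentCell != null) {
                    // Join text fragments inside one cell with a single space.
                    if (currentCell.length() > 0) {
                        currentCell.append(' ');
                    }
                    currentCell.append(new String(data).trim());
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if ((tag == HTML.Tag.TD || tag == HTML.Tag.TH) && currentCell != null) {
                    currentRow.add(currentCell.toString());
                    currentCell = null;
                } else if (tag == HTML.Tag.TR && currentRow != null) {
                    if (currentCell != null) { // cell left open when the row ended
                        currentRow.add(currentCell.toString());
                        currentCell = null;
                    }
                    rows.add(currentRow);
                    currentRow = null;
                }
            }
        };
        // 'true' tells the parser to ignore any charset declaration in the HTML.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return rows;
    }

    /** Quotes a field if it contains a comma, a double quote or a line break. */
    static String csvField(String value) {
        if (value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r")) {
            return "\"" + value.replace("\"", "\"\"") + "\"";
        }
        return value;
    }

    /** Renders the rows as CSV text, one line per table row. */
    static String toCsv(List<List<String>> rows) {
        StringBuilder out = new StringBuilder();
        for (List<String> row : rows) {
            List<String> fields = new ArrayList<>();
            for (String cell : row) {
                fields.add(csvField(cell));
            }
            out.append(String.join(",", fields)).append("\n");
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Sample markup built from the rows quoted in the question.
        String html = "<html><body><table>"
            + "<tr><th>Company name</th><th>Country of Incorporation</th><th>Home member state</th></tr>"
            + "<tr><td>3I INFRASTRUCTURE PLC</td><td>CHANNEL ISLANDS</td><td>UNITED KINGDOM</td></tr>"
            + "<tr><td>888 HOLDINGS PLC</td><td>GIBRALTAR</td><td>UNITED KINGDOM</td></tr>"
            + "</table></body></html>";
        System.out.print(toCsv(extractRows(html)));
    }
}
```

The quoting in csvField matters for company names that contain commas. For real-world pages, a dedicated parser from the list above is likely to be more tolerant of broken markup than the JDK one.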
 
Ulf Dittmer
Rancher
Posts: 42967
The easiest approach would be to use a library like HtmlUnit, which provides various ways to extract data from a web page (and also handles the HTTP communication, cookies, etc.).
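
To make the "HTTP communication, cookies" part concrete, here is a small sketch of what that plumbing looks like without HtmlUnit, using only the JDK's own java.net.http.HttpClient (Java 11 or later). This is not HtmlUnit's API. The class name PageFetcher, the timeouts and the User-Agent string are assumptions made for the example, and the URL is the one from the question, which may no longer be reachable.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper class, not part of HtmlUnit or any other library.
class PageFetcher {

    /** Builds a client that keeps cookies in memory and follows redirects. */
    static HttpClient newClient() {
        return HttpClient.newBuilder()
            .cookieHandler(new CookieManager())
            .followRedirects(HttpClient.Redirect.NORMAL)
            .connectTimeout(Duration.ofSeconds(10)) // assumed value
            .build();
    }

    /** Builds a plain GET request for the given URL. */
    static HttpRequest newRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
            .timeout(Duration.ofSeconds(30))                // assumed value
            .header("User-Agent", "table-extractor-sketch") // assumed value
            .GET()
            .build();
    }

    /** Downloads the page body as text, failing on any non-200 status. */
    static String fetch(String url) throws IOException, InterruptedException {
        HttpResponse<String> response =
            newClient().send(newRequest(url), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException(
                "Unexpected HTTP status " + response.statusCode() + " for " + url);
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // URL taken from the question; pass another one as the first argument.
        String url = args.length > 0 ? args[0] : "http://www.fsa.gov.uk/ukla/hcaList.do";
        System.out.println(fetch(url).length() + " characters downloaded");
    }
}
```

The returned string still has to be handed to an HTML parser to pull out the table cells. A library like HtmlUnit bundles both steps, which is why it tends to be the shorter route.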