GeeCON Prague 2014*



Java code for extracting data from a HTML Table from a web page

Nandu Vajjala
Greenhorn

Joined: Nov 12, 2005
Posts: 8
Hi
We need some pointers on how to extract data from an HTML table on a web page using a Java program.
For instance: http://www.fsa.gov.uk/ukla/hcaList.do

The page at the above link has a table in the following format:

Company name | Country of Incorporation | Home member state
3I INFRASTRUCTURE PLC | CHANNEL ISLANDS | UNITED KINGDOM
888 HOLDINGS PLC | GIBRALTAR | UNITED KINGDOM

We need to extract the data and write it to a CSV file.

Thanks
Anand Vardhan


Amit Vinod Dali
Ranch Hand

Joined: Apr 14, 2010
Posts: 42
You can use an HTML parser for this requirement. See:
Open Source HTML Parsers in Java
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42031
    
The easiest approach would be to use a library like HtmlUnit, which provides various ways to extract data from a web page (and also handles the HTTP communication, cookies, etc.).
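
For illustration, here is a minimal sketch of the table-to-CSV step discussed in this thread. It does not use HtmlUnit; to stay free of external dependencies it swaps in the HTML parser bundled with the JDK (javax.swing.text.html.parser.ParserDelegator). The class name TableToCsv, the helper names, and the sample markup are assumptions made for the example: the real page's HTML is not shown in this thread and may be structured differently (nested tables, multiple tables, missing end tags), and fetching the page over HTTP is left out.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical example class; not taken from any library mentioned in the thread.
class TableToCsv {

    /** Collects the text of every <td>/<th> cell, grouped by <tr> row. */
    static class TableCallback extends HTMLEditorKit.ParserCallback {
        final List<List<String>> rows = new ArrayList<>();
        private List<String> currentRow;
        private StringBuilder currentCell;

        @Override
        public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
            if (tag == HTML.Tag.TR) {
                currentRow = new ArrayList<>();
            } else if (tag == HTML.Tag.TD || tag == HTML.Tag.TH) {
                currentCell = new StringBuilder();
            }
        }

        @Override
        public void handleText(char[] data, int pos) {
            // Text outside a cell (for example between rows) is ignored.
            if (currentCell != null) {
                currentCell.append(data);
            }
        }

        @Override
        public void handleEndTag(HTML.Tag tag, int pos) {
            if ((tag == HTML.Tag.TD || tag == HTML.Tag.TH) && currentCell != null) {
                if (currentRow != null) {
                    currentRow.add(currentCell.toString().trim());
                }
                currentCell = null;
            } else if (tag == HTML.Tag.TR && currentRow != null) {
                rows.add(currentRow);
                currentRow = null;
            }
        }
    }

    /** Quotes a field if it contains a comma, a double quote, or a line break. */
    static String csvField(String value) {
        if (value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r")) {
            return "\"" + value.replace("\"", "\"\"") + "\"";
        }
        return value;
    }

    /** Parses the HTML and returns one CSV line per table row found. */
    static String toCsv(String html) {
        TableCallback callback = new TableCallback();
        try (Reader reader = new StringReader(html)) {
            new ParserDelegator().parse(reader, callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        StringBuilder csv = new StringBuilder();
        for (List<String> row : callback.rows) {
            List<String> fields = new ArrayList<>();
            for (String cell : row) {
                fields.add(csvField(cell));
            }
            csv.append(String.join(",", fields)).append("\n");
        }
        return csv.toString();
    }

    public static void main(String[] args) {
        // Sample markup is assumed; it only mirrors the rows quoted in the question.
        String html = "<html><body><table>"
                + "<tr><th>Company name</th><th>Country of Incorporation</th>"
                + "<th>Home member state</th></tr>"
                + "<tr><td>3I INFRASTRUCTURE PLC</td><td>CHANNEL ISLANDS</td>"
                + "<td>UNITED KINGDOM</td></tr>"
                + "<tr><td>888 HOLDINGS PLC</td><td>GIBRALTAR</td>"
                + "<td>UNITED KINGDOM</td></tr>"
                + "</table></body></html>";
        System.out.print(toCsv(html));
    }
}
```

The sketch treats every row of every table on the page alike, so a real page would probably need extra filtering to pick out the right table. A dedicated library such as HtmlUnit is likely the more robust choice once HTTP, cookies, or messy markup are involved.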


Ping & DNS - my free Android networking tools app
 
 