Java code for extracting data from a HTML Table from a web page

Nandu Vajjala
Greenhorn

Joined: Nov 12, 2005
Posts: 8
Hi
We need some pointers on how to extract data from an HTML table on a web page using a Java program.
For instance: http://www.fsa.gov.uk/ukla/hcaList.do

The page at the above link has a table in the following format:

Company name           | Country of Incorporation | Home member state
3I INFRASTRUCTURE PLC  | CHANNEL ISLANDS          | UNITED KINGDOM
888 HOLDINGS PLC       | GIBRALTAR                | UNITED KINGDOM

We need to extract the data and convert it to a CSV file.

Thanks
Anand Vardhan
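
A minimal sketch of the conversion being asked about, assuming the page's HTML has already been downloaded into a string and that the table is simple, well-formed markup (one <tr> per row, <th>/<td> cells, no nested tables). The class name HtmlTableToCsv and its methods are hypothetical names chosen for illustration. It uses regular expressions only to stay within the JDK; for real-world pages a proper HTML parser would be more robust.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HtmlTableToCsv {

    // Matches one table row and captures everything between <tr ...> and </tr>.
    private static final Pattern ROW = Pattern.compile(
            "<tr\\b[^>]*>(.*?)</tr>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Matches one header or data cell and captures its inner HTML.
    private static final Pattern CELL = Pattern.compile(
            "<t[hd]\\b[^>]*>(.*?)</t[hd]>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Converts every table row found in the HTML into one CSV line. */
    public static String tableToCsv(String html) {
        StringBuilder csv = new StringBuilder();
        Matcher row = ROW.matcher(html);
        while (row.find()) {
            List<String> fields = new ArrayList<>();
            Matcher cell = CELL.matcher(row.group(1));
            while (cell.find()) {
                fields.add(escape(cleanText(cell.group(1))));
            }
            // Skip rows that contain no cells at all.
            if (!fields.isEmpty()) {
                csv.append(String.join(",", fields)).append("\n");
            }
        }
        return csv.toString();
    }

    /** Strips nested tags, decodes two common entities, collapses whitespace. */
    public static String cleanText(String cellHtml) {
        return cellHtml.replaceAll("<[^>]+>", "")
                .replace("&nbsp;", " ")
                .replace("&amp;", "&")
                .replaceAll("\\s+", " ")
                .trim();
    }

    /** Quotes a field if it contains a comma, a double quote, or a newline. */
    public static String escape(String field) {
        if (field.contains(",") || field.contains("\"") || field.contains("\n")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    public static void main(String[] args) {
        // Sample markup built from the rows quoted in the post above.
        String html = "<table>"
                + "<tr><th>Company name</th><th>Country of Incorporation</th>"
                + "<th>Home member state</th></tr>"
                + "<tr><td>3I INFRASTRUCTURE PLC</td><td>CHANNEL ISLANDS</td>"
                + "<td>UNITED KINGDOM</td></tr>"
                + "<tr><td>888 HOLDINGS PLC</td><td>GIBRALTAR</td>"
                + "<td>UNITED KINGDOM</td></tr>"
                + "</table>";
        System.out.print(tableToCsv(html));
    }
}
```

Running main prints one comma-separated line per table row, header first. To produce a file, the returned string could be written out with java.nio.file.Files.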


Praveen mourya Kumar
Greenhorn

Joined: Oct 14, 2008
Posts: 16
Hi,

Your problem can be solved by using a web crawler in Java. For more help, see the following link:
http://java.sun.com/developer/technicalArticles/ThirdParty/WebCrawler/
Good Luck


SCJP 1.6..
Alec Lee
Ranch Hand

Joined: Jan 28, 2004
Posts: 569
You would be better off using JavaScript to extract the HTML data. Using Java means your program needs to act as an HTTP client. Although open source solutions such as Jakarta's HttpClient already exist, JavaScript is a much better choice because the browser already supports it. In particular, you will probably need to use an HTA (HTML Application) file (see MSDN for details; it is basically an HTML file with embedded JavaScript, renamed to .hta).
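
For comparison, here is a rough sketch of what "Java acting as the client" can look like. It does not use Jakarta's HttpClient, which the post mentions. Instead it swaps in the JDK's built-in java.net.http.HttpClient and so assumes Java 11 or later. The class name PageFetcher, the User-Agent string, and the 30-second timeout are illustrative choices. The URL is the one from the first post, used only as a default argument, and it may no longer resolve.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class PageFetcher {

    /** Builds a plain GET request for the given URL (no network access yet). */
    public static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(30))                    // assumed timeout
                .header("User-Agent", "table-extractor-sketch/0.1") // hypothetical agent name
                .GET()
                .build();
    }

    /** Downloads the page body as a string; fails on any non-200 status. */
    public static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException(
                    "Unexpected HTTP status " + response.statusCode() + " for " + url);
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Pass a URL on the command line, or fall back to the one from the first post.
        String url = args.length > 0 ? args[0] : "http://www.fsa.gov.uk/ukla/hcaList.do";
        System.out.println(fetch(url));
    }
}
```

The returned HTML string is what a table-parsing step would then consume. Pages that build their table with client-side script would not work with this approach, since no JavaScript is executed.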
Christophe Verré
Sheriff

Joined: Nov 24, 2005
Posts: 14688

Do not duplicate threads. Closing this one. Continue there.


All roads lead to JavaRanch
 