reading a website data to build a dashboard

rammie singh
Ranch Hand

Joined: Mar 26, 2009
Posts: 116
Hi guys,

I want to build a dashboard from data fetched from a website.

That is, given the content of a web page, I want to produce an .xml file that stores the data from that page. That .xml file will then be used to generate a dashboard report.

Say a web page displays various information about movies, such as revenue, production cost and so on.

I want to store the top 10 movies from that page (by some criterion, say revenue) in an .xml file. The file will be generated on my machine and then sent on to create the dashboard report.

Question

Is this possible? If so, how can I read a web page's content and create an .xml file from it?
Please suggest a suitable way to do this.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41863
    
There's no natural mapping from an HTML page to XML, so you'll need to code that yourself. I'd approach this using a library like HtmlUnit that makes it easy to access a web site programmatically. It cleans the HTML so it becomes well-formed XML, and then presents a DOM and XPath interface that you can use to extract whichever parts of the page you're interested in.
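To illustrate the DOM and XPath step described above, here is a minimal sketch that uses only the JDK's built-in XML APIs rather than HtmlUnit itself. It assumes the page has already been fetched and cleaned into well-formed markup (no DOCTYPE, no namespaces), and that the movie data sits in a table with id "movies" holding title, revenue and cost columns. The class name, the table layout and the output element names are all made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class MovieDashboardXml {

    /** One data row of the hypothetical movie table. */
    static final class Movie {
        final String title;
        final long revenue;
        final long cost;

        Movie(String title, long revenue, long cost) {
            this.title = title;
            this.revenue = revenue;
            this.cost = cost;
        }
    }

    /** Parses well-formed markup and reads each data row of table#movies via XPath. */
    static List<Movie> extract(String wellFormedHtml) {
        try {
            Document page = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wellFormedHtml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // tr[td] skips the header row, which only contains <th> cells.
            NodeList rows = (NodeList) xpath.evaluate(
                    "//table[@id='movies']/tr[td]", page, XPathConstants.NODESET);
            List<Movie> movies = new ArrayList<>();
            for (int i = 0; i < rows.getLength(); i++) {
                Element row = (Element) rows.item(i);
                movies.add(new Movie(
                        xpath.evaluate("td[1]", row).trim(),
                        Long.parseLong(xpath.evaluate("td[2]", row).trim()),
                        Long.parseLong(xpath.evaluate("td[3]", row).trim())));
            }
            return movies;
        } catch (Exception e) {
            throw new IllegalStateException("could not read the page", e);
        }
    }

    /** Builds the report XML for the 'limit' movies with the highest revenue. */
    static String toXml(List<Movie> movies, int limit) {
        try {
            Document out = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = out.createElement("movies");
            out.appendChild(root);
            movies.stream()
                    .sorted(Comparator.comparingLong((Movie m) -> m.revenue).reversed())
                    .limit(limit)
                    .forEach(m -> {
                        Element movie = out.createElement("movie");
                        addChild(out, movie, "title", m.title);
                        addChild(out, movie, "revenue", Long.toString(m.revenue));
                        addChild(out, movie, "cost", Long.toString(m.cost));
                        root.appendChild(movie);
                    });
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(out), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not build the XML", e);
        }
    }

    private static void addChild(Document doc, Element parent, String name, String text) {
        Element child = doc.createElement(name);
        child.setTextContent(text);
        parent.appendChild(child);
    }

    public static void main(String[] args) {
        String page = "<html><body><table id='movies'>"
                + "<tr><th>Title</th><th>Revenue</th><th>Cost</th></tr>"
                + "<tr><td>Alpha</td><td>300</td><td>100</td></tr>"
                + "<tr><td>Beta</td><td>500</td><td>200</td></tr>"
                + "</table></body></html>";
        System.out.println(toXml(extract(page), 10));
    }
}
```

With a real site, the fetching and cleaning would be done by the library, and the XPath expression would have to match that page's actual structure.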


rammie singh
Ranch Hand

Joined: Mar 26, 2009
Posts: 116
Hi Ulf,

Thanks for your response.
You said that we can use a library like HtmlUnit. Does this library already exist, or do we need to write it ourselves?
Or is there some other tool for reading the contents of a web page?
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41863
    
A quick search for "htmlunit" will answer that.
James Ward
Ranch Hand

Joined: Apr 27, 2003
Posts: 263
You can fetch the HTML page.
Run regular expressions on it to extract the data.
And then create XML out of the data.

This was the approach that a few web-content aggregation products used to follow.
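A rough sketch of the regular-expression route, assuming the HTML has already been fetched into a String and that each movie sits in a simple two-cell table row (title, then revenue). The class name, the row layout and the output element names are invented for the example, and it assumes the cell text contains no HTML entities. A real page would need a pattern written against its actual markup, and the pattern breaks as soon as that markup changes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexMovieScraper {

    // Matches one row of the assumed layout: <tr><td>Title</td><td>12345</td></tr>
    private static final Pattern ROW = Pattern.compile(
            "<tr>\\s*<td>([^<]+)</td>\\s*<td>(\\d+)</td>\\s*</tr>",
            Pattern.CASE_INSENSITIVE);

    /** Turns every matching table row into a <movie> element. */
    static String toXml(String html) {
        StringBuilder xml = new StringBuilder("<movies>");
        Matcher m = ROW.matcher(html);
        while (m.find()) {
            xml.append("<movie><title>").append(escape(m.group(1).trim()))
               .append("</title><revenue>").append(m.group(2))
               .append("</revenue></movie>");
        }
        return xml.append("</movies>").toString();
    }

    /** Escapes the characters that are not allowed as-is in XML text. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String html = "<table><tr><td>Alpha</td><td>300</td></tr>"
                + "<tr><td>Beta</td><td>500</td></tr></table>";
        System.out.println(toXml(html));
    }
}
```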
Sean Clark
Rancher

Joined: Jul 15, 2009
Posts: 377

Hey,

I have also used a library called JTidy to get XML from HTML to allow me to extract data.

Sean

