
Reading website data to build a dashboard

 
rammie singh
Ranch Hand
Posts: 116
Hi guys,
I want to build a dashboard from data fetched from a website.

That is, I want to read the content of a web page and produce an .xml file that stores the data from that page. That .xml file will then be used to generate a dashboard report.

Say a web page displays various information about movies, such as revenue, production cost, and so on.

I want to store the top 10 movies from that page (by some criterion, say revenue) in an .xml file. The file will be generated on my machine and then sent on to create the dashboard report.

Question

Is this possible? If so, how can I read the content of a web page and create an .xml file from it?
Please suggest a suitable approach.
 
Ulf Dittmer
Rancher
Posts: 42966
There's no natural mapping from an HTML page to XML, so you'll need to code that yourself. I'd approach this using a library like HtmlUnit that makes it easy to access a web site programmatically. It cleans the HTML so it becomes well-formed XML, and then presents a DOM and XPath interface that you can use to extract whichever parts of the page you're interested in.
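The extraction step described above can be sketched with the JDK's own DOM and XPath classes. This is only an illustration and not HtmlUnit's API: it assumes the page has already been fetched and cleaned into well-formed XHTML (the part a library like HtmlUnit would handle), and the table id `movies`, its two-column layout, and the class name `XPathExtractSketch` are all hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathExtractSketch {

    /**
     * Returns one "title|revenue" string per data row.
     * Assumes well-formed markup with a hypothetical <table id="movies">
     * whose data rows hold the title in td[1] and the revenue in td[2].
     */
    static List<String> extractRows(String xhtml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Select only rows that contain <td> cells, which skips the header row.
        NodeList rows = (NodeList) xpath.evaluate(
                "//table[@id='movies']/tr[td]", doc, XPathConstants.NODESET);

        List<String> out = new ArrayList<>();
        for (int i = 0; i < rows.getLength(); i++) {
            // Evaluate relative paths against each row node.
            String title = xpath.evaluate("td[1]", rows.item(i));
            String revenue = xpath.evaluate("td[2]", rows.item(i));
            out.add(title.trim() + "|" + revenue.trim());
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a page that has already been cleaned to XHTML.
        String xhtml = "<html><body><table id='movies'>"
                + "<tr><th>Title</th><th>Revenue</th></tr>"
                + "<tr><td>Movie A</td><td>120</td></tr>"
                + "<tr><td>Movie B</td><td>340</td></tr>"
                + "</table></body></html>";
        System.out.println(extractRows(xhtml));
    }
}
```

Real pages are rarely well-formed, which is why a cleaning step has to come before a parser like this one will accept the markup.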
 
rammie singh
Ranch Hand
Posts: 116
Hi Ulf,

Thanks for your response.
You said we can use a library like HtmlUnit. Does this library already exist, or do we need to write it ourselves?
Or is there any tool that reads the contents of a web page?
 
Ulf Dittmer
Rancher
Posts: 42966
A quick search for "htmlunit" will answer that.
 
James Ward
Ranch Hand
Posts: 263
You can fetch the HTML page.
Run regular expressions on it to extract the data.
Then create XML from the data.

This was the approach that a few web-content aggregation products used to follow.
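A rough sketch of the extract-and-create-XML steps above, using only the JDK. It assumes the HTML has already been fetched into a string, and that each movie sits in a row shaped like `<tr><td>Title</td><td>Revenue</td></tr>` with a whole-number revenue. That row layout, the `movies`/`movie` element names, and the class name `RegexToXmlSketch` are made up for illustration. Regular expressions like this tend to break as soon as the page markup changes, so treat it as a starting point only.

```java
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class RegexToXmlSketch {

    // Hypothetical row layout: <tr><td>Title</td><td>Revenue</td></tr>
    private static final Pattern ROW = Pattern.compile(
            "<tr>\\s*<td>([^<]+)</td>\\s*<td>(\\d+)</td>\\s*</tr>");

    /** Extracts {title, revenue} pairs and keeps the top n by revenue. */
    static List<String[]> topByRevenue(String html, int n) {
        List<String[]> movies = new ArrayList<>();
        Matcher m = ROW.matcher(html);
        while (m.find()) {
            movies.add(new String[] { m.group(1).trim(), m.group(2) });
        }
        // Sort by revenue, highest first.
        movies.sort(Comparator
                .comparingLong((String[] r) -> Long.parseLong(r[1]))
                .reversed());
        return movies.subList(0, Math.min(n, movies.size()));
    }

    /** Builds a small XML document; the DOM API handles escaping of the text. */
    static String toXml(List<String[]> movies) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("movies");
        doc.appendChild(root);
        for (String[] movie : movies) {
            Element e = doc.createElement("movie");
            e.setAttribute("revenue", movie[1]);
            e.setTextContent(movie[0]);
            root.appendChild(e);
        }
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for HTML that was fetched from the site.
        String html = "<table>"
                + "<tr><td>Movie A</td><td>120</td></tr>"
                + "<tr><td>Movie B</td><td>340</td></tr>"
                + "<tr><td>Movie C</td><td>75</td></tr>"
                + "</table>";
        System.out.println(toXml(topByRevenue(html, 2)));
    }
}
```

Building the XML through the DOM API rather than by string concatenation means characters such as `&` in a movie title are escaped automatically.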
 
Sean Clark
Rancher
Posts: 377
Hey,

I have also used a library called JTidy to get XML from HTML to allow me to extract data.

Sean
 