Html to Java Object

Mark Spritzler

Joined: Feb 05, 2001
Posts: 17276

I'm sure there has to be an easy way to do this: I want to scrape a small piece of information from a web page. The page is HTML, obviously, so with XPath I should be able to get to the exact element in that page. I then want to take the value in that element and automatically create a simple Java Value Object to hold it.

I want to use JAXB 2 with annotations on the Value Object class, so I can just call an unmarshaller to get the data from the HTML into the Java object.

Does anyone have a good link to example code like this?



Bear Bibeault
Author and ninkuma

Joined: Jan 10, 2002
Posts: 63866

Be aware that HTML is not XML. If your source is XHTML, then you'll have an easier time of things as the markup will be well-formed (if correct).
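To illustrate the well-formed case, here is a minimal sketch that uses only the JDK's built-in XML parser and XPath support. It assumes the input is well-formed XHTML with no DOCTYPE and no default namespace (a real page would likely need extra handling for both). The class name `XhtmlScrapeSketch`, the value object `ScrapedValue`, and the sample markup are made up for the example, and it copies the value by hand instead of going through JAXB.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XhtmlScrapeSketch {

    /** Hypothetical value object holding the single scraped string. */
    public static final class ScrapedValue {
        private final String value;

        public ScrapedValue(String value) {
            this.value = value;
        }

        public String getValue() {
            return value;
        }
    }

    /**
     * Parses well-formed markup and returns the trimmed text of the first
     * node matched by the XPath expression (an empty string if none match).
     */
    public static ScrapedValue extract(String xhtml, String xpathExpression) throws Exception {
        // The default factory is not namespace-aware, which keeps the XPath simple
        // for this sample; namespaced XHTML would need a NamespaceContext.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // evaluate(String, Object) returns the string value of the first match.
        String text = xpath.evaluate(xpathExpression, doc);
        return new ScrapedValue(text.trim());
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample page; assumed well-formed.
        String page = "<html><body><div id=\"quote\">"
                + "<span class=\"price\"> 42.50 </span></div></body></html>";
        ScrapedValue v = extract(page, "//div[@id='quote']/span[@class='price']");
        System.out.println(v.getValue());
    }
}
```

With ordinary, non-well-formed HTML the `parse` call above would throw, which is why a tidying step or a dedicated HTML parser is needed first.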

I haven't had this need in quite some time, but way back when there seemed to be plenty of 3rd-party libraries out there to scrape HTML.

Ulf Dittmer

Joined: Mar 22, 2005
Posts: 42965
Not sure if it satisfies your definition of "Java Value Object", but both TagSoup and NekoHTML can make a DOM object out of HTML (after regularizing it). So you would get a Node or Element object to play with.