What would be the best way to parse HTML content?

Kiran Shirali
Ranch Hand

Joined: Jan 26, 2009
Posts: 34
Hi Everyone,

I need to parse three or four HTML pages to extract data from them.

An example of the pages is:


In this case, what I could do is:



Then, reading the page line by line, I can check each string for the classes 'value' and 'symbol'.

What I want to know is whether there is a more efficient way to do this. The class names may change tomorrow, so I don't want my application to be tightly coupled to the HTML page.

Does anybody have any suggestions?

SCJP 1.6 86%, SCWCD 5.0 on
Ulf Dittmer
Rancher

Joined: Mar 22, 2005
Posts: 42954
HtmlUnit
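
HtmlUnit is a third-party library, and the code below does not use its API. As a dependency-free sketch of the same general idea (let a real HTML parser find the elements, and pass the class names in as data instead of hard-coding them), this uses the HTML parser that ships with the JDK in javax.swing.text.html. The names ClassTextExtractor and extract are made up for illustration, and since the page markup from the question is not shown in the thread, the assumption is simply that each value sits as plain text directly inside an element carrying the class.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: collects the text of elements whose class attribute
// matches one of the requested names. The names are supplied by the caller,
// so they can live in configuration rather than in the parsing code.
class ClassTextExtractor {

    static Map<String, List<String>> extract(String html, Set<String> classNames) {
        final Map<String, List<String>> result = new LinkedHashMap<>();
        for (String name : classNames) {
            result.put(name, new ArrayList<>());
        }

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            // Class of the most recently opened element, if it is one we want.
            private String current;

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                Object cls = attrs.getAttribute(HTML.Attribute.CLASS);
                current = (cls != null && result.containsKey(cls.toString()))
                        ? cls.toString() : null;
            }

            @Override
            public void handleText(char[] data, int pos) {
                if (current != null) {
                    result.get(current).add(new String(data).trim());
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                // Simplification: any closing tag ends the current match.
                current = null;
            }
        };

        try {
            // 'true' tells the parser to ignore any charset declared in the page.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

For a table row such as `<td class="symbol">ABC</td><td class="value">12.5</td>`, calling extract with the set {"symbol", "value"} should give a map with one entry per class name. If the class names were read from a properties file, a rename on the page would only need a configuration change. The Swing parser targets older HTML, so for complex modern pages a dedicated library such as HtmlUnit is probably the safer choice.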
 