
Logic of HTML Parsing

Lalit Nagalkar
Ranch Hand

Joined: Aug 22, 2006
Posts: 47
Hi all,

I want to create a class that can parse an HTML page and build a tree structure showing all the elements along with their attributes and data (such as links to files or text), if any.

I am aware that many people have already built this.
I don't want complete code, just the logic of how it's done and some code snippets.

I would be thankful to all of you for the help.
I hope you have understood what I mean. If anything needs elaboration, please ask.

Lalit Nagalkar

SCJP 1.4
Ulf Dittmer

Joined: Mar 22, 2005
Posts: 42958
As you said, there are a number of decent libraries available that do this (like jTidy, TagSoup, NekoXNI, ...). The easiest might be to study their approach; I'm sure you'd get a wide range of ideas from that.
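To give a rough idea of the logic, here is a minimal sketch of a stack-based tree builder. It is not how jTidy, TagSoup or NekoXNI work internally; it only illustrates the basic idea of scanning for tags and keeping a stack of currently open elements. The class name TinyHtmlTree, the Node type and the two regular expressions are made up for this illustration. It assumes double-quoted attribute values with no '>' inside them, and it does not handle comments, scripts, entities or the error recovery that real-world HTML needs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class, not part of any library.
class TinyHtmlTree {
    /** One element of the tree: tag name, attributes, own text, child elements. */
    static class Node {
        final String name;
        final Map<String, String> attrs = new LinkedHashMap<>();
        final List<Node> children = new ArrayList<>();
        final StringBuilder text = new StringBuilder();
        Node(String name) { this.name = name; }
    }

    // One tag: optional leading '/', tag name, raw attribute text, optional trailing '/'.
    private static final Pattern TAG =
        Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)([^>]*?)(/?)>");
    // One attribute of the form name="value" (assumption: double quotes only).
    private static final Pattern ATTR =
        Pattern.compile("([A-Za-z-]+)\\s*=\\s*\"([^\"]*)\"");

    static Node parse(String html) {
        Node root = new Node("#root");
        Deque<Node> open = new ArrayDeque<>(); // stack of currently open elements
        open.push(root);
        Matcher m = TAG.matcher(html);
        int last = 0;
        while (m.find()) {
            // Text between two tags belongs to the innermost open element.
            open.peek().text.append(html, last, m.start());
            last = m.end();
            String name = m.group(2).toLowerCase();
            if (!m.group(1).isEmpty()) {
                // Closing tag: pop back to the matching open element, if there is one.
                boolean isOpen = false;
                for (Node n : open) {
                    if (n.name.equals(name)) { isOpen = true; break; }
                }
                if (isOpen) {
                    while (!open.pop().name.equals(name)) { /* drop unclosed children */ }
                }
            } else {
                // Opening tag: create a node and attach it to the current parent.
                Node node = new Node(name);
                Matcher a = ATTR.matcher(m.group(3));
                while (a.find()) {
                    node.attrs.put(a.group(1).toLowerCase(), a.group(2));
                }
                open.peek().children.add(node);
                if (m.group(4).isEmpty()) {
                    open.push(node); // not self-closing, so it can contain children
                }
            }
        }
        open.peek().text.append(html.substring(last)); // trailing text, if any
        return root;
    }

    /** Writes the tree into 'out', one element per line, indented by depth. */
    static void dump(Node n, String indent, StringBuilder out) {
        out.append(indent).append(n.name);
        if (!n.attrs.isEmpty()) out.append(' ').append(n.attrs);
        String t = n.text.toString().trim();
        if (!t.isEmpty()) out.append(" \"").append(t).append('"');
        out.append('\n');
        for (Node c : n.children) dump(c, indent + "  ", out);
    }

    public static void main(String[] args) {
        String html = "<html><body><p>Hello <a href=\"doc.pdf\">file</a></p><br/></body></html>";
        StringBuilder out = new StringBuilder();
        dump(parse(html), "", out);
        System.out.print(out);
    }
}
```

Running main prints an indented outline with html, body, p, a and br nested by depth, where the a line also shows {href=doc.pdf} and its text. Note that each node stores its text in one buffer, so the order of text relative to child elements is lost; a fuller version would model text as child nodes too. Void elements written without a slash (a plain br, for instance) would also need a lookup table so they are not pushed on the stack.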
Akhilesh Trivedi
Ranch Hand

Joined: Jun 22, 2005
Posts: 1599
In addition to Ulf's comments, you may like to check this out as well.

Keep Smiling Always — My life is smoother when running silent. -paul