I want to create a class that can parse an HTML page and build a tree structure to display all the elements along with their attributes and data (such as links to files or text), if any.
I am aware that many people have already built this. I don't want complete code, just the logic behind how it's done and some code snippets.
I would be thankful for any help. I hope I have made clear what I mean. If anything needs elaboration, please ask.
Thanks. Lalit Nagalkar
As you said, there are a number of decent libraries available that do this (like jTidy, TagSoup, NekoXNI, ...). The easiest might be to study their approach; I'm sure you'd get a wide range of ideas from that.
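To give a feel for the general logic, here is a minimal sketch that uses the HTML parser shipped with the JDK (javax.swing.text.html.parser.ParserDelegator) rather than any of the libraries named above, so it is not how those libraries work internally. The idea is a stack of open elements: a start tag adds a child to the node on top of the stack and pushes it, an end tag pops, and text is attached to the current node. The names HtmlTreeSketch, Node, parse, find and print are made up for this illustration, and the sample markup in main is only an assumption about what your pages look like.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical illustration class, not part of any library.
class HtmlTreeSketch {

    // One element of the tree: tag name, attributes, direct text, children.
    static class Node {
        final String name;
        final Map<String, String> attributes = new LinkedHashMap<>();
        final List<Node> children = new ArrayList<>();
        String text = "";
        Node(String name) { this.name = name; }
    }

    // Copies the tag name and its attributes into a new Node.
    static Node newNode(HTML.Tag tag, MutableAttributeSet attrs) {
        Node n = new Node(tag.toString());
        Enumeration<?> names = attrs.getAttributeNames();
        while (names.hasMoreElements()) {
            Object key = names.nextElement();
            n.attributes.put(key.toString(), String.valueOf(attrs.getAttribute(key)));
        }
        return n;
    }

    // Builds the tree by keeping a stack of currently open elements.
    static Node parse(String html) throws IOException {
        final Node root = new Node("#document");
        final Deque<Node> open = new ArrayDeque<>();
        open.push(root);
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                Node n = newNode(t, a);
                open.peek().children.add(n);
                open.push(n);
            }
            @Override public void handleEndTag(HTML.Tag t, int pos) {
                if (open.size() > 1) open.pop(); // never pop the artificial root
            }
            @Override public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                open.peek().children.add(newNode(t, a)); // e.g. <br>, <img>
            }
            @Override public void handleText(char[] data, int pos) {
                open.peek().text += new String(data);
            }
        };
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return root;
    }

    // Depth-first search for the first node with the given tag name.
    static Node find(Node node, String name) {
        if (node.name.equals(name)) return node;
        for (Node child : node.children) {
            Node hit = find(child, name);
            if (hit != null) return hit;
        }
        return null;
    }

    // Prints the tree with two spaces of indentation per level.
    static void print(Node node, String indent) {
        System.out.println(indent + node.name + " " + node.attributes
                + (node.text.isEmpty() ? "" : " \"" + node.text + "\""));
        for (Node child : node.children) print(child, indent + "  ");
    }

    public static void main(String[] args) throws IOException {
        // Sample markup, assumed for illustration only.
        print(parse("<html><body><p>Hi <a href=\"x.pdf\">file</a></p></body></html>"), "");
    }
}
```

Note that the JDK parser inserts elements it considers implied, so the printed tree may contain nodes that were not in the source text. For real-world, badly formed pages, one of the dedicated libraries will likely cope better; the stack-based idea stays the same.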