Logic of HTML Parsing

Lalit Nagalkar
Ranch Hand

Joined: Aug 22, 2006
Posts: 47
Hi all,

I want to create a class that can parse an HTML page and build a tree structure showing all the elements along with their attributes and data (such as links to files, text, etc.), if any.

I am aware that many people have already built this.
I don't want complete code, just the logic of how it's done and some code snippets.

I would be thankful to all of you for any help.
I hope you have understood what I mean. If anything needs elaboration, please ask.

Thanks.
Lalit Nagalkar


SCJP 1.4
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 39548
As you said, there are a number of decent libraries available that do this (like jTidy, TagSoup, NekoXNI, ...). The easiest might be to study their approach; I'm sure you'd get a wide range of ideas from that.
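To give a rough idea of the logic those libraries share, here is a minimal sketch of my own; it is not taken from jTidy, TagSoup, or any other library. It scans for tags, keeps a stack of currently open elements, and attaches each new element or text run to whatever is on top of the stack. The class name HtmlTreeSketch, the Node type, and the short list of void elements are all assumptions made for illustration. It deliberately ignores much of what real HTML needs: unquoted attribute values, entities, script/style content, and comments containing '>'.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HtmlTreeSketch {

    /** One tree node: an element (name + attributes) or a text node. */
    static class Node {
        final String name;   // element name, or "#text" / "#document"
        final String text;   // text content for "#text" nodes, otherwise null
        final Map<String, String> attributes = new LinkedHashMap<>();
        final List<Node> children = new ArrayList<>();

        Node(String name, String text) { this.name = name; this.text = text; }
    }

    // Elements that never get a closing tag (a small, incomplete selection).
    private static final Set<String> VOID = Set.of("br", "hr", "img", "input", "meta", "link");

    // Matches name="value" or name='value'; unquoted values are not handled.
    private static final Pattern ATTR =
            Pattern.compile("([\\w-]+)\\s*=\\s*(\"([^\"]*)\"|'([^']*)')");

    static Node parse(String html) {
        Node root = new Node("#document", null);
        Deque<Node> open = new ArrayDeque<>();   // stack of currently open elements
        open.push(root);
        int pos = 0;
        while (pos < html.length()) {
            int lt = html.indexOf('<', pos);
            if (lt < 0) lt = html.length();
            if (lt > pos) {                      // text between two tags
                String text = html.substring(pos, lt).trim();
                if (!text.isEmpty()) open.peek().children.add(new Node("#text", text));
            }
            if (lt == html.length()) break;
            int gt = html.indexOf('>', lt);
            if (gt < 0) break;                   // unterminated tag: give up
            String tag = html.substring(lt + 1, gt).trim();
            pos = gt + 1;
            if (tag.startsWith("!") || tag.startsWith("?")) continue; // doctype, comment: skipped
            if (tag.startsWith("/")) closeElement(open, tag.substring(1).trim().toLowerCase());
            else openElement(open, tag);
        }
        return root;
    }

    private static void openElement(Deque<Node> open, String tag) {
        boolean selfClosing = tag.endsWith("/");
        if (selfClosing) tag = tag.substring(0, tag.length() - 1).trim();
        String[] parts = tag.split("\\s+", 2);   // element name, then the attribute string
        Node element = new Node(parts[0].toLowerCase(), null);
        if (parts.length > 1) {
            Matcher m = ATTR.matcher(parts[1]);
            while (m.find()) {
                String value = m.group(3) != null ? m.group(3) : m.group(4);
                element.attributes.put(m.group(1).toLowerCase(), value);
            }
        }
        open.peek().children.add(element);
        if (!selfClosing && !VOID.contains(element.name)) open.push(element);
    }

    private static void closeElement(Deque<Node> open, String name) {
        boolean found = false;
        for (Node n : open) if (n.name.equals(name)) { found = true; break; }
        if (!found) return;                      // stray closing tag: ignore it
        // Pop up to and including the match; unclosed children are closed implicitly.
        while (open.size() > 1 && !open.pop().name.equals(name)) { }
    }

    /** Renders the tree as indented text, one node per line. */
    static String render(Node node, String indent) {
        StringBuilder sb = new StringBuilder(indent);
        sb.append(node.text != null ? "\"" + node.text + "\"" : node.name + " " + node.attributes);
        sb.append('\n');
        for (Node child : node.children) sb.append(render(child, indent + "  "));
        return sb.toString();
    }

    public static void main(String[] args) {
        String html = "<html><body><p>See <a href=\"files/report.pdf\">the report</a><br></p></body></html>";
        System.out.print(render(parse(html), ""));
    }
}
```

The stack is the key design choice: it is what turns a flat stream of tags into a tree, and it is also where a real parser does its error recovery for badly nested markup. Once you have the Node tree, displaying it (for example in a JTree) is just a recursive walk like render().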


Akhilesh Trivedi
Ranch Hand

Joined: Jun 22, 2005
Posts: 1511
In addition to Ulf's comments, you may like to check out this as well.

