Logic of HTML Parsing
Joined: Aug 22, 2006
Nov 08, 2008 02:22:00
I want to write a class that parses an HTML page and builds a tree structure showing all the elements along with their attributes and data (such as links to files or text), if any.
I am aware that many people have already built this.
I don't want complete code, just the logic behind how it's done and some code snippets.
I will be thankful to all of you for the help.
I hope you have understood what I mean. If anything needs elaboration, please ask.
Joined: Mar 22, 2005
Nov 08, 2008 02:45:00
As you said, there are a number of decent libraries available that do this (like jTidy, TagSoup, NekoXNI, ...). The easiest might be to study their approach; I'm sure you'd get a wide range of ideas from that.
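To give a rough idea of the general logic before digging into those libraries, here is a minimal sketch. It is not how jTidy, TagSoup or NekoXNI work internally; it only illustrates the common idea of scanning for tags and keeping a stack of currently open elements. The class and method names (MiniHtmlTree, Node, parse, dump) are made up for this example, and it assumes reasonably well-formed input: double-quoted attributes only, no comments, no script blocks, and no recovery for unclosed tags such as a bare <br>.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of any library mentioned in this thread.
class MiniHtmlTree {

    /** One tree node: an element (name + attributes) or a text node named "#text". */
    static class Node {
        final String name;
        final String text; // non-null only for text nodes
        final Map<String, String> attributes = new LinkedHashMap<>();
        final List<Node> children = new ArrayList<>();

        Node(String name, String text) {
            this.name = name;
            this.text = text;
        }
    }

    // Opening, closing, or self-closing tag, e.g. <a href="x">, </a>, <br/>
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)([^>]*?)(/?)>");
    // Double-quoted attributes only, e.g. href="x" (a simplifying assumption)
    private static final Pattern ATTR =
            Pattern.compile("([A-Za-z][A-Za-z0-9-]*)\\s*=\\s*\"([^\"]*)\"");

    static Node parse(String html) {
        Node root = new Node("#root", null);
        Deque<Node> open = new ArrayDeque<>(); // stack of currently open elements
        open.push(root);
        Matcher m = TAG.matcher(html);
        int pos = 0;
        while (m.find()) {
            // Text between the previous tag and this one belongs to the open element.
            addText(open.peek(), html.substring(pos, m.start()));
            pos = m.end();
            String name = m.group(2).toLowerCase();
            if (!m.group(1).isEmpty()) {
                // Closing tag: pop only when it matches the innermost open element.
                if (open.size() > 1 && open.peek().name.equals(name)) {
                    open.pop();
                }
                continue;
            }
            Node element = new Node(name, null);
            Matcher a = ATTR.matcher(m.group(3));
            while (a.find()) {
                element.attributes.put(a.group(1).toLowerCase(), a.group(2));
            }
            open.peek().children.add(element);
            if (m.group(4).isEmpty()) {
                open.push(element); // not self-closing, so later nodes nest inside it
            }
        }
        addText(open.peek(), html.substring(pos));
        return root;
    }

    private static void addText(Node parent, String raw) {
        String text = raw.trim();
        if (!text.isEmpty()) {
            parent.children.add(new Node("#text", text));
        }
    }

    /** Renders the tree as indented text, one node per line. */
    static String dump(Node node, String indent) {
        StringBuilder sb = new StringBuilder(indent);
        sb.append(node.text != null ? "\"" + node.text + "\"" : node.name + " " + node.attributes);
        sb.append('\n');
        for (Node child : node.children) {
            sb.append(dump(child, indent + "  "));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node root = parse("<html><body><a href=\"file.zip\">Download</a><br/>done</body></html>");
        System.out.print(dump(root, ""));
    }
}
```

Real-world HTML is much messier than this sketch allows for (unquoted attributes, missing end tags, comments, scripts), which is the main reason the libraries above exist. The stack-of-open-elements idea is still a reasonable starting point for reading their source.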
Joined: Jun 22, 2005
Nov 11, 2008 03:28:00
In addition to Ulf's comments, you may like to check out
Keep Smiling Always — My life is smoother when running silent. -paul
The Linux Documentation Project
All times are in JavaRanch time: GMT-6 in summer, GMT-7 in winter