
parsing Html

kishore goswami
Greenhorn

Joined: Aug 13, 2003
Posts: 18
Which method should I use for parsing an HTML file?

Converting HTML to XML and then using SAX/DOM parsers,
or
is there another method or API available for fast and efficient parsing of HTML files?
Thanks in advance.
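
For illustration, the second half of the "convert to XML, then use a DOM parser" route might look roughly like the sketch below. It assumes the HTML has already been converted to well-formed XHTML by some other tool (that step is not shown), and the class name `XhtmlLinkExtractor` and its method are made up for this example. It uses only the standard JAXP DOM API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: the name and shape are assumptions for this sketch.
public class XhtmlLinkExtractor {

    // Parses well-formed XHTML and returns the href value of every <a> element
    // that has one. Real-world HTML that is not well-formed will make the
    // parser throw, which is why a conversion step is needed first.
    public static List<String> extractLinks(String xhtml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xhtml)));

        List<String> links = new ArrayList<>();
        NodeList anchors = doc.getElementsByTagName("a");
        for (int i = 0; i < anchors.getLength(); i++) {
            Element anchor = (Element) anchors.item(i);
            if (anchor.hasAttribute("href")) {
                links.add(anchor.getAttribute("href"));
            }
        }
        return links;
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<html><body><p>See <a href=\"http://example.com/a\">A</a> and "
                     + "<a href=\"http://example.com/b\">B</a>.</p></body></html>";
        System.out.println(extractLinks(xhtml));
    }
}
```

The weak point of this route is the conversion step: the DOM parser itself rejects anything that is not well-formed, so unclosed tags or unquoted attributes have to be repaired before this code ever sees the page.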
john guthrie
Ranch Hand

Joined: Aug 05, 2002
Posts: 124
Cactus uses NekoHTML.
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
I use the Quiotix Parser with good effect. It uses the Visitor Pattern to walk the tree of nodes. I've done some little projects that modify the tree while visiting.
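
To show the general idea of walking and modifying a node tree with a visitor, here is a minimal sketch. It is not the Quiotix API: `Node`, `Tag`, `Text`, `Visitor`, `UpperCaser` and `Renderer` are hypothetical names invented for this example, and a real parser defines its own node classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical tree and visitor types, for illustration only.
public class VisitorSketch {

    interface Node {
        void accept(Visitor v);
    }

    interface Visitor {
        void visitTag(Tag tag);
        void visitText(Text text);
    }

    static class Text implements Node {
        String content; // mutable so a visitor can change it in place
        Text(String content) { this.content = content; }
        public void accept(Visitor v) { v.visitText(this); }
    }

    static class Tag implements Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Tag(String name, Node... kids) {
            this.name = name;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
        public void accept(Visitor v) { v.visitTag(this); }
    }

    // Modifies the tree while visiting: upper-cases every text node.
    static class UpperCaser implements Visitor {
        public void visitTag(Tag tag) {
            for (Node child : tag.children) {
                child.accept(this);
            }
        }
        public void visitText(Text text) {
            text.content = text.content.toUpperCase(Locale.ROOT);
        }
    }

    // Read-only visitor: writes the tree back out as markup.
    static class Renderer implements Visitor {
        final StringBuilder out = new StringBuilder();
        public void visitTag(Tag tag) {
            out.append('<').append(tag.name).append('>');
            for (Node child : tag.children) {
                child.accept(this);
            }
            out.append("</").append(tag.name).append('>');
        }
        public void visitText(Text text) {
            out.append(text.content);
        }
    }

    static String render(Node root) {
        Renderer renderer = new Renderer();
        root.accept(renderer);
        return renderer.out.toString();
    }

    public static void main(String[] args) {
        Node tree = new Tag("p", new Text("hello "), new Tag("b", new Text("world")));
        tree.accept(new UpperCaser());
        System.out.println(render(tree)); // prints <p>HELLO <b>WORLD</b></p>
    }
}
```

The appeal of this design is that each new operation on the tree is just another `Visitor` implementation, so the node classes themselves stay untouched.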
[ February 10, 2004: Message edited by: Stan James ]

A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Freddy Villalba
Greenhorn

Joined: Feb 11, 2004
Posts: 6
I've been using Pajes for 2 years now and I have to say it's quite decent.
Take a look at www.pajes.org.
HTH,
Freddy.
Adrian Yan
Ranch Hand

Joined: Oct 02, 2000
Posts: 688
Best one out there, more complete and stable: HTML Parser.
 