
parsing Html

kishore goswami

Joined: Aug 13, 2003
Posts: 18
Which method should I use for parsing an HTML file?

One option is converting the HTML to XML and then using SAX/DOM parsers.
Are there any other methods or APIs available for fast and efficient parsing of HTML files?
Thanks in advance.
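For reference, the "convert to XML, then use a DOM parser" route mentioned above might look roughly like the sketch below. It assumes the input is already well-formed XHTML (real-world HTML usually is not, which is why a conversion or cleanup step comes first) and that it contains no DOCTYPE or HTML-only entities such as &nbsp;. The class name XhtmlLinkExtractor and the method extractLinks are made up for illustration; only the standard JDK XML classes are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: collects the href of every <a> element in XHTML text.
class XhtmlLinkExtractor {

    static List<String> extractLinks(String xhtml) {
        try {
            // Standard JDK DOM parser; it only accepts well-formed XML.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));

            // Walk every <a> element and keep the ones that carry an href.
            NodeList anchors = doc.getElementsByTagName("a");
            List<String> links = new ArrayList<>();
            for (int i = 0; i < anchors.getLength(); i++) {
                Element anchor = (Element) anchors.item(i);
                if (anchor.hasAttribute("href")) {
                    links.add(anchor.getAttribute("href"));
                }
            }
            return links;
        } catch (Exception e) {
            // Malformed markup ends up here; plain HTML often will.
            throw new IllegalArgumentException("Could not parse input as XML", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html><body><p>See <a href=\"a.html\">A</a> and "
                + "<a href=\"b.html\">B</a>.</p></body></html>";
        System.out.println(extractLinks(xhtml));
    }
}
```

The strictness of the XML parser is the main drawback of this approach: an unclosed tag or an unquoted attribute makes the whole parse fail, so a tolerant HTML parser is often the more practical choice.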
john guthrie
Ranch Hand

Joined: Aug 05, 2002
Posts: 124
Cactus uses NekoHTML.
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
I use the Quiotix Parser with good effect. It uses the Visitor Pattern to walk the tree of nodes. I've done some little projects that modify the tree while visiting.
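To give a feel for the Visitor Pattern idea, here is a minimal sketch of walking a node tree and modifying it during the visit. Every type in it (HtmlNode, TagNode, TextNode, HtmlVisitor and the two visitors) is a hypothetical stand-in written for this example; none of it is the Quiotix parser's actual API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node model for illustration; not the Quiotix classes.
interface HtmlVisitor {
    void visitTag(TagNode tag);
    void visitText(TextNode text);
}

abstract class HtmlNode {
    abstract void accept(HtmlVisitor visitor);
}

class TextNode extends HtmlNode {
    String text;
    TextNode(String text) { this.text = text; }
    @Override void accept(HtmlVisitor visitor) { visitor.visitText(this); }
}

class TagNode extends HtmlNode {
    String name;
    final List<HtmlNode> children = new ArrayList<>();
    TagNode(String name, HtmlNode... kids) {
        this.name = name;
        for (HtmlNode kid : kids) children.add(kid);
    }
    @Override void accept(HtmlVisitor visitor) { visitor.visitTag(this); }
}

// Modifies the tree while visiting: every <b> is renamed to <strong>.
class BoldToStrongVisitor implements HtmlVisitor {
    @Override public void visitTag(TagNode tag) {
        if (tag.name.equals("b")) tag.name = "strong";
        for (HtmlNode child : tag.children) child.accept(this);
    }
    @Override public void visitText(TextNode text) { /* text is left alone */ }
}

// Read-only visitor: serializes the tree back to markup.
class PrintVisitor implements HtmlVisitor {
    final StringBuilder out = new StringBuilder();
    @Override public void visitTag(TagNode tag) {
        out.append('<').append(tag.name).append('>');
        for (HtmlNode child : tag.children) child.accept(this);
        out.append("</").append(tag.name).append('>');
    }
    @Override public void visitText(TextNode text) { out.append(text.text); }
}

class VisitorDemo {
    // Applies the rewriting visitor, then prints the resulting tree.
    static String rewrite(TagNode root) {
        root.accept(new BoldToStrongVisitor());
        PrintVisitor printer = new PrintVisitor();
        root.accept(printer);
        return printer.out.toString();
    }

    public static void main(String[] args) {
        TagNode root = new TagNode("p",
                new TextNode("Hello "),
                new TagNode("b", new TextNode("world")));
        System.out.println(rewrite(root)); // prints <p>Hello <strong>world</strong></p>
    }
}
```

The appeal of this design is that the node classes stay untouched: each new operation (printing, rewriting, collecting links) is just another visitor.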
[ February 10, 2004: Message edited by: Stan James ]

A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Freddy Villalba

Joined: Feb 11, 2004
Posts: 6
I've been using Pajes for 2 years now and I have to say it's quite decent.
Take a look at
Adrian Yan
Ranch Hand

Joined: Oct 02, 2000
Posts: 688
It's the best one out there, more complete and stable: HTML Parser
I agree. Here's the link: