Which method should I use for parsing an HTML file?
Should I convert the HTML to XML and then use SAX/DOM parsers, or is there another method or API available for fast and efficient parsing of HTML files? Thanks in advance.
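For reference, the "convert to XML, then use a DOM parser" route from the question might look roughly like the sketch below. It assumes the HTML has already been converted to well-formed XHTML by some other tool (that step is not shown), and the class and method names are made up for illustration. Only the standard JDK XML APIs are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the name is an assumption for this sketch.
class XhtmlDomSketch {

    // Returns the href value of every <a> element, in document order.
    // The input must already be well-formed XML (e.g. XHTML); raw
    // "tag soup" HTML will make the XML parser fail.
    static List<String> linkTargets(String xhtml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
            NodeList anchors = doc.getElementsByTagName("a");
            List<String> hrefs = new ArrayList<>();
            for (int i = 0; i < anchors.getLength(); i++) {
                Element a = (Element) anchors.item(i);
                if (a.hasAttribute("href")) {
                    hrefs.add(a.getAttribute("href"));
                }
            }
            return hrefs;
        } catch (Exception e) {
            // Wrap parser configuration, I/O and syntax errors in one unchecked type.
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html><body><p><a href=\"a.html\">A</a> and "
                     + "<a href=\"b.html\">B</a></p></body></html>";
        System.out.println(linkTargets(xhtml)); // prints [a.html, b.html]
    }
}
```

The weak point of this route is the conversion step: real-world HTML is rarely well-formed, which is why the dedicated HTML parsers suggested in the replies are often a better fit.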
I use the Quiotix Parser with good effect. It uses the Visitor Pattern to walk the tree of nodes. I've done some little projects that modify the tree while visiting. [ February 10, 2004: Message edited by: Stan James ]
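To give an idea of what "walking the tree with a Visitor" means, here is a minimal sketch in plain Java. The node and visitor types below are hypothetical and written just for this illustration; they are not the Quiotix classes, whose actual API may look quite different. One visitor modifies the tree during the walk (renaming b tags to strong), and a second one lists the tag names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical visitor interface: one method per node type.
interface HtmlVisitor {
    void visitTag(TagNode tag);
    void visitText(TextNode text);
}

// Hypothetical node hierarchy, NOT the Quiotix classes.
abstract class Node {
    abstract void accept(HtmlVisitor visitor);
}

class TextNode extends Node {
    final String text;
    TextNode(String text) { this.text = text; }
    @Override void accept(HtmlVisitor visitor) { visitor.visitText(this); }
}

class TagNode extends Node {
    String name; // mutable so that a visitor can rename the tag
    final List<Node> children = new ArrayList<>();
    TagNode(String name, Node... kids) {
        this.name = name;
        for (Node kid : kids) children.add(kid);
    }
    @Override void accept(HtmlVisitor visitor) {
        visitor.visitTag(this);                             // visit this tag first,
        for (Node child : children) child.accept(visitor);  // then its children in order
    }
}

// Modifies the tree while visiting: every <b> becomes <strong>.
class BoldToStrong implements HtmlVisitor {
    @Override public void visitTag(TagNode tag) {
        if ("b".equals(tag.name)) tag.name = "strong";
    }
    @Override public void visitText(TextNode text) { /* nothing to do */ }
}

// Read-only visitor: records tag names in document order.
class TagNames implements HtmlVisitor {
    final List<String> names = new ArrayList<>();
    @Override public void visitTag(TagNode tag) { names.add(tag.name); }
    @Override public void visitText(TextNode text) { /* nothing to do */ }
}

class VisitorSketch {
    static List<String> renameAndList(Node root) {
        root.accept(new BoldToStrong());
        TagNames lister = new TagNames();
        root.accept(lister);
        return lister.names;
    }

    public static void main(String[] args) {
        Node doc = new TagNode("p",
            new TextNode("Hello, "),
            new TagNode("b", new TextNode("world")));
        System.out.println(renameAndList(doc)); // prints [p, strong]
    }
}
```

The point of the pattern is that new operations (renaming, listing, printing) are added as new visitor classes without touching the node classes. Note that this sketch only changes a field on the visited node; adding or removing children during the walk would need more care.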
A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Freddy Villalba
Greenhorn
Joined: Feb 11, 2004
Posts: 6
I've been using Pajes for 2 years now and I have to say it's quite decent. Take a look at www.pajes.org. HTH, Freddy.
Adrian Yan
Ranch Hand
Joined: Oct 02, 2000
Posts: 688
The best one out there, more complete and stable: Html parser