
Parsing HTML

 
kishore goswami
Greenhorn
Posts: 18
Which method should I use to parse an HTML file?

Converting the HTML to XML and then using SAX/DOM parsers,
or
is there another method or API available for fast and efficient parsing of HTML files?
Thanks in advance.
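
For the first option, once the HTML has been converted to well-formed XML, the DOM parser that ships with the JDK can read it directly. The following is only a minimal sketch under that assumption: the conversion step itself is not shown, and the class name `XhtmlLinks` and the method `hrefs` are made up for illustration. It collects the `href` values of all `a` elements.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: assumes the input is ALREADY well-formed XML (XHTML).
public class XhtmlLinks {

    // Returns the href attribute of every <a> element, in document order.
    public static List<String> hrefs(String xhtml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));

            NodeList anchors = doc.getElementsByTagName("a");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < anchors.getLength(); i++) {
                Element a = (Element) anchors.item(i);
                if (a.hasAttribute("href")) {
                    result.add(a.getAttribute("href"));
                }
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Ordinary "tag soup" HTML ends up here, which is why the
            // conversion step is needed before a strict XML parser is used.
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html><body><p>See <a href=\"http://example.com/\">this</a>"
                + " and <a href=\"/local\">that</a>.</p></body></html>";
        System.out.println(hrefs(xhtml)); // prints [http://example.com/, /local]
    }
}
```

A strict XML parser rejects unclosed tags such as a bare `<br>`, so this route depends entirely on how good the HTML-to-XML conversion is.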
 
john guthrie
Ranch Hand
Posts: 124
Cactus uses NekoHTML.
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
I use the Quiotix Parser to good effect. It uses the Visitor pattern to walk the tree of nodes. I've done a few small projects that modify the tree while visiting.
[ February 10, 2004: Message edited by: Stan James ]
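
To illustrate the idea of walking a node tree with the Visitor pattern and modifying it along the way, here is a small self-contained sketch. It does not use the Quiotix API: every type in it (`Node`, `Tag`, `Text`, `Visitor`, `BoldToStrong`, `Printer`) is hypothetical and exists only to show the shape of the pattern. One visitor renames `b` tags to `strong`, and a second one prints the tree back out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical node model, NOT the Quiotix parser's classes.
public class VisitorSketch {

    public interface Visitor {
        void visitTag(Tag tag);
        void visitText(Text text);
    }

    public interface Node {
        void accept(Visitor v);
    }

    public static final class Text implements Node {
        public final String content;
        public Text(String content) { this.content = content; }
        @Override public void accept(Visitor v) { v.visitText(this); }
    }

    public static final class Tag implements Node {
        public String name; // mutable so a visitor can rename the tag
        public final List<Node> children = new ArrayList<>();
        public Tag(String name, Node... kids) {
            this.name = name;
            children.addAll(Arrays.asList(kids));
        }
        @Override public void accept(Visitor v) { v.visitTag(this); }
    }

    // Modifies the tree while visiting: every <b> becomes <strong>.
    public static final class BoldToStrong implements Visitor {
        @Override public void visitTag(Tag tag) {
            if (tag.name.equals("b")) {
                tag.name = "strong";
            }
            for (Node child : tag.children) {
                child.accept(this);
            }
        }
        @Override public void visitText(Text text) { /* nothing to change */ }
    }

    // Read-only visitor that serializes the tree back to markup.
    public static final class Printer implements Visitor {
        public final StringBuilder out = new StringBuilder();
        @Override public void visitTag(Tag tag) {
            out.append('<').append(tag.name).append('>');
            for (Node child : tag.children) {
                child.accept(this);
            }
            out.append("</").append(tag.name).append('>');
        }
        @Override public void visitText(Text text) { out.append(text.content); }
    }

    public static String render(Node root) {
        Printer printer = new Printer();
        root.accept(printer);
        return printer.out.toString();
    }

    public static void main(String[] args) {
        Node tree = new Tag("p", new Text("Hello "), new Tag("b", new Text("world")));
        tree.accept(new BoldToStrong());
        System.out.println(render(tree)); // prints <p>Hello <strong>world</strong></p>
    }
}
```

The appeal of this design is that each operation on the tree lives in its own visitor class, so new operations can be added without touching the node classes.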
 
Freddy Villalba
Greenhorn
Posts: 6
I've been using Pajes for 2 years now and I have to say it's quite decent.
Take a look at www.pajes.org.
HTH,
Freddy.
 
Adrian Yan
Ranch Hand
Posts: 688
The best one out there, more complete and stable: HTML Parser.
 