
How to get the string offsets of a tag found from Document Object?

 
steve labar
Ranch Hand
Posts: 55
I'm currently using the Mozilla Html Parser to parse a string representation of an HTML response into a Document object. I then traverse this DOM looking for a tag. Once I find a specific target tag, say one with a specified attribute, I'd like to get the string offsets of where it was found. I need this because I show the HTML in a text pane and use the offsets to highlight the target area.

Before, I was using a regex to find the tag and just passed the matcher's offsets to my highlighter class.
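For reference, a minimal sketch of what that regex approach might look like. The class name `TagOffsets`, the method `find`, and the pattern itself are made up for illustration and are not the original code; the pattern only handles simple opening tags and will be fooled by a `>` inside an attribute value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds opening tags by regex and reports their offsets.
final class TagOffsets {

    /** Returns {start, end} offsets of each opening tag named tag that has the attribute attr. */
    static List<int[]> find(String html, String tag, String attr) {
        // <tag ... attr = ... >  (naive: assumes no '>' inside attribute values)
        Pattern p = Pattern.compile(
                "<" + Pattern.quote(tag) + "\\b[^>]*\\b" + Pattern.quote(attr) + "\\s*=[^>]*>",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(html);
        List<int[]> hits = new ArrayList<>();
        while (m.find()) {
            // start() is inclusive, end() is exclusive, both index into html
            hits.add(new int[] { m.start(), m.end() });
        }
        return hits;
    }
}
```

Each `{start, end}` pair indexes directly into the string that was searched, so it can be handed to a highlighter as long as the text pane holds exactly the same string.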

However, I think using an HTML parser is the better way; I'm just having difficulty locating the match in the corresponding HTML string.

Any ideas?

Pseudocode example:

>
 
Paul Clapham
Sheriff
Posts: 21107
If you can't figure this stuff out for yourself, then sorry, you have a big headache.

I haven't ever heard of this parser. Normally I don't mind reading the API documentation of open-source packages on behalf of people on forums, but those guys don't have a link to the API on their site. (Downloading the whole thing and extracting the API is beyond my curiosity level.) And they don't have a forum or a mailing list as far as I can see. One page of their documentation says their parser is "compatible with SAX parsers", whatever that means. Normally with a SAX parser you would attach an org.xml.sax.Locator object to your content handler and use that, but I have no idea how that relates to your code.
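A rough sketch of the Locator idea, with some loud assumptions: it uses the JDK's built-in SAX parser on well-formed markup, not the Mozilla parser (whose API I haven't seen), and the names `LocatorOffsets`, `toOffset` and `findTagStarts` are invented for the example. A Locator reports a line and column rather than a string offset, and the position is only approximate (typically just past the end of the start tag), so the sketch converts it to an index and then searches backwards for the start of the tag.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: maps SAX Locator positions back to string offsets.
final class LocatorOffsets {

    /** Converts a 1-based line/column pair into a 0-based index into text ('\n' or "\r\n" line ends). */
    static int toOffset(String text, int line, int column) {
        int lineStart = 0;
        for (int i = 1; i < line; i++) {
            int newline = text.indexOf('\n', lineStart);
            if (newline < 0) break;          // fewer lines than reported; stay on the last one
            lineStart = newline + 1;
        }
        return Math.min(lineStart + column - 1, text.length());
    }

    /** Returns the offset of the '<' of every element named tag that carries the attribute attr. */
    static List<Integer> findTagStarts(String markup, String tag, String attr) throws Exception {
        List<Integer> starts = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private Locator locator;

            @Override
            public void setDocumentLocator(Locator locator) {
                this.locator = locator;      // supplied by the parser before parsing starts
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (locator == null || !qName.equals(tag) || atts.getValue(attr) == null) return;
                // The reported position is at or after the start tag, so scan backwards.
                // Assumption: no longer tag name sharing this prefix sits in between.
                int reported = toOffset(markup, locator.getLineNumber(), locator.getColumnNumber());
                int start = markup.lastIndexOf("<" + tag, reported);
                if (start >= 0) starts.add(start);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(markup)), handler);
        return starts;
    }
}
```

The offsets are only meaningful if the string in the text pane is the same one that was parsed. Whether a given HTML parser supplies a Locator at all, and how accurate its columns are, would have to be checked against that parser.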

Edit: after some more googling I found that they do have a forum.

You could also try TagSoup, which has a SAX interface.
 