
How to get the string offsets of a tag found from Document Object?

steve labar
Ranch Hand

Joined: Sep 10, 2008
Posts: 55
I'm currently using the Mozilla Html Parser to parse the string representation of an HTML response into a Document object. I then traverse that DOM looking for a tag. Once I find a specific target tag, say one with a specified attribute, I'd like to get the string offsets of where it was found. I need this because I have the HTML in a text pane and want to highlight the matched area using those offsets.

Before, I was using a regex to find the tag and just passed the matcher offsets to my highlighter class.

However, I think using an HTML parser is the better way; I'm just having difficulty finding the match in the corresponding HTML string.

Any ideas?
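
For reference, the regex approach described above can be sketched roughly as follows. This is only an illustration, not the poster's actual code: the class name `RegexTagOffsets`, the method `findTag`, and the sample HTML are made up, and the pattern is deliberately naive (it would be fooled by a `>` inside an attribute value, for example).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: locates an opening tag by regex and reports its offsets.
class RegexTagOffsets {

    // Returns {start, end} of the first opening tag with the given name,
    // or null if there is no match. 'end' is exclusive, as with Matcher.end().
    static int[] findTag(String html, String tagName) {
        Pattern p = Pattern.compile(
                "<" + Pattern.quote(tagName) + "\\b[^>]*>",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(html);
        return m.find() ? new int[] { m.start(), m.end() } : null;
    }

    public static void main(String[] args) {
        String html = "<html><body><div id=\"target\">hi</div></body></html>";
        int[] span = findTag(html, "div");
        // The pair can be handed straight to whatever does the highlighting.
        System.out.println(html.substring(span[0], span[1]));
    }
}
```

The offsets come for free here because the regex runs over the same string that is shown in the text pane, which is exactly what gets lost once the string has been turned into a DOM.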

Pseudocode example:

Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18563
    

If you can't figure this stuff out for yourself, then sorry, you have a big headache.

I haven't ever heard of this parser. Normally I don't mind reading the API documentation of open-source packages on behalf of people on forums, but those guys don't have a link to the API on their site. (Downloading the whole thing and extracting the API is beyond my curiosity level.) And they don't have a forum or a mailing list as far as I can see. One page of their documentation says their parser is "compatible with SAX parsers", whatever that means. Normally with a SAX parser you would attach an org.xml.sax.Locator object to your content handler and use that, but I have no idea how that relates to your code.

Edit: after some more googling I found that they do have a forum.

But you could try TagSoup; it has a SAX interface.
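
To make the Locator suggestion concrete, here is a minimal sketch using the SAX parser that ships with the JDK. It is not code for the Mozilla parser or for TagSoup, and whether either of those supplies a Locator is something you would have to check. The assumptions: the input is well-formed XML (XHTML), lines end in `\n`, and the parser reports the 1-based line and column of the first character after the start tag, which is what the SAX documentation describes. The names `LocatorOffsets`, `find` and `toOffset` are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: maps SAX Locator positions back to string offsets.
class LocatorOffsets {

    // Converts a 1-based line/column pair into a 0-based index into text.
    // Assumes each line ends with '\n' and that columns count chars.
    static int toOffset(String text, int line, int column) {
        int offset = 0;
        for (int i = 1; i < line; i++) {
            offset = text.indexOf('\n', offset) + 1;
        }
        return offset + column - 1;
    }

    // Returns {start, end} of the first start tag carrying attr="value",
    // or null if no element matches. 'end' is exclusive.
    static int[] find(final String xml, final String attr, final String value)
            throws Exception {
        final int[][] result = new int[1][];
        DefaultHandler handler = new DefaultHandler() {
            private Locator locator;

            @Override
            public void setDocumentLocator(Locator l) {
                locator = l; // the parser hands this over before parsing starts
            }

            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes atts) {
                if (result[0] == null && value.equals(atts.getValue(attr))) {
                    // The Locator points just past the '>' of the start tag,
                    // so search backwards for the '<' that opened it.
                    int end = toOffset(xml, locator.getLineNumber(),
                                       locator.getColumnNumber());
                    int start = xml.lastIndexOf('<', end - 1);
                    result[0] = new int[] { start, end };
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return result[0];
    }
}
```

Calling `LocatorOffsets.find(html, "id", "target")` would give a pair that can be passed to the highlighter in the same way as the regex matcher offsets. The backwards search for `<` is a shortcut that relies on well-formed input, where a literal `<` cannot appear inside an attribute value.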
 