I'm stuck on this problem... I'm writing a program that behaves like a parser. It reads the HTML, ignores everything inside the tags, and extracts the numbers that are outside the tags (the numbers that are visible in a browser). The problem is that the program reads the input character by character. When it comes to a '<', it assumes this opens a tag and ignores everything until a '>' comes up. So, for example, in the following sentence: three < five<br /> the program treats the first '<' as the start of a tag and keeps looking for a '>', so it swallows visible text up to the next '>' (or never terminates if there is none). What's the best solution to that? Is there any way to identify an HTML tag? Thanks in advance. Flora
The traditional way around it is to use "& lt;" and "& gt;" (without the spaces) when you want to display a less-than or greater-than sign and have it known that it is not HTML. This of course only works if YOU get to control the input into the HTML page.
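If you do not control the input, one possible heuristic is to treat '<' as the start of a tag only when the next character is a letter, '/' or '!', and to fall back to plain text when no closing '>' exists. Here is a minimal sketch of that idea. The class name VisibleNumbers and the methods looksLikeTagStart and extract are made up for illustration, and the rule is an assumption rather than the full HTML specification (for instance, a '>' inside an attribute value would end the tag too early).

```java
import java.util.ArrayList;
import java.util.List;

class VisibleNumbers {

    // Heuristic (an assumption, not the full HTML rules): a '<' opens a tag
    // only when the next character is a letter, '/' or '!'.
    static boolean looksLikeTagStart(String html, int i) {
        if (i + 1 >= html.length()) {
            return false;
        }
        char next = html.charAt(i + 1);
        return Character.isLetter(next) || next == '/' || next == '!';
    }

    // Collects every run of digits that appears outside of tags.
    static List<String> extract(String html) {
        List<String> numbers = new ArrayList<>();
        StringBuilder digits = new StringBuilder();
        int i = 0;
        while (i < html.length()) {
            char c = html.charAt(i);
            if (c == '<' && looksLikeTagStart(html, i)) {
                int close = html.indexOf('>', i + 1);
                if (close >= 0) {
                    // A real-looking tag: end any pending number and skip past '>'.
                    flush(digits, numbers);
                    i = close + 1;
                    continue;
                }
                // No closing '>' anywhere: fall through and treat '<' as text,
                // so the loop always terminates.
            }
            if (Character.isDigit(c)) {
                digits.append(c);
            } else {
                flush(digits, numbers);
            }
            i++;
        }
        flush(digits, numbers);
        return numbers;
    }

    // Moves the pending digit run (if any) into the result list.
    private static void flush(StringBuilder digits, List<String> out) {
        if (digits.length() > 0) {
            out.add(digits.toString());
            digits.setLength(0);
        }
    }

    public static void main(String[] args) {
        // Prints [3, 5, 42]; the 9 inside the attribute is skipped.
        System.out.println(extract("3 < 5<br /> and <b class=\"x9\">42</b>"));
    }
}
```

With this rule the '<' in "three < five<br />" is followed by a space, so it is read as text, while "<br />" is still skipped as a tag.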
[This message has been edited by Cindy Glass (edited August 16, 2001).]
"JavaRanch, where the deer and the Certified play" - David O'Meara
I’ve looked at a lot of different solutions, and in my humble opinion Aspose is the way to go. Here’s the link: http://aspose.com