posted 15 years ago
Welcome to JavaRanch.
Doing this without changing the format is not possible, because plain text does not contain formatting instructions. I'd look into a library that can convert HTML to XML (like TagSoup or NekoXNI), and then use the SAX API to extract all text from the XML.
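As a rough sketch of the second step, here is how the SAX API that ships with the JDK could collect all text from a document. It assumes the HTML has already been turned into well-formed XML by one of the libraries above; the class name TextExtractor and the method extractText are made up for illustration only.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class TextExtractor {

    // Hypothetical helper: returns the concatenated character data of a
    // well-formed XML string. All tags and attributes are ignored.
    static String extractText(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();

        // The parser calls characters() for each run of text between tags;
        // a single text node may arrive in several chunks, so we append.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Assumed input: markup that is already well-formed XML.
        String xml = "<html><body><p>Hello <b>world</b></p></body></html>";
        System.out.println(extractText(xml)); // prints "Hello world"
    }
}
```

Note that the result is plain text only: the bold markup around "world" is gone, which is why the formatting cannot survive this kind of extraction.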