
Converting '>' to "& l t ;"

Chris Behr
Greenhorn

Joined: Jun 12, 2001
Posts: 17
I am reading in an HTML file and converting it to XML. The problem is that special characters such as <, > and & make the XML invalid when you try to read it with Explorer. I started writing code to convert '<' to "& l t ;" (without spaces), but I keep running into new characters every time I add one to the list. Is there a Java class I can use to convert these symbols, or a list of the invalid XML characters and their appropriate alternatives?
Mapraputa Is
Leverager of our synergies
Sheriff

Joined: Aug 26, 2000
Posts: 10065
You can use Tidy for automatic HTML -> XML conversion. It converts all illegal symbols into escape sequences.
[This message has been edited by Mapraputa Is (edited July 10, 2001).]
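
For anyone who only needs to escape text by hand rather than run the whole document through Tidy, here is a minimal sketch. XML 1.0 predefines just five entities (for <, >, &, ' and "), so the list is short. The class name XmlEscaper and the method escape are made up for illustration and are not part of Tidy or the JDK. This is meant for text that goes into element content or attribute values; it does not deal with control characters that XML 1.0 forbids outright.

```java
// Hypothetical helper (not from Tidy or the JDK): replaces the five
// characters that have predefined entities in XML 1.0.
final class XmlEscaper {

    private XmlEscaper() {
        // static utility, no instances
    }

    // Returns a copy of text with <, >, &, " and ' replaced by entity references.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '&':
                    out.append("&amp;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                case '\'':
                    out.append("&apos;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }
}
```

Calling XmlEscaper.escape("a < b & c") gives back a &lt; b &amp; c. Each character is visited once, so an ampersand that is produced by an earlier replacement never gets escaped a second time. Text that already contains entity references will be escaped again, though, so apply it only to raw text.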