Help coderanch get a
new server
by contributing to the fundraiser
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Ron McLeod
  • Paul Clapham
  • Devaka Cooray
  • Liutauras Vilda
Sheriffs:
  • Jeanne Boyarsky
  • paul wheaton
  • Henry Wong
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Tim Moores
  • Carey Brown
  • Mikalai Zaikin
Bartenders:
  • Lou Hamers
  • Piet Souris
  • Frits Walraven

Converting '>' to "& l t ;"

 
Greenhorn
Posts: 17
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I am reading in an html file and converting it to xml. The problem is that when I come across special characters like < > & ... it makes the xml invalid when you try to read it with explorer. I started to write code to convert '<' to "& l t ;" (without spaces) but I keep running into new ones every time I add one to the list. Is there a java class that I can use to convert these symbols or find a list of invalid xml characters and there appropriate alternative?
 
Leverager of our synergies
Posts: 10065
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
You can use Tidy for automatic HTML -> XML conversion. It converts all illegal symbols into escape sequences.
[This message has been edited by Mapraputa Is (edited July 10, 2001).]
 
Consider Paul's rocket mass heater.
reply
    Bookmark Topic Watch Topic
  • New Topic