escaping foreign chars in generating xml, why?

 
Ranch Hand
Posts: 58
Hi,

I just started working on a system where foreign (non-ASCII) characters in the XML output are automatically escaped (using Spring HtmlUtils) to their equivalent HTML character references. I don't really see the point of doing this if you have a DB that stores its data as UTF-8 and a web server that also serves pages as UTF-8.

Are there any advantages or disadvantages to using HTML character references instead of outputting foreign characters just as they are?

Many thanks
E
 
Marshal
Posts: 28193
The advantage is that when you do that, the XML you produce is resistant to being botched up by mis-encoding. You may be carefully ensuring that everything you do is encoded in UTF-8 but that is certainly not a common attitude in the Web world.
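
To make the idea concrete, here is a minimal sketch of that kind of escaping. It is not Spring's HtmlUtils, just a hand-rolled stand-in, and the class and method names are made up for illustration. It assumes numeric character references are wanted rather than named HTML entities, since XML itself predefines only five named entities (`amp`, `lt`, `gt`, `quot`, `apos`). The output contains nothing but ASCII, so it reads the same under any ASCII-compatible encoding.

```java
// Hypothetical helper, not part of Spring or the JDK.
final class NumericReferenceEscaper {

    private NumericReferenceEscaper() {
    }

    // Returns text with XML markup characters escaped and every
    // non-ASCII code point replaced by a decimal character reference.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); ) {
            // Work on code points so surrogate pairs become one reference.
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                default:
                    if (cp < 0x80) {
                        out.appendCodePoint(cp);
                    } else {
                        out.append("&#").append(cp).append(';');
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00e5 is written as an escape so the source file's own
        // encoding cannot affect the demonstration.
        System.out.println(escape("Malm\u00f6 & G\u00e4vle"));
    }
}
```

The trade-off is size and readability: every escaped character costs several bytes and the raw XML is harder to read, which is the price paid for surviving a wrong charset guess somewhere downstream.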
 
Sven Anderson
Ranch Hand
Posts: 58
Hi Paul,

Does this mean that if you have an environment where the database and web server both reliably serve UTF-8, you don't really have to bother with escaping characters and can instead rely on the UTF-8 encoding and leave the characters as they are?

Thanks
E
 
Rancher
Posts: 43081
Yes, roundtripping of UTF-8 text from DB through web server to browser, back to web server and into the database is possible, and it's not even all that difficult. For starters, make sure that the DB encoding is set to Unicode, and that all pages you serve are declared as UTF-8 encoded.
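
As a rough illustration of what "declared as UTF-8" buys you, the sketch below writes a small XML document as UTF-8 bytes and then decodes those bytes twice: once as UTF-8 and once as ISO-8859-1, the way a misconfigured consumer might. The class and method names are hypothetical, and the text passed in is assumed to be already free of markup characters such as `<` and `&`.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class Utf8RoundTrip {

    private Utf8RoundTrip() {
    }

    // Builds a tiny XML document and encodes it explicitly as UTF-8,
    // matching the encoding named in the XML declaration.
    static byte[] toXmlBytes(String text) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<name>" + text + "</name>";
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    // What a consumer sees when it honours the declared encoding.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // What a consumer sees when it wrongly assumes ISO-8859-1.
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] bytes = toXmlBytes("\u00e5");
        System.out.println(decodeUtf8(bytes));
        System.out.println(decodeLatin1(bytes));
    }
}
```

With every hop using the same charset the character comes back intact; with one wrong hop the two UTF-8 bytes of the letter turn into two unrelated characters. Naming the charset explicitly at each step, rather than relying on platform defaults, is what keeps the round trip reliable.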
 