Setting encoding in web.xml

Naseem Khan
Hi,
I was curious why we set the encoding on the very first line of web.xml. I removed the encoding attribute from the XML declaration, and the web application still deployed correctly on Tomcat.



What will happen if we change this encoding to, say, UTF-8, or to some other encoding scheme such as UTF-16?
[ February 28, 2007: Message edited by: Naseem Khan ]

Paul Clapham

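You specify the encoding in the prolog of an XML document so that the XML parser can determine the encoding of the document. Most operating systems give a user no way to determine which encoding was used to create a file, so XML leaves it to the creator of the file to declare it. If the declaration is omitted, the parser assumes UTF-8 (or UTF-16 when the file starts with a byte order mark), which is presumably why your web.xml still deployed after you removed the attribute.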
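To see the parser picking up the declared encoding, here is a minimal sketch using the JDK's built-in DOM parser. The class name `DeclaredEncodingDemo`, the helper `parse`, and the sample XML are made up for illustration; this is not Tomcat's own loading code. It parses raw bytes, so the parser has only the prolog to go on, and then asks the resulting document which encoding was declared.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class DeclaredEncodingDemo {

    // Hypothetical helper: parses raw bytes, so the parser itself has to
    // work out the encoding from the document's prolog.
    static Document parse(byte[] xmlBytes) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up web.xml content; \u00e9 is a non-ASCII character (e-acute).
        String declared = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<web-app><display-name>Caf\u00e9</display-name></web-app>";

        // The bytes really are ISO-8859-1, matching the declaration.
        Document doc = parse(declared.getBytes(StandardCharsets.ISO_8859_1));
        System.out.println("declared encoding: " + doc.getXmlEncoding());
        System.out.println("text: " + doc.getDocumentElement().getTextContent());

        // No encoding declared: getXmlEncoding() reports null.
        Document plain = parse("<web-app/>".getBytes(StandardCharsets.UTF_8));
        System.out.println("declared encoding: " + plain.getXmlEncoding());
    }
}
```

`Document.getXmlEncoding()` reports what the XML declaration says, and returns null when nothing was declared.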

Now the name you put in the "encoding" attribute should be the actual encoding of the file. If you change it to something else then you would be lying to the XML parser and it would misinterpret the data in the document. This might cause it to throw an exception or it might cause it to garble the data in the document. And if it isn't already obvious, changing the "encoding" attribute doesn't change the actual encoding of the file.
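Both failure modes are easy to reproduce. The sketch below (again with the JDK's built-in parser; `MismatchedEncodingDemo` and `parseText` are hypothetical names, and the sample element is invented) writes the same text with a truthful declaration and with two untruthful ones. I would expect one lie to garble the text silently and the other to make the parser give up, although the exact exception type depends on the parser implementation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class MismatchedEncodingDemo {

    // Hypothetical helper: returns the root element's text, or a short
    // note naming the exception if the parser rejected the bytes.
    static String parseText(byte[] xmlBytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            return "parse failed: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // \u00e9 is e-acute, a character outside US-ASCII.
        String saysUtf8 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>caf\u00e9</name>";
        String saysLatin1 = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>caf\u00e9</name>";

        // Truthful: declared UTF-8, bytes are UTF-8.
        System.out.println(parseText(saysUtf8.getBytes(StandardCharsets.UTF_8)));

        // Lie 1: declared ISO-8859-1, bytes are UTF-8. The two UTF-8 bytes of
        // e-acute are read as two separate characters (U+00C3, U+00A9).
        System.out.println(parseText(saysLatin1.getBytes(StandardCharsets.UTF_8)));

        // Lie 2: declared UTF-8, bytes are ISO-8859-1. The single byte 0xE9
        // is not a valid UTF-8 sequence here, so the parser reports an error.
        System.out.println(parseText(saysUtf8.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

In neither case did editing the declaration change a single byte of the data itself; only the parser's interpretation changed.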

However, if all of the data in the document is plain US-ASCII characters, then the US-ASCII, ISO-8859-1, and UTF-8 versions of the document will be identical byte for byte, because all three encode US-ASCII characters in the same way. UTF-16, on the other hand, encodes them completely differently, using two bytes for each of those characters.
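You can check this yourself by encoding an ASCII-only string several ways and comparing the bytes. A small sketch (the class name `AsciiBytesDemo`, the helper `sameBytes`, and the sample string are invented for the example):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AsciiBytesDemo {

    // Hypothetical helper: true if both charsets produce the same bytes for the text.
    static boolean sameBytes(String text, Charset a, Charset b) {
        return Arrays.equals(text.getBytes(a), text.getBytes(b));
    }

    public static void main(String[] args) {
        // Made-up fragment containing only US-ASCII characters.
        String xml = "<web-app><display-name>demo</display-name></web-app>";

        System.out.println(sameBytes(xml, StandardCharsets.US_ASCII, StandardCharsets.ISO_8859_1)); // true
        System.out.println(sameBytes(xml, StandardCharsets.US_ASCII, StandardCharsets.UTF_8));      // true
        System.out.println(sameBytes(xml, StandardCharsets.US_ASCII, StandardCharsets.UTF_16BE));   // false

        // UTF-16 needs two bytes for each of these characters.
        System.out.println(xml.getBytes(StandardCharsets.US_ASCII).length
                + " bytes vs " + xml.getBytes(StandardCharsets.UTF_16BE).length + " bytes");
    }
}
```

`UTF_16BE` is used here so that no byte order mark is added; with plain `UTF_16`, Java's encoder also prepends a two-byte mark.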

You should read this tutorial to learn more about Unicode in the context of XML.
 