Uploaded files have characters changed !!!

Derek Clarkson
Greenhorn

Joined: Mar 04, 2004
Posts: 25
Hi all, I'm not sure if this is the right forum for this. I'm trying to upload a file containing Unicode text to a Tomcat server. On my local system (Windows XP, Tomcat 4.1.29) it works just fine. But when I install on a test server (Linux, Tomcat 4.1.19) some of the characters in the uploads are changed. See below for an example. Can anyone explain this? It's got me stumped ;-(

Local Windows XP, Tomcat 4.1.29, from tomcat log. (correct)
Cleaner.showUnicode[194]: B unicode = \u0042
Cleaner.showUnicode[194]: a unicode = \u0061
Cleaner.showUnicode[194]: r unicode = \u0072
Cleaner.showUnicode[194]: r unicode = \u0072
Cleaner.showUnicode[194]: i unicode = \u0069
Cleaner.showUnicode[194]: unicode = \u0020
Cleaner.showUnicode[194]: G unicode = \u0047
Cleaner.showUnicode[194]: ò unicode = \u00f2
Cleaner.showUnicode[194]: t unicode = \u0074
Cleaner.showUnicode[194]: i unicode = \u0069
Cleaner.showUnicode[194]: c unicode = \u0063
Cleaner.cleanText[170]: text = Barri Gòtic

Test server, Linux, Apache, Tomcat 4.1.29, from tomcat log.
Cleaner.showUnicode[194]: B unicode = \u0042
Cleaner.showUnicode[194]: a unicode = \u0061
Cleaner.showUnicode[194]: r unicode = \u0072
Cleaner.showUnicode[194]: r unicode = \u0072
Cleaner.showUnicode[194]: i unicode = \u0069
Cleaner.showUnicode[194]: unicode = \u0020
Cleaner.showUnicode[194]: G unicode = \u0047
Cleaner.showUnicode[194]: ��� unicode = \ufffd
Cleaner.showUnicode[194]: t unicode = \u0074
Cleaner.showUnicode[194]: i unicode = \u0069
Cleaner.showUnicode[194]: c unicode = \u0063
Cleaner.cleanText[170]: text = Barri G���tic
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 39547
F2 is the correct byte value for the character in question only in ISO-8859-1 (or a superset of it such as Windows-1252). Apparently, that's the default encoding your local Tomcat runs with. The Linux server may use something else.
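To see why the byte value matters, here is a minimal sketch. The class and method names are made up for illustration, and the byte array is only a stand-in for the relevant part of your upload, assuming the file is stored as ISO-8859-1. It decodes the same bytes with two different encodings and prints the result in the style of your log:

```java
import java.nio.charset.Charset;

public class DecodeDemo {
    // Decodes raw bytes with the named charset and renders every char
    // as a four-digit hex escape, similar to the log output in the question.
    static String decodeToEscapes(byte[] raw, String charsetName) {
        String text = new String(raw, Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            String hex = Integer.toHexString(text.charAt(i));
            sb.append("\\u");
            for (int pad = hex.length(); pad < 4; pad++) {
                sb.append('0');
            }
            sb.append(hex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample: the word "Gotic" with byte F2 in place of the plain "o".
        byte[] raw = { 0x47, (byte) 0xF2, 0x74, 0x69, 0x63 };
        System.out.println(decodeToEscapes(raw, "ISO-8859-1"));
        System.out.println(decodeToEscapes(raw, "UTF-8"));
    }
}
```

The first line comes out with \u00f2 in second position, the second with \ufffd, because a lone F2 byte is not a valid UTF-8 sequence and Java substitutes the replacement character. If the JVM on the Linux box defaults to UTF-8, that would match what your test server logs.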
1) Check which encoding the server runs with (it can also be set in the shell scripts that are used to start Tomcat), and adjust it if it's not ISO-8859-1. UTF-8 would not work for this file as it stands: only the first 128 characters are byte-identical to ISO-8859-1, and a lone F2 byte is not valid UTF-8, which would explain the \ufffd in your log.
2) Make sure all JSP pages are generated with the same proper encoding (don't know offhand what the default is).
3) If the characters go anywhere outside Tomcat on the server (file system, database), those steps need to use the same encoding as well.
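One way to take the platform default out of the picture is to name the encoding explicitly wherever the upload bytes are turned into text. A rough sketch, assuming the file really is ISO-8859-1 and that your code reads it through an InputStream (UploadReader and readAll are hypothetical names, not anything from Tomcat):

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

public class UploadReader {
    // Reads the whole stream as text using the named charset
    // instead of whatever the JVM default happens to be.
    static String readAll(InputStream in, String charsetName) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, charsetName));
        StringBuilder text = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            text.append(buf, 0, n);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // This is the charset a reader created without a charset argument would use.
        System.out.println("JVM default charset: " + Charset.defaultCharset());

        // Stand-in for the uploaded file: the sample text as ISO-8859-1 bytes.
        byte[] upload = { 0x42, 0x61, 0x72, 0x72, 0x69, 0x20,
                          0x47, (byte) 0xF2, 0x74, 0x69, 0x63 };
        String text = readAll(new ByteArrayInputStream(upload), "ISO-8859-1");
        System.out.println("char 7 is U+00F2: " + (text.charAt(7) == (char) 0xF2));
    }
}
```

The first line printed is the thing to compare between your XP machine and the Linux server. Starting the JVM with -Dfile.encoding=ISO-8859-1 is a commonly used way to change that default, but an explicit charset in the code does not depend on how the server was started.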

