| Author |
what should the default charset be?
|
Mike Curwen
Ranch Hand
Joined: Feb 20, 2001
Posts: 3695
|
|
I'm having a bit of trouble with charsets and encodings. My problem is specifically related to JavaMail and webapps, but I'm posting in the general forum because I think my difficulty comes from a general misunderstanding of charsets and encodings.

I've got a website that I am in the process of i18n-enabling. The two languages are French and English. So far, I've had no real trouble with the French accented characters. Everything just appears to work the way I'd expect. In TextPad, I can see my é and è (and any other accents) fine. I view the file info and it tells me my document "code set" is ANSI. I'm not sure what "code set" is; perhaps they mean charset? Anyway, I upload the i18n properties file containing French words (and thus special characters) to my web server. I then use java.util.Locale to retrieve the localized text and it all works. The web pages show the é, etc.

Another part of the site I'm i18n'ing is the generated feedback emails. The body of each email contains static text as well as dynamic text. The static text is pulled out of the same properties file. When I pull these strings out of the file and send them through JavaMail, I get message bodies that look like:

"D?sol?s. Le syst?me d?extraction des mots de passe est pr?sentement hors d?usage."

when it should read:

"Désolés. Le système d'extraction des mots de passe est présentement hors d'usage."

Are the special characters not being decoded properly? Is it using the wrong charset? I view the message headers and observe:

Content-Type: text/plain; charset=ANSI_X3.4-1968

I was under the impression that UTF-8 was Java's 'default'? Investigating my System properties programmatically, I discover:

file.encoding = ANSI_X3.4-1968

Hmm, the same as my email. To make matters worse, there are other emails the system generates that have a different header (just text/plain, with no charset specified), and *these* emails manage to output the correct special characters.

Where might my encodings/charsets be off?
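For anyone trying to reproduce this, the symptom can be shown without JavaMail at all. The following is only a minimal sketch using the standard library: the class name CharsetProbe and the helper roundTrip are made up for illustration, and it relies on ANSI_X3.4-1968 being a registered alias of US-ASCII, which cannot represent accented characters. It prints the JVM's reported defaults and then pushes an accented string through that charset and through UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetProbe {

    // Hypothetical helper: encode the text to bytes with the given charset and
    // decode those bytes again, as happens when a String is written out using
    // one charset and read back with the same one.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // "Désolés", written with Unicode escapes so that the encoding of this
        // source file cannot affect the result.
        String sample = "D\u00e9sol\u00e9s";

        // What the JVM picked up from its environment.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());

        // The name seen in the mail header is an alias; this prints the
        // canonical name Java uses for it.
        Charset reported = Charset.forName("ANSI_X3.4-1968");
        System.out.println("ANSI_X3.4-1968  = " + reported.name());

        // Characters that the charset cannot encode are replaced with '?'.
        System.out.println(roundTrip(sample, reported)); // prints "D?sol?s"

        // UTF-8 can encode every character, so the string survives intact.
        // How it is displayed still depends on the console's own encoding.
        System.out.println(roundTrip(sample, StandardCharsets.UTF_8));
    }
}
```

If the third line of output reads US-ASCII on the server as well, that would be consistent with the question marks in the email bodies: the text was encoded with a charset that has no accented characters, so each one was replaced during encoding.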