I'm having a bit of trouble with charsets and encodings. My problem is specifically related to JavaMail and webapps, but I'm posting in the general forum because I think my difficulty comes from a general misunderstanding of charsets and encodings.
I've got a website that I am in the process of i18n-enabling.
The two languages are French and English. So far, I've had no real trouble with the French accented characters. Everything just appears to work the way I'd expect.
In TextPad, I can see my é and è (and any other accents) fine. I view the file info and it tells me my document "code set" is ANSI. Not sure what 'code set' is; perhaps they mean charset?
Anyway, I upload the i18n properties file containing French words (and thus special characters) to my web server. I then use java.util.Locale to retrieve the localized text and it all works. The web pages show the é, etc.
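A possible reason the web pages work: as far as I know, Properties.load(InputStream) always decodes the file as ISO-8859-1, and TextPad's "ANSI" most likely means a Windows code page that agrees with ISO-8859-1 for the French accented letters. Here is a minimal sketch of that behaviour, assuming the file is loaded through java.util.Properties (the class name and the error.sorry key are made up for illustration):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class PropertiesEncodingSketch {
    // Loads properties from raw bytes. Properties.load(InputStream) decodes
    // the stream as ISO-8859-1, whatever file.encoding happens to be.
    public static String loadValue(byte[] fileBytes, String key) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file independent of the compiler's encoding.
        String line = "error.sorry=D\u00e9sol\u00e9s"; // hypothetical key and value
        String expected = "D\u00e9sol\u00e9s";

        // Same text saved two ways: as ISO-8859-1 ("ANSI"-like) and as UTF-8.
        String fromLatin1 = loadValue(line.getBytes(StandardCharsets.ISO_8859_1), "error.sorry");
        String fromUtf8 = loadValue(line.getBytes(StandardCharsets.UTF_8), "error.sorry");

        System.out.println(fromLatin1.equals(expected)); // true: accents survive
        System.out.println(fromUtf8.equals(expected));   // false: each accent becomes two wrong characters
    }
}
```

If this is what is going on, the strings are already correct once they are in memory, and the damage happens later, when they are turned back into bytes.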
Another part of the site I'm i18n'ing is the generated/feedback emails. The bodies of the emails contain static as well as dynamic text. The static text is pulled out of the same properties file. When I pull these strings out of the file and send them through JavaMail, I get message bodies that look like:
"D?sol?s. Le syst?me d?extraction des mots de passe est pr?sentement hors d?usage."
When it should read: "Désolés. Le système d'extraction des mots de passe est présentement hors d'usage."
The special characters are not being properly decoded? It's using the wrong charset?
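The "?" pattern is what Java produces when a String is encoded into a charset that cannot represent some of its characters. A minimal sketch using only the standard library (the class and method names are made up for illustration; this is not what JavaMail does internally, just the same kind of lossy encoding):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class AsciiLossSketch {
    // Encodes the text into bytes with the given charset, then decodes it back.
    // String.getBytes(Charset) substitutes the charset's replacement byte
    // for every character it cannot map.
    public static String roundTrip(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        return new String(encoded, charset);
    }

    public static void main(String[] args) {
        // Unicode escapes: \u00e9 is e-acute, \u00e8 is e-grave.
        String original = "D\u00e9sol\u00e9s. Le syst\u00e8me";

        // US-ASCII has no accented letters, so each one is replaced by '?'.
        System.out.println(roundTrip(original, StandardCharsets.US_ASCII)); // D?sol?s. Le syst?me

        // UTF-8 can represent every character, so the text survives intact.
        System.out.println(roundTrip(original, StandardCharsets.UTF_8).equals(original)); // true
    }
}
```

So the characters are probably not being decoded wrongly by the mail client; they may already be "?" by the time the message leaves the server.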
I view the message headers, and observe: Content-Type: text/plain; charset=ANSI_X3.4-1968
I was under the impression that UTF-8 was Java's 'default'?
Investigating my System properties programmatically, I discover:
file.encoding = ANSI_X3.4-1968
Hmm.. the same as my email.
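For reference, ANSI_X3.4-1968 is a registered alias for plain US-ASCII, and before JDK 18 the JVM took its default charset from the platform locale rather than assuming UTF-8. A small sketch for checking what the JVM actually uses (the class and method names are made up for illustration):

```java
import java.nio.charset.Charset;

public class DefaultCharsetSketch {
    // Resolves a charset name or alias to the canonical name Java uses for it.
    public static String canonicalName(String nameOrAlias) {
        return Charset.forName(nameOrAlias).name();
    }

    public static void main(String[] args) {
        // What the JVM was started with; the values depend on the machine and JDK version.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset().name());

        // The name from the mail header is just another spelling of US-ASCII.
        System.out.println(canonicalName("ANSI_X3.4-1968")); // US-ASCII
    }
}
```

If the default really is US-ASCII, two things that might be worth trying are starting the JVM with -Dfile.encoding=UTF-8, or naming the charset explicitly when building the message (JavaMail's MimeMessage.setText(String, String) takes a charset name as its second argument).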
To make matters worse, there are other emails the system generates that have a different header (just text/plain, with no charset specified), and *these* emails manage to output the correct special characters.