I have a JSP that contains an HTML form which posts back to itself. This form needs to accept international text, German and Japanese for example. So, I have used a page directive to set the contentType of the JSP to "text/html; charset=UTF-8" and characterEncoding to "UTF8". I'm also calling request.setCharacterEncoding("UTF8") when handling the post-back, before any other method is called on the request object. This is being deployed to Tomcat 5.0.

The problem I'm having is that the request parameters are still being read as Latin-1. (Meaning that if I enter the following German text into the HTML form, "Schröder nennt Rücktrittsgerüchte »wildeste Spekulation«", then request.getParameter returns "SchrÃ¶der nennt RÃ¼cktrittsgerÃ¼chte Â»wildeste SpekulationÂ«".) I've tried setting every character encoding/content type related setting on the JSP, the request, the response, and Tomcat that I can dig up, but with no success.

After becoming completely frustrated with the problem, I created a servlet to handle the post, as I should have to begin with, just to see what would happen. To my surprise, after copying the exact same code from my JSP into the servlet's doPost and having the form post to the servlet, the request parameters come across perfectly fine as UTF-8 encoded Strings.

Obviously, I can just have my forms post to a servlet, or I can explicitly convert the strings from Latin-1 to UTF-8. But I would really like to understand why the JSP does not handle character encoding for request parameters in the same way that the servlet does. Is this a known issue? If anyone has any insight or ideas, they would be greatly appreciated. The code for both the JSP and the servlet is posted below, along with the console output.
Servlet Output:
Request Encoding: UTF8
Text Value: Schröder nennt Rücktrittsgerüchte »wildeste Spekulation«
Text From Latin1 to UTF8: Schr.der nennt R.cktrittsger.chte .wildeste Spekulation.
Input Stream Encoding: UTF8
Output Stream Encoding: UTF8
System File Encoding Property: UTF-8
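For readers trying to reproduce the symptom outside a container, here is a minimal sketch of what "read as Latin-1" means at the byte level. It is not the poster's code: the class and method names (MojibakeDemo, misdecode, repair) are made up for illustration, and it only assumes that the browser sent UTF-8 bytes which were then decoded as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not taken from the original post.
class MojibakeDemo {

    // Simulates the faulty read: UTF-8 bytes decoded as ISO-8859-1.
    static String misdecode(String original) {
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Reverses the faulty read: recover the raw bytes, decode them as UTF-8.
    static String repair(String garbled) {
        byte[] rawBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file itself encoding-independent.
        String original = "Schr\u00F6der";       // "Schröder"
        String garbled = misdecode(original);    // two characters where "ö" was
        System.out.println(garbled.length() - original.length());
        System.out.println(repair(garbled).equals(original));
    }
}
```

Each non-ASCII character becomes two characters after the wrong decode, which matches the doubled garbage in the JSP case. The repair step only makes sense when the text really was misdecoded in this way.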
[BSouther: Added UBB CODE tags] [ April 25, 2007: Message edited by: Ben Souther ]
I too have faced the same kind of issue. I am uploading a text file containing characters in the "ISO8859_1" encoding. This text is then placed in a textarea on my JSP. Up to this point there are no issues, but as soon as I submit the form the text is corrupted and contains other characters.
So I used
This works fine.
But my questions are:
1) As this hardcodes the encoding, it cannot be used for other encodings. Correct me if I am wrong, I am not sure.
2) Is it a final solution to this problem?
3) Can we identify the encoding of the text, so that we can put a condition before the above statement?
4) I also used another piece of code, but it does not work and shows the same issue on Linux.
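On question 3, the encoding of arbitrary text cannot be identified with certainty, but it is possible to test whether a byte sequence is well-formed UTF-8. The following is a rough sketch of such a check using a strict decoder. The class and method names (Utf8Check, isValidUtf8) are invented for this example, and a positive result only means the bytes could be UTF-8, not that they must be.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only.
class Utf8Check {

    // Returns true if the bytes decode as UTF-8 without any malformed sequence.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)        // fail instead of substituting
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "Schr\u00F6der"; // "Schröder"
        System.out.println(isValidUtf8(text.getBytes(StandardCharsets.UTF_8)));
        System.out.println(isValidUtf8(text.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

Pure ASCII text passes the check in either encoding, so the test is only informative when the input contains non-ASCII bytes.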
The encodings are called "UTF-8" and "ISO-8859-1", respectively, not "UTF8" and "ISO8859-1" or "ISO-8859_1". Make sure that that's what you use; maybe it makes a difference.
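As a side note on the names: within Java itself, the older spellings "UTF8" and "ISO8859_1" are accepted as aliases of the standard names, so inside Java code they resolve to the same charsets. The sketch below shows how to check what a given name resolves to. The class and method names (CharsetNames, canonicalName) are made up for this example, and it says nothing about how a browser or an HTTP header parser treats those spellings, where the standard names remain the safe choice.

```java
import java.nio.charset.Charset;

// Hypothetical helper class for illustration only.
class CharsetNames {

    // Resolves an encoding name or alias to Java's canonical charset name.
    // Throws an unchecked exception if the name is illegal or unsupported.
    static String canonicalName(String nameOrAlias) {
        return Charset.forName(nameOrAlias).name();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"UTF8", "UTF-8", "ISO8859_1", "ISO-8859-1"}) {
            System.out.println(name + " -> " + canonicalName(name));
        }
    }
}
```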
Using a JSP should not make a difference compared to using a servlet, and in any case it's just what the HTTP header uses to tell the browser what to do. If it's incorrect, then you should be able to switch the encoding manually in the browser and get a proper page view. If you don't, then there is something the matter with the actual page content, and not just with the encoding setting.
I applied the spelling corrections to the encoding names that you suggested, but I still have the same issue. My concern is that my application supports different languages (Swedish, Italian, German, English, Spanish and French), so I cannot use the above code in the Action, as it seems to be hardcoding and may affect the other languages (I am not sure). My problem is that when I submit the form, the text content in the textarea gets changed. I tried different request and response settings, but I didn't succeed. I need concrete information showing that strCreative.getBytes("ISO-8859-1") will not cause problems if text in another encoding is used.
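One way to get that kind of concrete information is to run the conversion against different inputs. The sketch below assumes the hardcoded conversion has the form new String(s.getBytes("ISO-8859-1"), "UTF-8"); the exact statement from the earlier post is not shown in the thread, so that form is an assumption, and the class and method names (HardcodedConversionCheck, latin1ToUtf8) are invented for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the conversion form is an assumption, see above.
class HardcodedConversionCheck {

    // The assumed hardcoded conversion: take the Latin-1 bytes, decode as UTF-8.
    static String latin1ToUtf8(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Case 1: UTF-8 text that was misread as Latin-1 ("RÃ¼cktritt"); it is repaired.
        System.out.println(latin1ToUtf8("R\u00C3\u00BCcktritt").equals("R\u00FCcktritt"));

        // Case 2: plain ASCII passes through unchanged.
        System.out.println(latin1ToUtf8("Spekulation").equals("Spekulation"));

        // Case 3: characters outside Latin-1 (here three Japanese characters)
        // cannot be encoded by getBytes and are replaced with '?'.
        System.out.println(latin1ToUtf8("\u65E5\u672C\u8A9E"));

        // Case 4: text that was already decoded correctly does not survive either.
        System.out.println(latin1ToUtf8("R\u00FCcktritt").equals("R\u00FCcktritt"));
    }
}
```

So the conversion is only safe when every incoming string really is UTF-8 that was misread as Latin-1. For the six Western European languages listed that may hold consistently, but correctly decoded text or text with characters outside Latin-1 would be damaged by it.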