Hi - I need to get Arabic text into a Java String. I have saved the Arabic as UTF-8 and am wondering about the correct way to get that into a String. Googling gives me lots of suggestions, so I am just wondering which is the correct one.
I am having some difficulty with UTF-8 encoded characters in Java.
My XML has a question which contains Cyrillic characters. My Java servlet renders this as HTML with a form for the reply. The HTML produced displays OK in the browser (the content type on the servlet response has to be set to "text/html; charset=UTF-8" for this to work).
I have to send Cyrillic characters back in the response to the question, in a text field on the form. The browser is sending back a byte stream (which I am printing here as hex): d0b3d0bed180d0bed0b4 (this is a Cyrillic word correctly encoded as UTF-8).
However, on collecting the response (using request.getParameterValues(fieldname)) the servlet returns the byte stream: d0b3d0bed13fd0bed0b4. A mistake in the sixth byte (80 has become 3f, which is the ASCII code for "?")!
Has anyone heard of this problem? I suspect the problem is in the Java UTF-8 converter.
I now know the answer, thanks to Bruno Van Haetsdaele.
Before calling request.getParameterValues(fieldname), one should call request.setCharacterEncoding("UTF-8"). Otherwise the container decodes the submitted bytes with its default encoding rather than UTF-8.
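For anyone who wants to see why the call matters, here is a minimal stand-alone sketch that needs no servlet container. It takes the exact bytes quoted above and decodes them twice: once as UTF-8, and once as ISO-8859-1, which is assumed here to be the container's fallback when no request encoding has been set (the actual default depends on the container and its configuration). The class and method names are made up for illustration, and the sketch does not try to reproduce the exact 3f byte from the post.

```java
import java.nio.charset.StandardCharsets;

public class FormDecodingDemo {
    // The bytes the browser sent, as quoted in the post: d0b3d0bed180d0bed0b4
    static final byte[] SENT = {
        (byte) 0xd0, (byte) 0xb3, (byte) 0xd0, (byte) 0xbe, (byte) 0xd1,
        (byte) 0x80, (byte) 0xd0, (byte) 0xbe, (byte) 0xd0, (byte) 0xb4
    };

    // What the servlet should see once the request encoding is set to UTF-8.
    public static String decodeAsUtf8() {
        return new String(SENT, StandardCharsets.UTF_8);
    }

    // What it sees if the bytes are decoded one byte per character instead
    // (ISO-8859-1 is an assumed fallback, not something stated in the thread).
    public static String decodeAsLatin1() {
        return new String(SENT, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Five Cyrillic letters, two bytes each.
        System.out.println("UTF-8 decode length:      " + decodeAsUtf8().length());
        // Ten unrelated characters, one per byte.
        System.out.println("ISO-8859-1 decode length: " + decodeAsLatin1().length());
    }
}
```

Once the bytes have been decoded with the wrong charset, any later conversion step can replace characters it cannot map with "?" (0x3f), which is consistent with the damaged byte in the post. In the servlet itself the fix stays the one-liner above: call request.setCharacterEncoding("UTF-8") before the first parameter is read.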
Originally posted by Lucy Sommerman: just to check.
And the string itself will be UTF-8 though, and not converted to UTF-16? This is plugging into something else that will not handle UTF-16 - thanks
A Java String is a sequence of 16-bit chars (UTF-16 code units), so in memory it is never UTF-8. You can (and probably need to) convert the String to a byte array, or write it to a stream, to plug it into "something else". In both cases the character encoding can be specified.
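A minimal sketch of both conversions, using only the standard library. The class name, method names, and the sample Arabic word are made up for illustration; the word is written with Unicode escapes so the example does not depend on the encoding of the source file.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class Utf8RoundTrip {
    // String (UTF-16 in memory) -> UTF-8 bytes.
    public static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // UTF-8 bytes (e.g. read from a file saved as UTF-8) -> String.
    public static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Same conversion through a stream: the Writer is told which encoding to use.
    public static byte[] writeUtf8(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            w.write(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String arabic = "\u0633\u0644\u0627\u0645"; // four Arabic letters
        byte[] utf8 = toUtf8(arabic);
        System.out.println("chars: " + arabic.length() + ", UTF-8 bytes: " + utf8.length);
        System.out.println("round trip ok: " + arabic.equals(fromUtf8(utf8)));
    }
}
```

Always passing the charset explicitly avoids depending on the platform default encoding, which differs between machines. The UTF-8 byte array (or stream) is what should be handed to the component that cannot handle UTF-16.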