I am using an HttpURLConnection to request a page from a server that has Polish text in it.
For example, the page contains "Sprawdzenie nośności z opiekunem", but when I print the response to the console, I get "Sprawdzenie no?no?ci z opiekunem".
This is how I am making the request:
The response page is encoded in ISO-8859-2 (Latin-2), a charset that covers Polish. This is how I am reading the response:
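For readers following along, here is a minimal sketch of reading a response body with an explicit charset. It is not the original poster's code: the class name Latin2ResponseReader and the helper readAll are made up for illustration, and a ByteArrayInputStream stands in for connection.getInputStream() so the example runs without a server.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper class, not taken from the original post.
public class Latin2ResponseReader {

    // Reads the whole stream as text, decoding the bytes with the given
    // charset instead of relying on the platform default.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset latin2 = Charset.forName("ISO-8859-2");
        // Stand-in for connection.getInputStream(): the sample text
        // encoded as ISO-8859-2 bytes.
        byte[] body = "Sprawdzenie no\u015Bno\u015Bci z opiekunem".getBytes(latin2);
        String text = readAll(new ByteArrayInputStream(body), latin2);
        // Print the code point of the first accented letter rather than the
        // letter itself, so the check does not depend on the console encoding.
        System.out.println(Integer.toHexString(text.charAt(14)));
    }
}
```

If the code point printed is U+015B, the string was decoded correctly and any remaining "?" comes from the output side, not from reading.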
Any help or suggestions would be greatly appreciated.
Please let me know if you need any more information (firstname.lastname@example.org)
Also, I have tried using the java.nio.charset.CharsetDecoder to decode the page. I read the stream in as bytes and placed the bytes into a ByteBuffer, which didn't work.
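A sketch of what the CharsetDecoder approach could look like is shown below, with the caveat that the original attempt is not shown, so this is only a guess at its shape. The class DecoderSketch and method decode are hypothetical names. One common pitfall with a hand-filled ByteBuffer is forgetting to call flip() before decoding; ByteBuffer.wrap avoids that because the wrapped buffer is already positioned for reading.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical example class, not taken from the original post.
public class DecoderSketch {

    // Decodes raw bytes with the named charset. Bad input is reported as an
    // exception instead of being silently replaced.
    static String decode(byte[] raw, String charsetName) {
        CharsetDecoder decoder = Charset.forName(charsetName).newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            // wrap() yields a buffer ready for reading, so no flip() is needed.
            CharBuffer chars = decoder.decode(ByteBuffer.wrap(raw));
            return chars.toString();
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // In ISO-8859-2, byte 0xB6 is the letter s with acute (U+015B).
        byte[] raw = { 'n', 'o', (byte) 0xB6 };
        String text = decode(raw, "ISO-8859-2");
        System.out.println(Integer.toHexString(text.charAt(2)));
    }
}
```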
You may be reading the data correctly (although you could just write new InputStreamReader(istream,"ISO-8859-2")), but not getting it to display on the console. Try displaying the data you receive with the GUI:
posted 11 years ago
I would just like to say thank you for your reply... I appreciate it.
I am using RAD as my IDE, and I put a breakpoint in the code right before the string buffer is printed to the console. When execution stops at the breakpoint, I check the contents of the string buffer, and it also shows ? in the Polish text. I suspect that when the text from the in.readLine() method is assigned to the String inputLine, it is being converted to UTF-8 instead of keeping its original charset encoding.
There won’t be any character conversion happening when assigning strings, since Java only copies a reference to the String object, and all Java strings are encoded in UTF-16 anyway. The InputStreamReader does the initial conversion from ISO-8859-2 to UTF-16, and the System.out.println() converts from UTF-16 back to the encoding in the file.encoding system property.
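The output half of that round trip can be simulated without a console. The sketch below encodes a string with a given charset and decodes it again, which approximates what a console using that charset would show. The class ConsoleEncodingSketch and method asSeenOnConsole are hypothetical names, and windows-1252 is used as an example of a console encoding that lacks the Polish letters.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical example class for illustration only.
public class ConsoleEncodingSketch {

    // Encodes the text with the console's charset and decodes it again.
    // String.getBytes(Charset) substitutes the charset's replacement byte
    // for characters it cannot represent.
    static String asSeenOnConsole(String text, Charset consoleCharset) {
        byte[] written = text.getBytes(consoleCharset);
        return new String(written, consoleCharset);
    }

    public static void main(String[] args) {
        String text = "no\u015Bno\u015Bci";
        // windows-1252 has no mapping for U+015B, so it is replaced.
        String lossy = asSeenOnConsole(text, Charset.forName("windows-1252"));
        // UTF-8 can represent every character, so the text survives intact.
        String intact = asSeenOnConsole(text, StandardCharsets.UTF_8);
        System.out.println(lossy.equals("no?no?ci"));
        System.out.println(intact.equals(text));
    }
}
```

The String itself is unchanged in both cases; only the bytes written to the console differ.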
I wrote a short test program, and I can’t reproduce your problem. Can you see whether this works for you?
In this case, I get ? instead of ś and ż on my console, because its encoding is Cp1252, but JOptionPane displays the string correctly.
I was able to get the characters to display in my RAD console by changing the JVM encoding to UTF-8 and changing the console font to one that includes the Polish characters.
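How the encoding was changed in RAD is not stated. One common way to set it for a whole JVM is the -Dfile.encoding=UTF-8 option. As an alternative inside the program, output can be wrapped in a PrintStream with an explicit encoding. The sketch below assumes the console itself interprets bytes as UTF-8; the class Utf8ConsoleSketch and method utf8Stream are hypothetical names.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical example class for illustration only.
public class Utf8ConsoleSketch {

    // Wraps a stream so that printed text is always encoded as UTF-8,
    // regardless of the JVM's default encoding.
    static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8Stream(System.out);
        // Displays correctly only if the console decodes UTF-8 and its font
        // has glyphs for the Polish letters.
        out.println("Sprawdzenie no\u015Bno\u015Bci z opiekunem");
    }
}
```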
Thanks for your replies, much appreciated!