I am trying to use HttpClient to get content from a URL that returns XML.
It works properly for English, but for Thai or Chinese characters I get output like <Text>à¸à¸²à¸¡à¸ªà¸à¸¸à¸¥</Text>
instead of the proper Thai characters.
I tried to set the encoding explicitly,
but it is still not working.
Character encoding problems can be confusing to solve, because the problem can be in one of many places. For example, it could be at the point where you read the data and convert it to characters, or at the point where you display the data.
Check every step of the process, from receiving to displaying the data, and make sure you're aware of what character encoding is used at each step.
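To make "check every step" a bit more concrete, here is a minimal sketch of how output like the one in the question can arise. It assumes (this is a guess, not something the original poster confirmed) that the server sends UTF-8 bytes and that they are decoded as ISO-8859-1 somewhere along the way. The class and method names are made up for illustration, and the Thai letters are written as Unicode escapes so the source file's own encoding does not matter.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not taken from the original post.
class MojibakeDemo {

    // Encode the text as UTF-8, then (wrongly) decode those bytes as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Undo the mistake: recover the raw bytes and decode them as UTF-8.
    // This round-trips because ISO-8859-1 maps every byte to exactly one char.
    static String repair(String garbled) {
        byte[] rawBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String thai = "\u0E32\u0E21\u0E2A"; // three Thai letters, chosen arbitrarily
        String garbled = misdecode(thai);
        // Each Thai letter turns into three Latin-1 characters beginning with U+00E0 U+00B8.
        System.out.println(garbled.length());
        System.out.println(repair(garbled).equals(thai));
    }
}
```

If the garbled text you see starts every character with the same two-symbol prefix, as in the question, that is a hint that multi-byte UTF-8 data was read with a single-byte encoding at some step.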
How are you displaying the data? Note that the Windows command prompt can't deal properly with different character encodings, so if you print stuff with System.out.println() you might see strange characters instead of the Thai letters you expect.
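One way to take the console out of the picture is to print the Unicode code points of the string rather than the string itself, since that output is plain ASCII. The sketch below does that, and also shows wrapping System.out in a PrintStream with an explicit encoding. Whether the second part displays correctly still depends on the terminal and its font, so treat it as something to try, not a guaranteed fix. The class and helper names are hypothetical.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper class for checking a string independently of the console.
class ConsoleCheck {

    // Render each code point as U+XXXX so the result contains only ASCII.
    static String toCodePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String text = "\u0E32\u0E21"; // two Thai letters, chosen arbitrarily

        // Safe on any console: shows what is really in the String.
        System.out.println(toCodePoints(text));

        // Write the characters as UTF-8 bytes; the terminal must also expect UTF-8.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(text);
    }
}
```

If the code points are in the Thai block (U+0E00 to U+0E7F) but the console shows garbage, the data is fine and only the display is wrong. If the code points are things like U+00E0 and U+00B8, the damage happened earlier, when the response was decoded.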
It appears that you're trying to download XML using HttpClient. Remember that the encoding you should use to interpret the data in the response is the encoding of the HTTP response and not the encoding declared in the XML document. So defaulting the encoding to UTF-8 might be the wrong thing to do. When I encountered this sort of problem my fix was to find the encoding used in the response -- there's a method in HttpClient which allows you to do that.
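The thread does not say which HttpClient version is in use, so here is a rough sketch of the idea using only the JDK: take the value of the response's Content-Type header, pull out its charset parameter, and decode the raw body bytes with that. How you obtain the header value and the body bytes depends on your HttpClient version. The class name, method names, and the caller-supplied fallback are all assumptions made for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not part of any HttpClient API.
class ResponseCharset {

    // Extract the charset parameter from a Content-Type header value,
    // e.g. "text/xml; charset=UTF-8". Returns the fallback if the header
    // is missing, has no charset parameter, or names an unknown charset.
    static Charset charsetFromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String param = part.trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = param.substring("charset=".length()).trim();
                // The value may be quoted: charset="utf-8"
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers both illegal and unsupported charset names.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    // Decode the raw response bytes using the charset the response declares.
    static String decodeBody(byte[] body, String contentType, Charset fallback) {
        return new String(body, charsetFromContentType(contentType, fallback));
    }

    public static void main(String[] args) {
        Charset cs = charsetFromContentType("text/xml; charset=UTF-8", StandardCharsets.ISO_8859_1);
        System.out.println(cs.name());
    }
}
```

The important part is to work from the raw bytes of the response. Once the body has been turned into a String with the wrong charset, setting an encoding afterwards will not bring the original characters back reliably.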