
issue with org.apache.http.client.HttpClient

 
Amitosh Mishra
Ranch Hand
Posts: 49
Hi all,
I am trying to use HttpClient to fetch content from a URL that returns XML.
It works properly for English text, but for Thai or Chinese characters I get output like <Text>นามสกุล</Text>
instead of the proper Thai characters.
I tried setting the encoding like this:
httpclient.getParams().setParameter("http.protocol.content-charset", "UTF-8");
but it does not work.

Any help would be appreciated.

Thanks,
Amitosh
 
Jesper de Jong
Java Cowboy
Saloon Keeper
Posts: 15207
Character encoding problems can be confusing to solve, because the problem can be in one of many places. For example, it could be at the point where you read the data and convert it to characters, or at the point where you display the data.

Check every step of the process, from receiving to displaying the data, and make sure you're aware of what character encoding is used at each step.

How are you displaying the data? Note that the Windows command prompt can't deal properly with different character encodings, so if you print stuff with System.out.println() you might see strange characters instead of the Thai letters you expect.
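
To separate a decoding problem from a display problem, it can help to inspect the characters without relying on the console at all. The following is a minimal sketch using only the JDK; the class name EncodingCheck and its helper methods are made up for illustration and are not part of HttpClient. It decodes the same bytes with two charsets and prints the resulting UTF-16 code units in hex, which any console can show.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {

    // Decode raw bytes using the given charset.
    public static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    // Render each UTF-16 code unit as U+XXXX so the text can be checked
    // on a console that cannot display the characters themselves.
    public static String toCodeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The Thai word from the question, written as Unicode escapes.
        String original = "\u0e19\u0e32\u0e21\u0e2a\u0e01\u0e38\u0e25";
        // Simulate the bytes a server would send if it encoded the text as UTF-8.
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);

        String right = decode(wire, StandardCharsets.UTF_8);
        String wrong = decode(wire, StandardCharsets.ISO_8859_1);

        // Correct decoding yields 7 code units in the Thai block (U+0E00 to U+0E7F).
        System.out.println(toCodeUnits(right));
        // Wrong decoding yields 21 code units, one per byte, all below U+0100.
        System.out.println(toCodeUnits(wrong));
    }
}
```

If the hex values are in the Thai block but the console still shows strange characters, the data was decoded correctly and the problem is in the display step.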
 
Paul Clapham
Sheriff
Posts: 20769
It appears that you're trying to download XML using HttpClient. Remember that the encoding you should use to interpret the data in the response is the encoding of the HTTP response and not the encoding declared in the XML document. So defaulting the encoding to UTF-8 might be the wrong thing to do. When I encountered this sort of problem my fix was to find the encoding used in the response -- there's a method in HttpClient which allows you to do that.
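
As a rough illustration of "find the encoding used in the response", here is a sketch that pulls the charset parameter out of a Content-Type header value and falls back to a default when none is declared. It uses only the JDK; the class ResponseCharset and its methods are hypothetical names, not HttpClient API. In HttpClient 4.x, ContentType.getOrDefault(entity).getCharset() appears to serve the same purpose, but check the documentation for the version you use.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class ResponseCharset {

    // Return the charset named in a Content-Type header value such as
    // "text/xml; charset=UTF-8", or the fallback if it is missing or unsupported.
    public static Charset fromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim();
                // The parameter value may be quoted, e.g. charset="utf-8".
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    // Decode a response body using the declared charset, assuming UTF-8 otherwise.
    public static String decodeBody(byte[] body, String contentType) {
        return new String(body, fromContentType(contentType, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(fromContentType("text/xml; charset=ISO-8859-1", StandardCharsets.UTF_8));
        System.out.println(fromContentType("application/xml", StandardCharsets.UTF_8));
    }
}
```

The UTF-8 fallback here is an assumption chosen for the example; the right default depends on what the server actually sends.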
 
Amitosh Mishra
Ranch Hand
Posts: 49
Thanks all for your help.

I resolved the issue by using:

HttpResponse response = client.execute(httpget);
HttpEntity entity = response.getEntity();
// "UTF-8" is the default charset, used if the response does not declare one
String result = EntityUtils.toString(entity, "UTF-8");

Regards,
Amitosh

 