issue with org.apache.http.client.HttpClient

Amitosh Mishra
Ranch Hand

Joined: Feb 11, 2010
Posts: 49
Hi all,
I am trying to use HttpClient to fetch content from a URL that returns XML.
it is working properly for english but for Thai or Chinese character i am getting message like <Text>นามสกุล</Text>
instead of the proper Thai characters.
I tried to set the encoding like this:
httpclient.getParams().setParameter("http.protocol.content-charset", "UTF-8");
but it does not work.

Any help would be appreciated.

Thanks,
Amitosh
Jesper de Jong
Java Cowboy
Saloon Keeper

Joined: Aug 16, 2005
Posts: 14274
    

Character encoding problems can be confusing to solve, because the problem can be in one of many places. For example, it could be at the point where you read the data and convert it to characters, or at the point where you display the data.

Check every step of the process, from receiving to displaying the data, and make sure you're aware of what character encoding is used at each step.

How are you displaying the data? Note that the Windows command prompt can't deal properly with different character encodings, so if you print stuff with System.out.println() you might see strange characters instead of the Thai letters you expect.
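
One way to tell "the data is wrong" apart from "the display is wrong" is to print the code points of the string instead of the string itself. The sketch below assumes Java 8 or later. The class name EncodingCheck and the helper describe are made up for illustration, and the sample text is three Thai letters written as Unicode escapes so that the encoding of the source file does not matter.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class EncodingCheck {

    // Render every code point as U+XXXX. The result is plain ASCII,
    // so it reads the same on any console, whatever its encoding.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String text = "\u0E19\u0E32\u0E21"; // three Thai letters

        // Step 1: check the data itself, independent of the console.
        System.out.println(describe(text));

        // Step 2: print through a stream with an explicit encoding
        // instead of relying on the platform default.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(text);
    }
}
```

If the code points are the expected ones but the second line still looks wrong, the string is fine and the problem is in the display step (console or font), not in the download.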


Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18675
    

It appears that you're trying to download XML using HttpClient. Remember that the encoding you should use to interpret the data in the response is the encoding of the HTTP response and not the encoding declared in the XML document. So defaulting the encoding to UTF-8 might be the wrong thing to do. When I encountered this sort of problem my fix was to find the encoding used in the response -- there's a method in HttpClient which allows you to do that.
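
The idea of taking the charset from the response, rather than assuming one, can be sketched with the JDK alone. Here charsetFromContentType is a hypothetical helper and not part of HttpClient. It reads the charset parameter of a Content-Type header value and falls back to a default supplied by the caller. The header strings are invented examples.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ResponseCharset {

    // Hypothetical helper: extract the charset parameter from a
    // Content-Type header value such as "text/xml; charset=UTF-8".
    // Returns the fallback when the header or the parameter is missing,
    // or when the named charset is not known to this JVM.
    static Charset charsetFromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers both illegal and unsupported charset names.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        // Pretend these bytes are the raw response body.
        byte[] body = "\u0E19\u0E32\u0E21".getBytes(StandardCharsets.UTF_8);

        Charset cs = charsetFromContentType("text/xml; charset=UTF-8",
                                            StandardCharsets.ISO_8859_1);
        String text = new String(body, cs);
        System.out.println(cs.name() + ", " + text.length() + " characters");
    }
}
```

The fallback is an explicit argument so that the caller decides what to do when the server declares nothing, instead of silently inheriting a library default.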
Amitosh Mishra
Ranch Hand

Joined: Feb 11, 2010
Posts: 49
Thanks all for your help.

I resolved the issue by using:

HttpResponse response = client.execute(httpget);
HttpEntity entity = response.getEntity();
// Decode the body as UTF-8 when the response does not declare a charset
String result = EntityUtils.toString(entity, "UTF-8");

regards
Amitosh
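
The garbled text in the original question is what UTF-8 bytes look like when they are decoded with a single-byte charset. Below is a minimal reproduction using only the JDK, assuming the body was sent as UTF-8 and decoded as ISO-8859-1. The class name MojibakeDemo, the helper decode, and the sample element are illustrative.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Decode raw bytes with an explicitly chosen charset.
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // Thai text written as Unicode escapes, wrapped in an XML element.
        String original = "<Text>\u0E19\u0E32\u0E21\u0E2A\u0E01\u0E38\u0E25</Text>";

        // What travels over the wire: each Thai letter becomes three bytes.
        byte[] body = original.getBytes(StandardCharsets.UTF_8);

        // Wrong charset: every byte turns into its own character.
        String wrong = decode(body, StandardCharsets.ISO_8859_1);

        // Right charset: the original text comes back unchanged.
        String right = decode(body, StandardCharsets.UTF_8);

        System.out.println("wrong decode length: " + wrong.length());
        System.out.println("right decode length: " + right.length());
        System.out.println("round trip intact:   " + right.equals(original));
    }
}
```

As far as the HttpClient 4.x documentation describes it, the charset passed to EntityUtils.toString is a default that applies when the response entity does not declare one, so this fix fits the case where the server sends UTF-8 without saying so.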

 