Urgent - Need an API to identify the character encoding


Are there any Java APIs that can identify the character encoding of content when the Content-Type header has no charset parameter? I need help urgently. I tried using NGramJ:



I used this, but CharsetToolKit only distinguishes between UTF-8, UTF-16LE and UTF-16; it does not detect other encodings such as TIS-620. I am new to this, so I am not sure whether I am doing it right. Please advise.
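For reference, the kind of check that separates UTF-8 and UTF-16 from everything else can be sketched with the JDK alone: look for a byte order mark, then test whether the bytes are valid UTF-8, and otherwise fall back to a charset the caller supplies. This is only a rough sketch, not a statistical detector like chardet, and it cannot tell TIS-620 apart from other single-byte encodings. The class name `CharsetGuess` and its methods are hypothetical names made up for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of NGramJ, chardet, or any library.
class CharsetGuess {

    /** Returns true if the bytes decode under the charset with no malformed or unmappable input. */
    static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /**
     * Guesses a charset: byte order mark first, then strict UTF-8 validation,
     * then the caller-supplied fallback (an assumption about the likely legacy encoding).
     */
    static Charset guess(byte[] data, Charset fallback) {
        if (data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;      // UTF-8 byte order mark
        }
        if (data.length >= 2) {
            int b0 = data[0] & 0xFF;
            int b1 = data[1] & 0xFF;
            if (b0 == 0xFE && b1 == 0xFF) {
                return StandardCharsets.UTF_16BE;
            }
            if (b0 == 0xFF && b1 == 0xFE) {
                return StandardCharsets.UTF_16LE;
            }
        }
        // Thai text in a single-byte encoding almost never forms valid UTF-8 sequences,
        // so a clean strict decode is a reasonable hint that the content really is UTF-8.
        if (decodesCleanly(data, StandardCharsets.UTF_8)) {
            return StandardCharsets.UTF_8;
        }
        return fallback;
    }
}
```

Pure ASCII content is reported as UTF-8 here, which is harmless because ASCII bytes are identical in both. Choosing the fallback is the hard part: it has to come from outside knowledge (for example, the sender's locale) or from a real statistical detector.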

Also, any samples using chardet would be appreciated.

One thing I am not sure about: when I send a message containing Thai characters from Hotmail, with my browser set to Thai encoding (TIS-620) but my Hotmail account language set to English, to one of my Exchange accounts, the message looks like gibberish in Outlook.

So I need a charset detector to tell me which encoding was used for the content, so that I can decode it and re-encode it as UTF-8. (If you choose English as the language option, the Hotmail server does not include a charset parameter in the Content-Type header.)
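Once the source encoding is known (or guessed), the decode and re-encode step itself is straightforward with `java.nio.charset`. A minimal sketch, assuming the raw body bytes are already in hand; the class name `Utf8Transcoder` and method `toUtf8` are hypothetical names for illustration:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class Utf8Transcoder {

    /**
     * Decodes the bytes strictly with the given source charset and re-encodes them as UTF-8.
     * Throws IllegalArgumentException if the bytes are not valid in the source charset,
     * which usually means the charset guess was wrong.
     */
    static byte[] toUtf8(byte[] data, Charset source) {
        try {
            CharBuffer chars = source.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return chars.toString().getBytes(StandardCharsets.UTF_8);
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("Input is not valid " + source.name(), e);
        }
    }
}
```

For the Thai case this would be called as `Utf8Transcoder.toUtf8(body, Charset.forName("TIS-620"))`. Whether TIS-620 is available depends on the Java runtime, so checking `Charset.isSupported("TIS-620")` first is a sensible precaution.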

Any immediate response would be appreciated.