• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
Sheriffs:
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
Bartenders:
  • Mikalai Zaikin

number of characters in UTF-8

 
Greenhorn
Posts: 11
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Hi,
I'm trying to find the number of chars (letters) in a word that is in UTF-8 format - not the number of bytes.
String.length() returns the length in UTF-16 format.
For most words the simple String.length() works, but for example:
String s = "文書の場合";
The length (s.length()) is 9.

I using the String constructer String(byte[]�bytes, String�charsetName), but the length method is still UTF-16.


Thanks
 
Greenhorn
Posts: 22
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Can you post the string as unicode escapes? (i.e. \uxxxx)

How about: toCharArray().length?
 
Rancher
Posts: 43081
77
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
This article is highly relevant to your question.
 
maya fur
Greenhorn
Posts: 11
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Thanks you very much for the article! It was very helpful!
 
Don't get me started about those stupid light bulbs.
reply
    Bookmark Topic Watch Topic
  • New Topic