
why the length of array is same?

vikas varshney
Greenhorn

Joined: Dec 10, 2011
Posts: 3



String s="I m the best";
byte [] b=s.getBytes();

When I checked the length of the byte array, it was the same as the length of the String.

But in a String each character takes 2 bytes, since it uses Unicode for its characters.
Why is that? Shouldn't it be double the String length?

Tim Moores
Rancher

Joined: Sep 21, 2011
Posts: 2408
There is no "Unicode" encoding. Unicode defines various encodings (like UTF-8, UTF-16, UTF-32 etc.). UTF-8 in particular (the most commonly found Unicode encoding) does NOT use 2 bytes for all characters. Specifically, for characters below 128 it is identical to ASCII, and will thus use a single byte for each character. (To confuse you further, UTF-8 can take between 1 and 4 bytes for a character. The original design allowed sequences of up to 6 bytes, but the encoding has since been restricted to the Unicode range.)

You should read this: http://www.joelonsoftware.com/articles/Unicode.html

If you use String.getBytes() without an argument, it probably does not use any of the Unicode encodings, though: it uses the platform default encoding. That could be Cp1252 (windows-1252), MacRoman, ISO-8859-1, UTF-8 or any of a number of other encodings. If you want to encode in UTF-8, call String.getBytes("UTF-8").
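
To make this concrete, here is a minimal sketch that encodes the string from the question with a few explicit charsets and prints the resulting byte counts. The class name EncodingLengths and the helper encodedLength are made up for illustration. It uses the getBytes(Charset) overload with the StandardCharsets constants (Java 7 and later), which avoids the checked UnsupportedEncodingException that getBytes("UTF-8") declares.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingLengths {

    // Hypothetical helper: number of bytes produced when s is encoded with cs.
    static int encodedLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        String s = "I m the best"; // 12 characters, all of them ASCII

        System.out.println("length()   : " + s.length());                                  // 12
        // The default charset depends on the platform and JVM settings.
        System.out.println("default    : " + s.getBytes().length
                + " (" + Charset.defaultCharset() + ")");
        System.out.println("UTF-8      : " + encodedLength(s, StandardCharsets.UTF_8));      // 12
        System.out.println("ISO-8859-1 : " + encodedLength(s, StandardCharsets.ISO_8859_1)); // 12
        System.out.println("UTF-16BE   : " + encodedLength(s, StandardCharsets.UTF_16BE));   // 24
        // Java's "UTF-16" encoder writes a 2-byte byte-order mark first.
        System.out.println("UTF-16     : " + encodedLength(s, StandardCharsets.UTF_16));     // 26
    }
}
```

Only the UTF-16 variants come out at double the String length (plus 2 bytes for the byte-order mark in the plain "UTF-16" case). Any ASCII-compatible charset gives one byte per character for this string, which is why the default encoding produced an array of the same length.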

"while in String each character takes 2 bytes as it uses Unicode for the character."

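Yes, a Java String is a sequence of 16-bit char values, which are UTF-16 code units. Most characters take one char (two bytes), while characters outside the Basic Multilingual Plane take two chars (four bytes). But there is no way of getting at the internal representation of any Java object, so that fact is moot.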
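
A small sketch of what that means for String.length(): it counts char values (UTF-16 code units), not bytes and not necessarily characters. The class and method names here are invented for the example.

```java
public class CharsVsCodePoints {

    // Number of 16-bit char values (UTF-16 code units); this is what length() returns.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points, i.e. actual characters.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String a = "A";               // U+0041, inside the Basic Multilingual Plane
        String clef = "\uD834\uDD1E"; // U+1D11E MUSICAL SYMBOL G CLEF, a surrogate pair

        System.out.println(codeUnits(a) + " / " + codePoints(a));       // 1 / 1
        System.out.println(codeUnits(clef) + " / " + codePoints(clef)); // 2 / 1
    }
}
```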
Joanne Neal
Rancher

Joined: Aug 05, 2005
Posts: 3742
    
Read the Javadoc for the String.length() method.


Joanne
vikas varshney
Greenhorn

Joined: Dec 10, 2011
Posts: 3
What if I use a String containing characters that cannot be represented in a single byte (for example, characters from another language)?
What will happen then?
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18992
    

Try it and see.
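
One way to try it, as a rough sketch: the string below is a Devanagari word written with Unicode escapes so the source file's own encoding does not matter. The class name NonAsciiLengths and the helper encodedLength are assumptions made for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class NonAsciiLengths {

    // Hypothetical helper: number of bytes produced when s is encoded with cs.
    static int encodedLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        // Six Devanagari characters (U+0928 U+092E U+0938 U+094D U+0924 U+0947).
        String s = "\u0928\u092E\u0938\u094D\u0924\u0947";

        System.out.println("length()   : " + s.length());                                  // 6
        System.out.println("UTF-8      : " + encodedLength(s, StandardCharsets.UTF_8));      // 18
        System.out.println("UTF-16BE   : " + encodedLength(s, StandardCharsets.UTF_16BE));   // 12
        // ISO-8859-1 cannot represent these characters; getBytes substitutes '?' for each.
        System.out.println("ISO-8859-1 : " + encodedLength(s, StandardCharsets.ISO_8859_1)); // 6
        // Result depends on the platform default charset.
        System.out.println("default    : " + s.getBytes().length
                + " (" + Charset.defaultCharset() + ")");
    }
}
```

So the byte count depends entirely on the encoding: three bytes per character in UTF-8 for this script, two in UTF-16, and silent data loss in a charset that cannot represent the characters at all.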
 