The moose likes Beginning Java and the fly likes getBytes() in String Big Moose Saloon

getBytes() in String

krishna nav
Greenhorn

Joined: May 25, 2006
Posts: 7
String s = "12345";
byte[] a = s.getBytes();

for (int i = 0; i < a.length; i++) {
    System.out.println(a[i]);
}

I am getting ASCII values as output, but according to the JavaDoc I should get 1 2 3 4 5 as output, right?
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24183
    

The byte[] will contain the character codes for the characters in the String, encoded with the platform's default encoding, which varies with the operating system and locale settings. On many installations you will indeed effectively get the ASCII codes for the characters 1, 2, 3, 4, and 5: 49, 50, 51, 52, 53. If you want the characters themselves, use toCharArray().
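To make the difference concrete, here is a minimal sketch comparing the two calls. The class and method names are made up for illustration, and it assumes Java 7 or later for StandardCharsets; the charset is passed explicitly so the result does not depend on the platform default.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BytesVersusChars {
    // Hypothetical helper: encodes the text as US-ASCII instead of the platform default.
    static byte[] asciiCodes(String text) {
        return text.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String s = "12345";
        // Numeric character codes, one byte per character.
        System.out.println(Arrays.toString(asciiCodes(s)));   // [49, 50, 51, 52, 53]
        // The characters themselves.
        System.out.println(Arrays.toString(s.toCharArray())); // [1, 2, 3, 4, 5]
    }
}
```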


Edwin Dalorzo
Ranch Hand

Joined: Dec 31, 2004
Posts: 961
You get different values from the getBytes() method depending on the character encoding in use. That is what getBytes() does: it represents each character as one or more bytes, depending on the encoding.

ASCII values fit in one byte, but other character sets, such as Unicode, contain far more than 256 characters. Hence an encoding for them has to represent some characters using more than one byte.

You can determine the current character encoding settings by means of the System class:

String currentEncoding = System.getProperty("file.encoding");

Or by means of the Charset class: Charset.defaultCharset()
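As a small sketch of both lookups side by side (the class and helper names are hypothetical, and the printed values depend on your platform, so no output is shown):

```java
import java.nio.charset.Charset;

public class ShowDefaultEncoding {
    // Hypothetical helper: canonical name of the JVM's default charset.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // The system property as seen by the running JVM; may be null on some setups.
        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        // The charset that getBytes() uses when no argument is given.
        System.out.println("defaultCharset = " + defaultCharsetName());
    }
}
```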

You can change the default encoding used by your application by setting this system property when you launch the application, for instance:

> java -Dfile.encoding=UTF-8
> java -Dfile.encoding=ASCII
> java -Dfile.encoding=UTF-16
> java -Dfile.encoding=Cp1252
> java -Dfile.encoding=Cp500

If you use UTF-16, for instance, most characters will occupy two bytes (characters outside the Basic Multilingual Plane take four), but if you use ASCII, every character will occupy just one byte.

The String class has a method getBytes(String charset) that lets you set the encoding used to generate the bytes.

Different encodings yield different numbers of bytes for the same string.
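The original listing did not survive in this post, so what follows is only a rough sketch of the idea, not the author's code. It assumes Java 7 or later and uses the getBytes(Charset) overload rather than getBytes(String) to avoid the checked UnsupportedEncodingException; the class and helper names are made up.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingLengths {
    // Hypothetical helper: number of bytes the text occupies in the given charset.
    static int byteCount(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String s = "12345";
        System.out.println(byteCount(s, StandardCharsets.US_ASCII)); // 5
        System.out.println(byteCount(s, StandardCharsets.UTF_8));    // 5
        // Java's "UTF-16" encoder writes a 2-byte byte-order mark first.
        System.out.println(byteCount(s, StandardCharsets.UTF_16));   // 12
        // The explicit big-endian variant writes no byte-order mark.
        System.out.println(byteCount(s, StandardCharsets.UTF_16BE)); // 10
    }
}
```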



Another option is to use the CharsetEncoder and CharsetDecoder classes.

If you use ASCII, the generated bytes correspond to the ASCII character codes. That means that if you assign every byte of the array to a char variable, you get the corresponding ASCII character back again.
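The listing that stood here is missing, so this is a minimal sketch of that round trip under the same assumptions as before (Java 7 or later, made-up class and helper names), not the original code.

```java
import java.nio.charset.StandardCharsets;

public class AsciiRoundTrip {
    // Hypothetical helper: casts each byte to a char, with no real decoding step.
    static String bytesToChars(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append((char) b);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] ascii = "12345".getBytes(StandardCharsets.US_ASCII);
        // Each ASCII byte maps straight back to its character.
        System.out.println(bytesToChars(ascii)); // 12345
    }
}
```

The cast only works because every ASCII code is also the Unicode code point of the same character; for real decoding, new String(bytes, charset) is the safer route.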



But the same loop will print unrecognizable characters if you use an encoding that is not ASCII-compatible, such as UTF-16 or Cp500.
[ May 26, 2006: Message edited by: Edwin Dalorzo ]
 