Byte -127 Question

 
Ranch Hand
Posts: 81
I am struggling to get a String representation of a 4-byte array that represents an Integer (written by a DataOutputStream). Whenever one of the bytes has the value -127 (hex 0x81) and I convert the byte array to a String using new String(byte[]), a subsequent call to String.getBytes() turns the -127 into 63.

I can share the complete code if needed, but I found I can reproduce this with a snippet as below

************************
byte[] bb = new byte[1];
bb[0] = -127;

String aString = new String(bb);

byte[] bb2 = aString.getBytes();
**************************

Inspecting the bb2[0] byte shows a 63 instead of -127.

Kind of an obscure question, I assume, but does anyone have any ideas on how to solve this?

Thanks in advance for any assistance, much appreciated.
 
Ranch Hand
Posts: 423
The String(byte[]) constructor and the getBytes() method map between byte values and Unicode characters using the default (platform) charset.
Look at the API first: http://java.sun.com/j2se/1.4.2/docs/api/java/lang/String.html#String(byte[])
For String(byte[]) it says:

The behavior of this constructor when the given bytes are not valid in the default charset is unspecified.


For getBytes() it says:

The behavior of this method when this string cannot be encoded in the default charset is unspecified.



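Then check what the default charset is on your JVM, for example by printing java.nio.charset.Charset.defaultCharset() (Java 5 and later) or the file.encoding system property: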
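A minimal sketch of such a check, assuming Java 5 or later for Charset.defaultCharset(). The class and method names are only illustrative:

```java
import java.nio.charset.Charset;

class ShowDefaultCharset {
    // Name of the charset that new String(byte[]) and getBytes() use
    // when no charset is passed explicitly.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println("Default charset: " + defaultCharsetName());
        // The file.encoding system property usually reflects the same setting.
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));
    }
}
```

The output depends on the operating system, locale, and JVM options, so it can differ between machines.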


On my computer (Windows XP, code page 1252) the default charset is UTF-8.
If you look at this encoding (http://en.wikipedia.org/wiki/UTF-8), you will see that byte values 192-193 and 245-255 never appear in valid UTF-8, and values 128-191 are continuation bytes that are valid only after a lead byte.
Only values in the range 0-127 are "green" (allowed on their own); the rest have special meaning.
Byte 0x81 is a continuation byte, so a single 0x81 with no lead byte in front of it is invalid too.

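If a byte value has no valid representation in your charset, these functions cannot map
it to a Unicode character and will give strange results. In practice the decoder usually substitutes the replacement character U+FFFD, and getBytes() then writes '?' (value 63) when the target charset cannot encode that character, which matches the 63 you are seeing.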
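To make the substitution visible, here is a small sketch that passes charsets explicitly instead of relying on the platform default. It assumes Java 7 or later for StandardCharsets, and the class name is only illustrative:

```java
import java.nio.charset.StandardCharsets;

class ReplacementDemo {
    // Decodes a lone 0x81 as UTF-8. The byte is malformed on its own,
    // so the decoder substitutes the replacement character U+FFFD.
    static String decodeLoneByte() {
        byte[] bb = { (byte) 0x81 };   // same bit pattern as -127
        return new String(bb, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = decodeLoneByte();
        System.out.println(Integer.toHexString(s.charAt(0)));   // fffd

        // US-ASCII cannot encode U+FFFD, so getBytes() writes '?' instead.
        byte[] ascii = s.getBytes(StandardCharsets.US_ASCII);
        System.out.println(ascii[0]);                            // 63
    }
}
```

Once the original byte has been replaced during decoding, no later call can recover it from the String.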


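You get better results by passing an explicit single-byte charset such as ISO-8859-1 to both the String constructor and getBytes(), because it maps every byte value 0-255 to exactly one character and back.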
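A minimal sketch of a lossless round trip, assuming an explicit ISO-8859-1 charset is acceptable (this is one possible approach, not necessarily the code originally posted here). It assumes Java 7 or later for StandardCharsets, and the class name is only illustrative:

```java
import java.nio.charset.StandardCharsets;

class Latin1RoundTrip {
    // ISO-8859-1 maps each byte value 0x00-0xFF to exactly one character,
    // so converting to a String and back loses nothing.
    static byte[] roundTrip(byte[] in) {
        String s = new String(in, StandardCharsets.ISO_8859_1);
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] bb = { -127 };
        byte[] bb2 = roundTrip(bb);
        System.out.println(bb2[0]);   // -127
    }
}
```

The intermediate String is not meaningful text, only a carrier for the bytes. If the goal is just to read the integer back, java.io.DataInputStream.readInt() avoids the String step altogether.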
 
Brian Mozhdehi
Ranch Hand
Posts: 81
Thank you for your help, very much appreciated. That makes sense.
 