
Mysterious reading

 
Don Redd
Ranch Hand
Hi,
I am trying to understand the behavior of System.in.read(byte[]) in the code below.



CONSOLE:
Ð
1
-48---11111111111111111111111111010000
Ð



Ð is a character with the Unicode value U+0189, and its 2-byte (16-bit) representation is 0000 0001 1000 1001.
The output on line 2 says that System.in.read has read 1 byte from the console (standard input).
Question 1: which byte of Ð does it read, 0x01 or 0x89?
Question 2: why does it print -48 on line 3?
Also, I couldn't understand how String constructed the 2-byte character Ð even though it only read one byte.



 
Winston Gutkowski
Bartender
Don Redd wrote: Ð is a character with the Unicode value U+0189, and its 2-byte (16-bit) representation is 0000 0001 1000 1001.
The output on line 2 says that System.in.read has read 1 byte from the console (standard input).
Question 1: which byte of Ð does it read, 0x01 or 0x89?
Question 2: why does it print -48 on line 3?
Also, I couldn't understand how String constructed the 2-byte character Ð even though it only read one byte.

Ooof. Where to start?

First, you have several misconceptions going on:
1. You are NOT reading in Unicode, or even characters, you are reading in bytes.
2. It's highly unlikely that your system console uses any multibyte character set at all. In fact, I'd care to bet that it uses Windows-1252.
3. Ð in Windows-1252 is code 208 (decimal, 0xD0), NOT 0189.
4. Java bytes are signed, so they only hold values from -128 to 127.
5. What do you get when you try to print out the value of Byte.valueOf((byte)208)?
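
For anyone who wants to try points 3 to 5 without a console, here is a minimal sketch. It is not the original poster's code: the class and method names are made up for illustration, and it uses ISO-8859-1 as a stand-in for Windows-1252 because every JVM is guaranteed to have it and the two charsets agree on Ð (both map it to 0xD0).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative, not from the original post.
public class EthByteDemo {

    // Encode one character with a single-byte charset and return that byte.
    // Assumption: ISO-8859-1 stands in for Windows-1252 (same code for this letter).
    static byte encodeSingleByte(char c) {
        byte[] encoded = String.valueOf(c).getBytes(StandardCharsets.ISO_8859_1);
        return encoded[0];
    }

    // Reinterpret a signed Java byte as its unsigned value (0 to 255).
    static int unsignedValue(byte b) {
        return b & 0xFF;
    }

    // Decode a single byte back into a String using the same charset.
    static String decodeSingleByte(byte b) {
        return new String(new byte[] { b }, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        char eth = '\u00D0';
        byte b = encodeSingleByte(eth);

        System.out.println(b);                          // -48, the signed view of 208
        System.out.println(unsignedValue(b));           // 208
        System.out.println(Byte.valueOf((byte) 208));   // -48 again (point 5)

        // Widening the byte to int sign-extends it, which is where the long
        // run of leading 1 bits in the binary output comes from.
        System.out.println(Integer.toBinaryString(b));
        System.out.println(Integer.toBinaryString(unsignedValue(b))); // 11010000

        // One byte is enough to rebuild the character with a single-byte charset.
        System.out.println(decodeSingleByte(b).equals(String.valueOf(eth))); // true
    }
}
```

The sketch compares strings instead of printing Ð directly, since what a console displays depends on its own encoding.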

HIH

Winston
 