# Bits & Bytes

Sam Samson
Ranch Hand
Posts: 63
Hi

I'm trying to understand how a Byte/Short/Integer/Long is represented in binary.

Normally 1111 1111 1111 1111 1111 1111 1111 1100 stands for 4294967292, but in Java it's -4. Because of something called two's complement, right?
How can I calculate this? Do I have to do 1111 1111 1111 1111 1111 1111 1111 1100 - binary of 1 and then invert the bits? Can someone explain to me how I can add/subtract bits?

I've found this:
http://geekexplains.blogspot.com/2009/05/binary-rep-of-negative-numbers-in-java.html
But I don't understand step #3

edit:

Is this right?

Byte 0000 0100 = 4
Build 1s complement = 1111 1011
Build 2s complement by adding 1, in binary 0000 0001

So we have:

1111 1011 + 0000 0001 = 1111 1100

If every number, positive and negative, is signed by the leftmost bit and the rest of the bits represent the value using two's complement notation, why is int i = 4 not the two's complement of 0000 0100? If 0000 0100 is already the 2s complement, then the "normal" value would be 0000 0100 - 0000 0001 = 0000 0011 --> inverted = 1111 1100, but that is 252. I'm totally confused.

How is '4' really represented in Java as binary? 0000 0100 or the 2s complement of this?

greez
Sam

Stephan van Hulst
Bartender
Posts: 5432
52
In two's complement, the negative of a value is calculated by first inverting all the bits, and then adding 1.

So for a byte value 4 (0000 0100), the negative is the one's complement (1111 1011) and then add 1 (1111 1100): -4.
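
The invert-then-add-1 rule can be checked directly in Java. This is only a minimal sketch: the class name `NegateDemo` and the helpers `negate` and `bits` are made up for illustration and are not part of any library.

```java
class NegateDemo {
    // Two's complement negation: invert every bit, then add 1.
    static byte negate(byte value) {
        return (byte) (~value + 1);
    }

    // Render a byte as exactly 8 binary digits (illustrative helper).
    static String bits(byte value) {
        String raw = Integer.toBinaryString(value & 0xFF);
        return String.format("%8s", raw).replace(' ', '0');
    }

    public static void main(String[] args) {
        byte four = 4;
        System.out.println(bits(four));          // 00000100
        System.out.println(bits((byte) ~four));  // 11111011 (one's complement)
        System.out.println(bits(negate(four)));  // 11111100
        System.out.println(negate(four));        // -4
    }
}
```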

Subtraction is actually performed by *adding* the two's complement of the subtrahend to the minuend. For instance, let's subtract 5 from 14:
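
The worked example from the original post is not preserved here, so the following is a reconstruction under that assumption rather than the author's exact working. In 8 bits, 14 is 0000 1110 and 5 is 0000 0101; the two's complement of 5 is 1111 1011; adding gives 1 0000 1001, and dropping the carry leaves 0000 1001, which is 9. The class and method names below are hypothetical.

```java
class SubtractDemo {
    // Subtract by adding the two's complement of the subtrahend, in 8 bits.
    static byte subtract8(int minuend, int subtrahend) {
        int complement = (~subtrahend + 1) & 0xFF; // invert, add 1, keep 8 bits
        int sum = minuend + complement;            // plain addition; may carry into bit 8
        return (byte) sum;                         // narrowing keeps the low 8 bits, dropping the carry
    }

    public static void main(String[] args) {
        // 0000 1110 + 1111 1011 = 1 0000 1001, carry dropped: 0000 1001
        System.out.println(subtract8(14, 5)); // 9
        System.out.println(subtract8(5, 14)); // -9
    }
}
```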

Campbell Ritchie
Sheriff
Posts: 48454
56
There have been at least two threads discussing formats of integer numbers very recently: 1 2.

And welcome to the Ranch

Sam Samson
Ranch Hand
Posts: 63
Ok, now it's clear how I calculate the 1s and 2s complement. But how does Java store the byte 'b' internally:

as 0000 0100, or as its 2s complement, which would be 1111 1100? But 1111 1100 is negative. Or is the 2s complement calculated without the leftmost bit?

The output of the sysout is 100, and that's not the 2s complement.

Stephan van Hulst
Bartender
Posts: 5432
52
When we say "Java uses two's complement binary representation", it doesn't mean Java always takes the two's complement of every number. It means positive numbers are stored as usual in the binary system, and negative numbers are stored as the two's complement of their positive counterpart.

4 is always stored as 0000 0100, and -4 is always stored as 1111 1100.

Stephan van Hulst
Bartender
Posts: 5432
52
To be clear, my posts are in shorthand. An int -4 is actually stored as 11111111 11111111 11111111 11111100, but that's a bit cumbersome to explain. (byte) -4, however, is stored as 1111 1100.
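
To see both widths side by side, something like the sketch below should do. The helper names `intBits` and `byteBits` are invented for this example; the `& 0xFF` mask is needed because a byte is widened to int (with sign extension) before `Integer.toBinaryString` sees it.

```java
class StorageDemo {
    // All 32 bits of an int, padded with leading zeros.
    static String intBits(int value) {
        return String.format("%32s", Integer.toBinaryString(value)).replace(' ', '0');
    }

    // The 8 bits of a byte; the mask strips the sign extension added by widening.
    static String byteBits(byte value) {
        return String.format("%8s", Integer.toBinaryString(value & 0xFF)).replace(' ', '0');
    }

    public static void main(String[] args) {
        System.out.println(intBits(4));          // 00000000000000000000000000000100
        System.out.println(intBits(-4));         // 11111111111111111111111111111100
        System.out.println(byteBits((byte) 4));  // 00000100
        System.out.println(byteBits((byte) -4)); // 11111100
    }
}
```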

Campbell Ritchie
Sheriff
Posts: 48454
56
Welcome to the Ranch

Did you see that I pointed out in those other threads that in two’s complement exactly half the numbers available are negative? 0000_0100 is one of those which are not negative. When you call the binary string method, it misses out the leading 0s, returning "100".
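
Both points can be checked with a few lines. This is a rough sketch with made-up names (`HalfNegativeDemo`, `countNegativeBytes`): it walks all 256 eight-bit patterns and counts how many come out negative as a byte, then shows the missing leading zeros.

```java
class HalfNegativeDemo {
    // Count how many of the 256 possible 8-bit patterns are negative as a byte.
    static int countNegativeBytes() {
        int negatives = 0;
        for (int pattern = 0; pattern < 256; pattern++) {
            if ((byte) pattern < 0) {
                negatives++;
            }
        }
        return negatives;
    }

    public static void main(String[] args) {
        System.out.println(countNegativeBytes());       // 128, exactly half of 256
        System.out.println(Integer.toBinaryString(4));  // 100, leading zeros omitted
    }
}
```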

Sam Samson
Ranch Hand
Posts: 63
Stephan van Hulst wrote:When we say "Java uses two's complement binary representation", it doesn't mean Java always takes the two's complement of every number. It means positive numbers are stored as usual in the binary system, and negative numbers are stored as the two's complement of their positive counterpart.

4 is always stored as 0000 0100, and -4 is always stored as 1111 1100.

Aha, that's very interesting. My SCJP book says:
All six number types in Java are made up of a certain number of 8-bit bytes, and are signed, meaning they can be negative or positive. The leftmost bit is used to represent the sign, where a 1 means negative and 0 means positive. The rest of the bits represent the value, using two's complement notation.

There's no hint that only negative values are represented using two's complement notation.

Why isn't a byte -4 stored as 1000 0100? Why is it easier(?) to build the 2s complement rather than just flipping the leftmost bit of a positive binary number to get its negative?

@CR:

Stephan van Hulst
Bartender
Posts: 5432
52
The book is wrong, or at least misleading. The entire number is written in two's complement representation, not just the remaining bits.

Note that a consequence of two's complement is that *all* negative numbers have the sign bit set, and *all* non-negative numbers have the sign bit cleared. So regardless of whether you simply flip the first bit, or use two's complement, you will always be able to tell the sign of the value by just looking at the first bit.

An advantage of storing values this way is that there is only one representation for 0. If you flip only the sign bit for negative values, there are two representations for 0: 0000 0000 and 1000 0000 (0 and -0 respectively), which really are the same value. In two's complement notation, you can represent one more value instead: -128 (for byte values).
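
Here is a small comparison of the two schemes. Note that `decodeSignMagnitude` is a hypothetical decoder written only for this illustration; Java itself never interprets a byte that way. `decodeTwosComplement` simply relies on the narrowing cast to byte.

```java
class ZeroDemo {
    // Hypothetical flip-the-sign-bit decoding of an 8-bit pattern (NOT what Java does).
    static int decodeSignMagnitude(int pattern) {
        int magnitude = pattern & 0x7F;              // low 7 bits hold the value
        return (pattern & 0x80) != 0 ? -magnitude : magnitude;
    }

    // Two's complement decoding: this is what a cast to byte gives you.
    static int decodeTwosComplement(int pattern) {
        return (byte) pattern;
    }

    public static void main(String[] args) {
        System.out.println(decodeSignMagnitude(0b0000_0000));  // 0
        System.out.println(decodeSignMagnitude(0b1000_0000));  // 0 again (the "-0" pattern)
        System.out.println(decodeTwosComplement(0b1000_0000)); // -128
        System.out.println(decodeTwosComplement(0b1111_1100)); // -4
    }
}
```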

A second, and even bigger advantage is that you can subtract values in the way I showed you earlier, by just adding the complement of the subtrahend to the minuend. So you don't have to create a completely different operation for subtraction, you can just reuse the addition. This is what most processors do internally as well.

Sam Samson
Ranch Hand
Posts: 63
Thanks a lot for your explanations

Campbell Ritchie
Sheriff
Posts: 48454
56
Another mistake in the book. There are not six primitive number types, but seven. The seventh is called char. Unfortunately that link is not behaving well at the moment.
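
A quick sketch of char behaving as a number, assuming nothing beyond the standard `Character` constants (the class name `CharDemo` and the helper `next` are invented for the example). Unlike the other integer types, char is unsigned and 16 bits wide.

```java
class CharDemo {
    // Arithmetic on a char: the operand is promoted to int, so cast back.
    static char next(char c) {
        return (char) (c + 1);
    }

    public static void main(String[] args) {
        char a = 'A';
        int code = a;                                   // widening, no cast needed
        System.out.println(code);                       // 65
        System.out.println(next(a));                    // B
        System.out.println((int) Character.MIN_VALUE);  // 0
        System.out.println((int) Character.MAX_VALUE);  // 65535
    }
}
```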

Campbell Ritchie
Sheriff
Posts: 48454
56
Stephan van Hulst wrote:. . . If you flip only the sign bit for negative values, there are two representations for 0 . . .
That is called S&M (sign and magnitude) format.

Sam Samson
Ranch Hand
Posts: 63
Ok, but why does this throw a NumberFormatException?

There are 32 '1', the first should be the sign-bit, shouldn't it?
Ah nice, valueOf() looks for a '-' in the String, so it doesn't look at the sign bit as described in the method's Javadoc. That can't be; I must be doing something wrong.

Henry Wong
author
Marshal
Posts: 20907
76
Sam Samson wrote:
Ah nice, valueOf() looks for a '-' in the String, so it doesn't look at the sign bit as described in the method's Javadoc. That can't be; I must be doing something wrong.

May we ask... where in the Javadoc does it describe this method as processing the sign bit?

Henry

Campbell Ritchie
Sheriff
Posts: 48454
56
You need to look back at valueOf() and see what it can parse. It refers back to parseInt. Now you can try a binary literal. You see, all integer literals are unsigned, so you can pass those 32 1s, which are taken directly into memory, and when you print the value, you get -1.
Similarly hex literals are unsigned, so you can pass 0xffffffff and also get -1.
Similarly decimal literals are unsigned, but they are restricted to the range 0 to 2147483648 (and 2147483648 itself is only allowed as the operand of unary minus). If you try to parse your 32 1s, you get 4294967295, which is outside the permissible range for an int. What you can try is parsing the String as a long and casting the result to int. Obviously without the (int) cast you get 4294967295.
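
The code snippets from this post are not preserved in the thread, so the following is a guess at what they demonstrated, not the original code. The class `ParseDemo` and its helpers are hypothetical; `Integer.parseInt(String, int)` and `Long.parseLong(String, int)` are the standard library methods.

```java
class ParseDemo {
    // Build a String of n '1' characters (avoids miscounting 32 digits by hand).
    static String ones(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append('1');
        }
        return sb.toString();
    }

    // parseInt treats the digits as a magnitude, so 32 ones overflow the int range.
    static boolean parseIntFails(String binary) {
        try {
            Integer.parseInt(binary, 2);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    // Parse as a long, then keep only the low 32 bits with an (int) cast.
    static int viaLong(String binary) {
        return (int) Long.parseLong(binary, 2);
    }

    public static void main(String[] args) {
        String s = ones(32);
        System.out.println(parseIntFails(s));                          // true
        System.out.println(Long.parseLong(s, 2));                      // 4294967295
        System.out.println(viaLong(s));                                // -1
        System.out.println(0b11111111_11111111_11111111_11111111);     // -1 (binary literal)
        System.out.println(0xffffffff);                                // -1 (hex literal)
    }
}
```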