
# Floating Point Arithmetic: Help with denormalized numbers

Edwin Dalorzo
Ranch Hand

Joined: Dec 31, 2004
Posts: 961
I have been working hard to understand IEEE 754, the standard the Java Virtual Machine follows for floating point numbers and operations.

First I need to explain what I am trying to do so that you can follow my question. Please be patient with me... I assure you that, if you do not already understand this, you may find it as interesting as I do.

First of all, let's convert a floating point number to its binary representation:

Let's use a simple float number: 84.75 (for all calculations I will use the Java float data type).

1. Ok, the number 84 in base 10 is equal to 1010100 in base 2. It's a simple conversion and I know most of you know how to do it.

84 = 42 x 2 + 0
84 = (21 x 2 + 0) x 2 + 0
84 = ((10 x 2 + 1) x 2 + 0) x 2 + 0
84 = (((5 x 2 + 0) x 2 + 1) x 2 + 0) x 2 + 0
84 = ((((2 x 2 + 1) x 2 + 0) x 2 + 1) x 2 + 0) x 2 + 0
84 = (((((1 x 2 + 0) x 2 + 1) x 2 + 0) x 2 + 1) x 2 + 0) x 2 + 0
84 (base 10) = 1010100 (base 2), reading the leading 1 and then the remainders from the innermost to the outermost
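The repeated division above can be sketched in a few lines of Java. This is only an illustration: the class name IntToBinary and the method toBinary are made-up names, and the sketch assumes a non-negative int.

```java
class IntToBinary {
    // Repeated division by 2: each remainder is the next bit,
    // least significant first. Assumes n is non-negative.
    static String toBinary(int n) {
        if (n == 0) {
            return "0";
        }
        StringBuilder bits = new StringBuilder();
        while (n > 0) {
            bits.append(n % 2); // remainder of this division step
            n /= 2;             // the quotient feeds the next step
        }
        return bits.reverse().toString(); // remainders are read last to first
    }

    public static void main(String[] args) {
        System.out.println(toBinary(84));               // prints 1010100
        System.out.println(Integer.toBinaryString(84)); // JDK equivalent, same output
    }
}
```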

2. Now the number 0.75 expressed in base 2 is 0.11. I know you know how to do it, so forgive me for insisting on writing out the procedure. It just helps me make everything clear. Multiply the fraction by 2 repeatedly; the integer part of each result is the next binary digit:

0.75 * 2 = 1.5
0.5 * 2 = 1.0

You can check this by evaluating 1 x 2^-1 + 1 x 2^-2 = 0.5 + 0.25 = 0.75
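The repeated multiplication for the fractional part can be sketched the same way. The names FractionToBinary and toBinaryFraction are hypothetical, the input is assumed to satisfy 0 <= f < 1, and maxDigits is an assumed cut-off because many decimal fractions (0.1, for instance) never terminate in base 2.

```java
class FractionToBinary {
    // Repeated multiplication by 2: the integer part of each product
    // (0 or 1) is the next binary digit after the point.
    // Assumes 0 <= f < 1; stops after maxDigits digits.
    static String toBinaryFraction(double f, int maxDigits) {
        if (f == 0) {
            return "0.0";
        }
        StringBuilder digits = new StringBuilder("0.");
        for (int i = 0; i < maxDigits && f > 0; i++) {
            f *= 2;
            if (f >= 1) {
                digits.append('1');
                f -= 1; // keep only the fractional part for the next step
            } else {
                digits.append('0');
            }
        }
        return digits.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBinaryFraction(0.75, 23)); // prints 0.11
    }
}
```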

3. So 84.75 in base 10 is equal to 1010100.11 in base 2.

4. Now if in base 10 we can express a number in scientific notation

84.75 is equal to 84.75 x 10^0, which is equal to 8.475 x 10^1

5. Then we can also say that...

1010100.11 is equal to 1010100.11 x 2^0, which is equal to 1.01010011 x 2^6

6. Now the difficult part to explain is the anatomy of an IEEE 754 floating point number. It looks somewhat like this:

[ Sign [31] Exponent [23-30] Mantissa [00-22] ]

7. This means that the lowest 23 bits (bits 0 to 22) are the fraction. For example:

In 8.475 x 10^1 the fraction is .475.

8. The next 8 bits are the exponent.

In the number 8.475 x 10^1 the exponent is 1.

However, in our floating point number the exponent is 6, as you can see above (1.01010011 x 2^6).

9. The most significant bit represents the sign (1 means negative)

10. So our exponent is 6. However, since the exponent field has to be able to express both negative and positive exponents, the standard says that the exponent is biased by 127. That means that you must add 127 to your exponent. This way stored values above 127 mean a positive exponent, and stored values below 127 mean a negative exponent.

11. That means that our exponent should be 127+6, that is 133 which in binary format is 10000101

12. So our floating point number is: 0 10000101 01010011000000000000000

13. As you can see, it is formed by a positive sign (bit 31 is clear), then the 8 bits of the exponent (6+127, that is to say 133, or 10000101), and then the 23 bits of the fraction, padded with 0s on the right (01010011000000000000000).
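The three fields can also be pulled out of a float with shifts and masks. The following is just a sketch: FloatFields and its helper methods are illustrative names, while Float.floatToIntBits and Integer.toBinaryString are the standard JDK calls.

```java
class FloatFields {
    // Bit 31: the sign.
    static int sign(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Bits 23-30: the exponent as stored, i.e. with the bias of 127 added.
    static int biasedExponent(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // Bits 0-22: the 23 fraction bits.
    static int fraction(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    // Left-pads a binary string with zeros to the given width.
    static String pad(String bits, int width) {
        return String.format("%" + width + "s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        float f = 84.75f;
        System.out.println(sign(f));                                             // 0
        System.out.println(pad(Integer.toBinaryString(biasedExponent(f)), 8));   // 10000101
        System.out.println(pad(Integer.toBinaryString(fraction(f)), 23));        // 01010011000000000000000
        System.out.println(biasedExponent(f) - 127);                             // 6
    }
}
```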

14. This number expressed in hexadecimal is 0100 0010 1010 1001 1000 0000 0000 0000 = 4 2 A 9 8 0 0 0

15. We can test this in Java by means of this code:
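A minimal sketch of such a check, not necessarily the code originally posted here: it uses the standard Float.floatToIntBits and Float.intBitsToFloat methods, and the class name FloatBitsCheck and helper hexBits are made up for the example.

```java
class FloatBitsCheck {
    // Returns the raw IEEE 754 bit pattern of f as a hex string.
    static String hexBits(float f) {
        return Integer.toHexString(Float.floatToIntBits(f));
    }

    public static void main(String[] args) {
        // Float to bits: should match the pattern worked out by hand.
        System.out.println(hexBits(84.75f));                   // prints 42a98000

        // Bits back to float: the round trip recovers the number.
        System.out.println(Float.intBitsToFloat(0x42A98000));  // prints 84.75
    }
}
```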

16. Now, based on this it is very simple to understand the IEEE 754 special values:

17. Positive zero and negative zero both have an exponent field of zero and a fraction (mantissa) field of zero; only the sign bit differs.

For example:

Positive zero = 0000 0000 0000 0000 0000 0000 0000 0000 = 0x0
Negative zero = 1000 0000 0000 0000 0000 0000 0000 0000 = 0x80000000
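As a quick sketch, the two zero patterns can be inspected from Java. SignedZeros and hexBits are illustrative names; the rest is the standard Float API.

```java
class SignedZeros {
    // Raw bit pattern as 8 hex digits.
    static String hexBits(float f) {
        return String.format("%08x", Float.floatToIntBits(f));
    }

    public static void main(String[] args) {
        System.out.println(hexBits(0.0f));   // 00000000
        System.out.println(hexBits(-0.0f));  // 80000000

        // The two zeros compare equal with ==, although their bits differ.
        System.out.println(0.0f == -0.0f);               // true
        // Float.compare does distinguish them and orders -0.0 before 0.0.
        System.out.println(Float.compare(0.0f, -0.0f));  // 1
        // The sign of zero shows up when dividing by it.
        System.out.println(1.0f / -0.0f);                // -Infinity
    }
}
```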

18. Positive and negative infinity have an exponent of all 1s and a fraction (mantissa) of all 0s; the sign bit selects which one.

For example:

+Infinity = 0111 1111 1000 0000 0000 0000 0000 0000 = 0x7f800000
-Infinity = 1111 1111 1000 0000 0000 0000 0000 0000 = 0xff800000
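The infinity patterns can be checked against the constants the JDK provides. In this sketch the class name Infinities and the helper hexBits are assumptions for illustration only.

```java
class Infinities {
    // Raw bit pattern as 8 hex digits.
    static String hexBits(float f) {
        return String.format("%08x", Float.floatToIntBits(f));
    }

    public static void main(String[] args) {
        System.out.println(hexBits(Float.POSITIVE_INFINITY)); // 7f800000
        System.out.println(hexBits(Float.NEGATIVE_INFINITY)); // ff800000

        // Two ordinary ways of arriving at infinity:
        System.out.println(1.0f / 0.0f);          // Infinity (division by zero)
        System.out.println(Float.MAX_VALUE * 2);  // Infinity (overflow)
    }
}
```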

19. NaN (Not-a-Number): an exponent of all 1s and a non-zero fraction (mantissa).

Quiet NaN: most significant fraction bit set (it propagates through operations without raising an exception)
Signaling NaN: most significant fraction bit clear (it is meant to raise an invalid-operation exception when used)

QNaN = 0111 1111 1100 0000 0000 0000 0000 0000 = 0x7fc00000
SNaN = 0111 1111 1010 0000 0000 0000 0000 0000 = 0x7fa00000
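Here is a small sketch of the NaN rule as bit tests. NaNPatterns, isNaNPattern and isQuiet are hypothetical names. Note that Float.floatToIntBits collapses every NaN to the canonical value 0x7fc00000, and the JDK documentation warns that a signaling NaN may not survive a trip through a float variable on every platform, so the sketch tests the int patterns directly.

```java
class NaNPatterns {
    // NaN: exponent field all 1s and a non-zero fraction.
    static boolean isNaNPattern(int bits) {
        int exponent = (bits >>> 23) & 0xFF;
        int fraction = bits & 0x7FFFFF;
        return exponent == 0xFF && fraction != 0;
    }

    // Quiet NaN: the most significant fraction bit (bit 22) is set.
    static boolean isQuiet(int bits) {
        return isNaNPattern(bits) && (bits & 0x00400000) != 0;
    }

    public static void main(String[] args) {
        System.out.println(isQuiet(0x7fc00000)); // true  (quiet NaN)
        System.out.println(isQuiet(0x7fa00000)); // false (signaling NaN)

        // Both patterns are NaN as far as Java is concerned.
        System.out.println(Float.isNaN(Float.intBitsToFloat(0x7fa00000))); // true

        // floatToIntBits reports any NaN as the canonical 0x7fc00000.
        float nan = 0.0f / 0.0f;
        System.out.println(Integer.toHexString(Float.floatToIntBits(nan))); // 7fc00000
        System.out.println(nan != nan); // true: NaN is not equal to itself
    }
}
```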

20. However, there is another kind of special number, the denormalized numbers. And this is where my question comes in.

21. Denormalized numbers have an exponent of all 0s and a non-zero fraction.

22. For example, in Java the float 5.877472E-39f is a denormalized number.
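That this float really has an all-zero exponent field can be confirmed with a short sketch. Subnormals and isSubnormal are illustrative names; Float.MIN_NORMAL and Float.MIN_VALUE are the standard constants for the smallest normalized and smallest denormalized positive float.

```java
class Subnormals {
    // Denormalized: exponent field all 0s and a non-zero fraction.
    static boolean isSubnormal(float f) {
        int bits = Float.floatToIntBits(f);
        int exponent = (bits >>> 23) & 0xFF;
        int fraction = bits & 0x7FFFFF;
        return exponent == 0 && fraction != 0;
    }

    public static void main(String[] args) {
        float f = 5.877472E-39f;
        System.out.println(String.format("%08x", Float.floatToIntBits(f))); // 00400000
        System.out.println(isSubnormal(f));                 // true
        System.out.println(f < Float.MIN_NORMAL);           // true
        System.out.println(isSubnormal(Float.MIN_NORMAL));  // false: smallest normalized float
        System.out.println(isSubnormal(Float.MIN_VALUE));   // true: smallest denormalized float
        System.out.println(isSubnormal(0.0f));              // false: zero has a zero fraction
    }
}
```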

23. But what are they for? How do I convert a base 10 floating point number to this format? How do I convert it back? And when does the JVM use this kind of number?

24. I know that this level of detail will most probably not appear in the SCJP 1.4 or 1.5. However, I am not taking the exam just for the certification; I really want to understand everything very well.

Does anyone know the answer, or can anyone help me find it?

Thanks in advance, you all are great!
[ January 18, 2005: Message edited by: Edwin Dalorzo ]
marc weber
Sheriff

Joined: Aug 31, 2004
Posts: 11343

Excellent question!

A value in binary scientific notation is 1.xxx... times 2 raised to some power. That is, the first non-zero digit is always "1". So in a normalized mode, the non-fractional "1" before the mantissa is not stored as part of the value -- it's implied. This is sometimes called the "hidden bit." (You can see this in your example above. The significand bits only include the "fractional" portion -- the mantissa.)

But a denormalized value has no implicit "1" before the mantissa. Instead, a denormalized (or "subnormal") value is understood to be 0.xxx... times 2 raised to some power. This provides an extended range of very small numbers, and it comes at the expense of gradually losing precision as the first "1" bit moves farther to the right (leaving less room for significant figures).
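To watch the precision drain away, here is a rough sketch that keeps halving the smallest normalized float. GradualUnderflow and significantBits are made-up names, and significantBits is only meaningful for denormalized values, where it counts the bits from the first "1" in the fraction field down to bit 0.

```java
class GradualUnderflow {
    // For a denormalized float: how many fraction bits remain from the
    // leading 1 down to bit 0. Returns 0 when the fraction is zero.
    static int significantBits(float f) {
        int fraction = Float.floatToIntBits(f) & 0x7FFFFF;
        return 32 - Integer.numberOfLeadingZeros(fraction);
    }

    public static void main(String[] args) {
        float f = Float.MIN_NORMAL; // 2^-126, the smallest normalized float
        while (f > 0) {
            f /= 2; // each halving shifts the leading 1 one place to the right
            System.out.println(f + " has " + significantBits(f) + " significant bits");
        }
        // The loop starts at 23 bits for 2^-127 and ends when
        // Float.MIN_VALUE / 2 finally rounds to zero.
    }
}
```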

In general terms we have...

Normalized (with a "hidden" bit in the significand):
(-1)^(sign bit) * 2^(exponent - bias) * (1 + fractional mantissa)

Denormalized (with no "hidden" bit):
(-1)^(sign bit) * 2^(-bias + 1) * (fractional mantissa)
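The two formulas can be turned into a small decoder and compared against what the JVM itself reports. This is a sketch under stated assumptions: DecodeFloat and decode are illustrative names, the bias is 127 as for a Java float, and bit patterns with an exponent field of all 1s (infinities and NaN) are deliberately not handled. Math.scalb(x, n) is the standard JDK method that computes x * 2^n.

```java
class DecodeFloat {
    // Decodes a 32-bit IEEE 754 pattern using the two formulas above.
    // Not valid for exponent field 255 (infinity and NaN).
    static double decode(int bits) {
        int sign = bits >>> 31;
        int exponent = (bits >>> 23) & 0xFF;
        int fraction = bits & 0x7FFFFF;

        double mantissa = fraction / (double) (1 << 23); // 0.xxx... as a value in [0, 1)
        double signFactor = (sign == 0) ? 1.0 : -1.0;

        if (exponent == 0) {
            // Denormalized (or zero): no hidden bit, fixed exponent of -bias + 1 = -126.
            return signFactor * Math.scalb(mantissa, -126);
        }
        // Normalized: hidden bit restored, exponent with the bias removed.
        return signFactor * Math.scalb(1 + mantissa, exponent - 127);
    }

    public static void main(String[] args) {
        System.out.println(decode(0x42A98000)); // 84.75
        // A denormalized pattern: compare with the JVM's own conversion.
        System.out.println(decode(0x00400000) == Float.intBitsToFloat(0x00400000)); // true
    }
}
```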

Ref:
http://en.wikipedia.org/wiki/Floating-point (see esp. "Hidden bit")
http://en.wikipedia.org/wiki/Denormal
http://babbage.cs.qc.edu/courses/cs341/IEEE-754references.html

(These details are definitely not on the SCJP exam.)
[ January 19, 2005: Message edited by: marc weber ]

"We're kind of on the level of crossword puzzle writers... And no one ever goes to them and gives them an award." ~Joe Strummer
sscce.org
marc weber
Sheriff

Joined: Aug 31, 2004
Posts: 11343

A couple more links...

Here's a friendly site on IEEE 754:
http://www.public.iastate.edu/~sarita/ieee754/homepage.html

And here's a good technical source:
http://docs.sun.com/source/806-3568/ncg_goldberg.html
[ January 19, 2005: Message edited by: marc weber ]
