Big Moose Saloon

Assigning long value to float?

May Pat
Ranch Hand

Joined: Jul 01, 2002
Posts: 32
Could anyone explain why we don't need an explicit cast when assigning a long value to a float, given that a long is 64 bits and a float is only 32 bits?
Thank you.

May P.
Dan Chisholm
Ranch Hand

Joined: Jul 02, 2002
Posts: 1865
Section 5.1.2 of the Java Language Specification has the following to say:
Conversion of an int or a long value to float, or of a long value to double, may result in loss of precision - that is, the result may lose some of the least significant bits of the value. In this case, the resulting floating-point value will be a correctly rounded version of the integer value, using IEEE 754 round-to-nearest mode (§4.2.4).
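To make the quoted rule concrete, here is a small sketch of the two conversions it mentions. The class and method names are made up for illustration; the test values are simply the first integers beyond what a 24-bit (float) or 53-bit (double) significand can hold exactly.

```java
// Hypothetical demo class; the names are illustrative only.
class WideningPrecisionDemo {

    // Difference between an int and its value after the implicit int -> float conversion.
    static long intToFloatError(int value) {
        float f = value;                 // widening conversion, no cast required
        return (long) value - (long) f;  // cast back to compare as integers
    }

    // Difference between a long and its value after the implicit long -> double conversion.
    static long longToDoubleError(long value) {
        double d = value;                // widening conversion, no cast required
        return value - (long) d;
    }

    public static void main(String[] args) {
        System.out.println(intToFloatError(16_777_216));        // 2^24 fits exactly: prints 0
        System.out.println(intToFloatError(16_777_217));        // 2^24 + 1 is rounded: prints 1
        System.out.println(longToDoubleError((1L << 53) + 1));  // 2^53 + 1 is rounded: prints 1
    }
}
```

In both rounded cases the value is halfway between two representable neighbours, so round-to-nearest picks the one with the even significand, which is the lower one here.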

[ August 07, 2002: Message edited by: Dan Chisholm ]

Dan Chisholm, SCJP 1.4
Try my mock exam: http://www.danchisholm.net/
Valentin Crettaz
Gold Digger
Sheriff

Joined: Aug 26, 2001
Posts: 7610
May,
an important thing to remember when a long is converted to a float (a widening conversion) is that there is no loss of magnitude, although there may be a loss of precision due to the rounding of the least significant bits. Losing some precision is regarded as less important than losing orders of magnitude, which is why this conversion does not require a cast. What we want to keep is the magnitude of the long (reflected in the exponent of the float) and as much of its precision as will fit (reflected in the significand of the float).
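A rough way to see the difference between losing precision and losing magnitude is to compare the implicit long-to-float conversion with a narrowing cast from long to int, which does require a cast. The class name, method names and sample value below are only illustrative.

```java
// Hypothetical demo class; the names and the sample value are illustrative only.
class MagnitudeVsPrecision {

    // Relative error introduced by the implicit long -> float conversion.
    // Intended for values below 2^53, which a double holds exactly.
    static double relativeError(long value) {
        float f = value;  // widening conversion, no cast required
        return Math.abs(((double) f - (double) value) / (double) value);
    }

    // Narrowing conversion: keeps only the low 32 bits, so the magnitude can be lost.
    static int narrowed(long value) {
        return (int) value;  // explicit cast required by the compiler
    }

    public static void main(String[] args) {
        long value = 1_234_567_890_123L;
        System.out.println(relativeError(value));  // tiny: bounded by 2^-24 for a float
        System.out.println(narrowed(value));       // a completely different number
    }
}
```

The float is off by at most about one part in 16 million, while the int result has nothing to do with the original value.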
According to IEEE 754, a float is represented as the following bit pattern:
0 00000000 00000000000000000000000
bit 0 : sign bit
bits 1-8 : exponent (stores the magnitude in powers of 2)
bits 9-31 : significand (stores the precision)
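The three fields can be pulled out of a real float with Float.floatToIntBits. This is just a sketch with made-up class and method names. Note that the layout above counts bit 0 from the left (most significant) end, and that the stored exponent is biased by 127.

```java
// Hypothetical helper class; the names are illustrative only.
class FloatBits {

    // Leftmost bit: 0 for positive, 1 for negative.
    static int sign(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Next 8 bits: the exponent as stored, i.e. the power of two plus a bias of 127.
    static int biasedExponent(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // Last 23 bits: the fraction part of the significand (the leading 1 is implicit).
    static int fraction(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    // The 32 bits as text, grouped as sign, exponent, significand.
    static String layout(float f) {
        String bits = String.format("%32s", Integer.toBinaryString(Float.floatToIntBits(f)))
                .replace(' ', '0');
        return bits.substring(0, 1) + " " + bits.substring(1, 9) + " " + bits.substring(9);
    }

    public static void main(String[] args) {
        float f = 1234567890L;  // sample value, implicitly converted from long
        System.out.println(layout(f));                // 0 10011101 00100110010110000000110
        System.out.println(biasedExponent(f) - 127);  // 30, i.e. the magnitude is 2^30
    }
}
```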
For instance, let's take
long big = 1234567890L;
float f = big;
System.out.println(big - (long)f);
Output: -46
Let's decompose the long and represent it as a float. The largest power of two that does not exceed "big" is 2^30 (1,073,741,824); this is its magnitude. Subtracting it from "big" leaves the part that carries the precision: 1,234,567,890 - 1,073,741,824 = 160,826,066.
160,826,066 has to be stored in the 23 bits of the significand, but its binary representation is 1001 10010110 00000010 11010010, which takes 28 bits, and it sits in the 30 bit positions below 2^30. Only the top 23 of those 30 positions fit, so the lowest 7 bits are rounded away and the float lands on the nearest multiple of 2^7 = 128, which is 1,234,567,936. That accounts for the -46 above, while the overall magnitude (2^30) is kept.
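The same decomposition can be checked in code. This is only a sketch with made-up names; it relies on Long.highestOneBit for the power of two and on Math.ulp for the gap between neighbouring floats.

```java
// Hypothetical demo class; the names are illustrative only.
class LongToFloatRounding {

    // Largest power of two that does not exceed a positive value.
    static long magnitude(long value) {
        return Long.highestOneBit(value);
    }

    // Gap between adjacent floats in the neighbourhood of the value.
    static long spacing(long value) {
        return (long) Math.ulp((float) value);
    }

    // The low bits that do not fit into the significand.
    static long droppedBits(long value) {
        return value & (spacing(value) - 1);
    }

    // The value the float actually holds after the conversion.
    static long nearestFloat(long value) {
        return (long) (float) value;
    }

    public static void main(String[] args) {
        long big = 1234567890L;
        System.out.println(magnitude(big));           // 1073741824 (2^30)
        System.out.println(big - magnitude(big));     // 160826066
        System.out.println(spacing(big));             // 128 (2^7)
        System.out.println(droppedBits(big));         // 82, more than half of 128, so round up
        System.out.println(nearestFloat(big));        // 1234567936
        System.out.println(big - nearestFloat(big));  // -46
    }
}
```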
[ August 07, 2002: Message edited by: Valentin Crettaz ]

SCJP 5, SCJD, SCBCD, SCWCD, SCDJWS, IBM XML
Bindesh Vijayan
Ranch Hand

Joined: Aug 21, 2001
Posts: 34
Thanks Val,
Where can I get more information on floating point numbers and their representation in Java?
Valentin Crettaz
Gold Digger
Sheriff

Joined: Aug 26, 2001
Posts: 7610

