Originally posted by Francis Palattao: still a bit confused how float has a longer range.
Can you explain by using bits, because I see long should be longer range because it has 64 bits while float is only 32 bits.
Take this floating point number: 1.2345e55
This reads 1.2345 times 10 to the 55th power. A floating point number can hold a value like this compactly because it stores the significant digits (12345) as bits, and the exponent (55) as bits.
Unfortunately longs do not store values this way. This value will not fit into a long because it is 56 digits long (basically, 12345 followed by 51 zeros), while the largest long has only 19 digits.
On the plus side, longs have full precision within their range. Floats do not. While floating point numbers can get really really big -- they are generally full of zeros when they get big. In the previous example, there were 51 zeros.
Oops... sorry, I think I picked an example that was too big even for a float (the largest float is about 3.4e38, so 1.2345e55 would need a double), but you should get the point.
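If it helps to see the range difference in running code, here is a rough sketch. The class and method names (RangeDemo, fitsInFloatRange, fitsInLongRange) are made up for this illustration and are not part of any library. It only checks whether a magnitude falls inside each type's range and says nothing about precision.

```java
class RangeDemo {
    // The example value from the post. It exceeds Float.MAX_VALUE, so it is held in a double here.
    static final double EXAMPLE = 1.2345e55;

    // True if the magnitude is no larger than the largest finite float (about 3.4e38).
    static boolean fitsInFloatRange(double value) {
        return Math.abs(value) <= Float.MAX_VALUE;
    }

    // True if the value lies in [-2^63, 2^63), the range covered by long.
    static boolean fitsInLongRange(double value) {
        return value >= -0x1p63 && value < 0x1p63;
    }

    public static void main(String[] args) {
        System.out.println("Long.MAX_VALUE  = " + Long.MAX_VALUE);
        System.out.println("Float.MAX_VALUE = " + Float.MAX_VALUE);
        System.out.println("1.2345e55 in float range? " + fitsInFloatRange(EXAMPLE));
        System.out.println("3e38 in float range?      " + fitsInFloatRange(3e38));
        System.out.println("3e38 in long range?       " + fitsInLongRange(3e38));
    }
}
```

A value such as 3e38 is within float's range but far outside long's, which is the point of the example above.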
You need to know the difference between precision and range.
pi is 3.14 with a precision of 3 significant figures. pi is 3.14159 to 6 significant figures.
The speed of light is 300,000 kilometers/second to 1 significant figure. There are 6 digits in that number, 5 of which are not significant.
In scientific notation, the speed of light is 3 * 10^5 kilometers/second.
Avogadro's number, roughly the number of atoms in a gram of hydrogen, is 6.022 * 10^23. That is not an exact number; it is accurate to 4 significant figures. To use this number in a computer, we store the fraction (6.022) and the exponent (23) as separate binary numbers within a 32 or 64 bit memory word.
Since some of a 32-bit float variable is used for the exponent, the fraction, which holds the significant figures, has fewer bits than a 32-bit int.
As a result, a float can be very large, as in 3 with 38 zeros after it, or very small, as in .000<38 more zeros>00003, but it won't be very precise: it holds only about 7 significant decimal digits.
A long can't be nearly that big, but whatever value it has is accurate to the nearest integer.
So a float has a much bigger range than a long, but much less precision. And Java only looks at range in deciding if a conversion is permitted without an explicit cast.
From a casting and conversion perspective, a 32-bit float is considered "wider" than a 64-bit long. No explicit cast is needed to convert a long to a float because the value fits within range. However, the programmer must consider potential loss of precision whenever working with floating point values.
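A minimal sketch of that rule, with made-up names (WideningDemo, widen, narrow) chosen just for this post: long to float compiles without a cast, float to long does not, and the round trip can silently change the value.

```java
class WideningDemo {
    // long -> float is a widening primitive conversion: no cast is needed.
    static float widen(long value) {
        return value;
    }

    // float -> long is a narrowing conversion: the explicit cast is required.
    static long narrow(float value) {
        return (long) value;
    }

    public static void main(String[] args) {
        long original = 16_777_217L; // 2^24 + 1, which needs 25 significant bits
        float asFloat = widen(original);
        long back = narrow(asFloat);
        System.out.println(original + " -> " + asFloat + " -> " + back);
        System.out.println("changed by round trip? " + (original != back));
    }
}
```

The value 2^24 + 1 was picked because a float carries 24 significant bits, so it is the first positive whole number that a float cannot hold exactly.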
Under IEEE (Institute of Electrical and Electronics Engineers) 754 standards, floating point numbers are stored as binary fractions and exponents rather than decimals. A 32-bit float is stored as 1 sign bit, 8 exponent bits, and 23 mantissa bits. A 64-bit double is stored as 1 sign bit, 11 exponent bits, and 52 mantissa bits. (To prevent the exponent from being stored as a negative number, a constant "bias" is added to the actual value: floats have a bias of 127, and doubles have a bias of 1023.)
This provides a "sliding window" of precision appropriate to the scale of the number. However, if a value can't be represented in terms of binary fractions (i.e., a summation of powers of 2) within this window, then it loses precision.
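To look at the three fields of a float directly, something like the following should work. Float.floatToIntBits is a standard library method; the class FloatBits and its helper names are assumptions made for this sketch. The unbiased exponent is only meaningful for normal numbers (not zero, subnormals, infinity, or NaN).

```java
class FloatBits {
    // Bit 31: the sign (0 = positive, 1 = negative).
    static int sign(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Bits 30..23: the exponent as stored, with the bias of 127 included.
    static int biasedExponent(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // The actual power of 2 for a normal float: stored exponent minus the bias.
    static int exponent(float f) {
        return biasedExponent(f) - 127;
    }

    // Bits 22..0: the 23 stored mantissa bits (the leading 1 of a normal float is not stored).
    static int mantissa(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    public static void main(String[] args) {
        float f = 6.5f; // 1.101 (binary) times 2^2
        System.out.println("sign            = " + sign(f));
        System.out.println("stored exponent = " + biasedExponent(f));
        System.out.println("actual exponent = " + exponent(f));
        System.out.println("mantissa bits   = " + Integer.toBinaryString(mantissa(f)));
    }
}
```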
The code below illustrates loss of precision. Three literal longs are automatically converted (through assignment) to type float. But when these are explicitly cast back to type long, the first two quantities -- which originally differed by 274877906943 -- are equal. And the third quantity -- which was originally only 1 less than the second -- now differs from its original value by 274877906943.
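The listing itself is not shown here, so what follows is a reconstruction that matches the description rather than the original code. It assumes the first value is Long.MAX_VALUE and derives the other two from the differences quoted above (the original used three literals). The names LossOfPrecision, GAP, and roundTrip are chosen for this sketch.

```java
class LossOfPrecision {
    // The difference quoted in the post: 2^38 - 1.
    static final long GAP = 274877906943L;

    // Converts a long to float by plain assignment, then casts it back to long.
    static long roundTrip(long value) {
        float asFloat = value;  // implicit conversion, no cast needed
        return (long) asFloat;  // explicit cast required to go back
    }

    public static void main(String[] args) {
        long a = Long.MAX_VALUE; // assumed starting value
        long b = a - GAP;        // differs from a by 274877906943
        long c = b - 1;          // only 1 less than b

        System.out.println("a: " + a + " -> " + roundTrip(a));
        System.out.println("b: " + b + " -> " + roundTrip(b));
        System.out.println("c: " + c + " -> " + roundTrip(c));
        System.out.println("a and b equal after round trip? " + (roundTrip(a) == roundTrip(b)));
        System.out.println("c changed by: " + (c - roundTrip(c)));
    }
}
```

Near 2^63 the representable floats are 2^39 apart, so a and b both round to the same float while c falls just on the other side of the halfway point and rounds down.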
Upon re-reading my response above, I see that I provided a lot of information that probably isn't very helpful. Let me try this...
First, consider that whenever we are working with floating point numbers, we are going to have to accept approximations. The reason is that many values have an infinite decimal representation -- either with a repeating pattern (for example, 1/11 = 0.090909... or 1/3 = 0.3333...), or with an irrational, non-repeating pattern (for example, pi = 3.14159...). From a practical standpoint, we have to cut these representations off somewhere; and as soon as we do, we have an approximation. Or, in other words, we lose precision.
So, for the sake of a simple illustration, let's say that we decide to cut them off at the 3rd decimal place (without rounding). That is, we store 1/3 as 0.333, and 1/11 as 0.090, and pi as 3.141. None of these values are exact anymore, but we can easily store them. We've traded precision for (some degree of) practicality.
Under this standard (assuming a decimal point after the first digit), our range is only 0.000 to 9.999. So to increase range, let's add just a few more digits and use scientific notation. For simplicity, we'll use base-10 (although in a computer, this would be binary). Now, when we store the 7-digit number 1234567, we'll understand this to mean 1.234 x 10^567. Suddenly, we've greatly expanded our range. But the trade-off is in precision: Most of these digits are just place-holding zeros (implied by the exponent) to convey magnitude rather than an exact quantity.
This is great for really big numbers, but what about really small numbers? Well, suppose we agree that this 3-digit exponent will automatically have a "bias" of 500 built into it. In other words, we'll always subtract 500 from whatever value is stored. So if we want an exponent of 234, we'll store 734. Why is this helpful? Because this allows us to imply negative exponents. If we store an exponent value of 123, then subtracting 500 will give us -377. Recall that a negative exponent will "move" the decimal to the left, so now we can represent extremely small numbers, as well as extremely large numbers.
We'll add one more refinement: A new digit at the beginning to indicate sign, with 0 indicating positive and 1 indicating negative.
So now in a simple 8-digit representation, we can store numbers as small as (+/-) 1.000 x 10^(-500) or as large as (+/-) 9.999 x 10^499. So we've got an enormous range to work with -- far more than what we would have with any simple 8-digit representation of a whole number -- BUT our precision is limited to those 4 digits that aren't the sign or the exponent.
Those 4 digits represent our "window" of precision -- the only place where we know the values are exact. Depending on the exponent, this window can "slide" either far to the left of the decimal point to imply very large quantities, or slide far to the right of the decimal point to imply very small quantities. But we're always going to have approximations unless the non-zero digits of our value can "fit" in that window of precision.
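The toy 8-digit format described above (1 sign digit, 4 digits of precision, 3 exponent digits with a bias of 500) can be decoded with a few lines of Java. This is only a sketch of the made-up decimal format from this post, not of IEEE 754 itself, and the names ToyDecimalFloat and decode are invented for it. BigDecimal is used because the exponents go far beyond what a double can hold.

```java
import java.math.BigDecimal;

class ToyDecimalFloat {
    static final int BIAS = 500; // subtracted from the stored exponent, as described above

    // Decodes 8 decimal digits: sign (0 or 1), 4 significant digits, 3 biased exponent digits.
    static BigDecimal decode(String digits) {
        if (!digits.matches("[01]\\d{7}")) {
            throw new IllegalArgumentException("expected a 0 or 1 followed by 7 digits: " + digits);
        }
        boolean negative = digits.charAt(0) == '1';
        int window = Integer.parseInt(digits.substring(1, 5));     // 1234 stands for 1.234
        int exponent = Integer.parseInt(digits.substring(5)) - BIAS;

        // 1.234 x 10^e equals 1234 x 10^(e - 3); BigDecimal's scale is the negated power of ten.
        BigDecimal value = BigDecimal.valueOf(window, 3 - exponent);
        return negative ? value.negate() : value;
    }

    public static void main(String[] args) {
        System.out.println(decode("01234502").toPlainString()); // +1.234 x 10^2
        System.out.println(decode("11234497").toPlainString()); // -1.234 x 10^-3
        System.out.println(decode("09999999"));                 // largest value: 9.999 x 10^499
    }
}
```

Whatever the exponent, only the four digits in the window are known exactly; everything else in the printed value is place-holding zeros.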
These are some of the basic ideas behind IEEE 754 standards for floating point numbers. The actual implementation is more complex, but hopefully this illustrates the trade-off between range and precision. [ December 18, 2004: Message edited by: marc weber ]