Win a copy of Think Java: How to Think Like a Computer Scientist this week in the Java in General forum!

# Large Numbers

Jignesh Malavia
Author
Ranch Hand
Posts: 81
The largest number that the long data type (64 bits) can handle is 19 digit Long.MAX_VALUE. The largest number that the double data type (64 bits) can handle is much larger Double.MAX_VALUE but loses its precision. Write a program that can add and subtract arbitrary large numbers, say 50 digits to 1000 digits long.
Conditions:
1. The most important condition is to use as little memory as possible
2. Try to design such that the performance does not degrade drastically with increase in the number of digits
3. The numbers can be non-integral with a period as in example #2 below
4. Michael and Jim cannot start coding before next 24 hours ;-)
Examples
------------------------------------------------------
Input:
123456789123456789123456789123456789
+
876543210876543210876543210876543211
Output:
1000000000000000000000000000000000000

------------------------------------------------------
Input:
111111111222222222.333333333444444444
+
777.777
Output:
111111111222223000.110333333444444444
------------------------------------------------------
Input:
333333333333333333.333333333333333333
-
244444444444444444.444444444444444444
Output:
088888888888888888.888888888888888889

David Weitzman
Ranch Hand
Posts: 1365
This seems to fit legally within the conditions... Specifically, I haven't checked the libraries I've used to see if they use the most memory-efficient techniques. I trust them to do a good job, though. I also haven't tried compiling or running this code. Here goes:
Usage:
java Adder [first number] [second number]

Jim Yingst
Wanderer
Sheriff
Posts: 18671
Well, BigDecimal is designed to do a lot of different operations, and as a result it has some extra instance variables which are not really necessary if the objective is just to add and subtract decimal numbers of finite (but arbitrarily large) length. (Well, the variables are really in a BigInteger which is contained in BigDecimal.) Also BigDecimal is designed as immutable, creating a new object every time you add or subtract. That does't seem to be a requirement here. So to minimize memory usage, you can do better than BigDecimal. I think that the problem will probably be more fun, too, if you try to code without looking at the BigDecimal source.
Jignesh - when we minimize memory used, you're talking about memory for instance data, right? If we use more memory in local variables (i.e. temporarily while adding/subtracting) that doesn't count, does it? I'm not sure yet (since I haven't started coding) but it seems that the best scheme I can come up with to minimize instance data might require additional memory for temp variables, so I want to make sure this requirement is clear.
4. Michael and Jim cannot start coding before next 24 hours ;-)
In fact I'm about to hit the road to go to JavaOne, so you probably won't hear much else from me until much later. And Michael is busy today with his daughter's wedding. So it will be up to someone else to challenge David for supremacy here. Have fun, everyone.
[ June 08, 2003: Message edited by: Jim Yingst ]

Jignesh Malavia
Author
Ranch Hand
Posts: 81
David :-)
I should have said- if you had to design a library that provides features like that provided by the BigDecimal class how would you do it so that it is efficient memory/time wise?.
Jignesh - when we minimize memory used, we minimize the instance data, right? If we use more memory in local variables (i.e. temporarily while adding/subtracting) that doesn't count, does it?
Well, to be honest, I did not really think about memory in terms of instance verses local data because i did not even think of the problem as an OO class. All I was thinking was about a method that accepts two Strings and returns a String and whether we use memory from heap or from stack, how should we keep the total amount of memory consumed by the process to a minimum when that method is executing? But I see your point. We can start with using minimum memory for representation (instance data) and later work on the usage of memory for the operations (local data).
Some approaches- if we use a char[] to represent individual digits as chars, then a 50 digit number occupies 100 bytes. If we use a byte[], then it would use 50 bytes. So how about using packed Binary Coded Decimals (BCD) to pack two digits in a single byte. That would require only 25 bytes. But in this case, the arithmatic operations will involve more memory and calculations. Another approach would be to use an array of longs to represent large numbers. E.g an array such as "new long[2]" can be viewed as a 128 bit value, and "new long[3]" can represent 192 bits, and so on. The operations + and - will probably be easy to implement but converting from decimal String to 128 bits binary and vice versa might require some thinking.
Anyways, have a nice time at JavaOne.

David Weitzman
Ranch Hand
Posts: 1365
If we're assuming a single method where the input and output are both strings in base 10, than converting them to a more space efficient form doesn't change the fact that you still needed the memory for the char array/String. In fact, I'd guess that it's probably slower to convert them to a different form, do the addition, and convert them back.
On the other hand having a class named something like BigSummableDecimal that stored numbers in a compact form would make sense. You would be able to do several additions in sequence before it was necessary to convert back to a String for display or serialize it quite efficiently.
If I have a few moments late tonight I'll see if I can work out a solution that's a bit more like what you were looking for

Michael Morris
Ranch Hand
Posts: 3451
Ok, this thing is not nearly as easy as it sounds. At least I haven't figured out an easy way to do it. I first started out using a char[] for storage and was going to use 9 complement but that really started to get out of hand. I finally decided on a boolean array to hold the value, but even that started growing hair and warts. There are a lot of kludges in this code and I cheated a little by using regexes and the BigInteger class to transform to and from binary. I meant to do my own translator for that, but I'm growing very weary of this. I have a job you know! Anyway here it is and please feel free to critique it. I'm not sure that it is 100%, so if you see a problem (perhaps a boundary condition I missed) then let me know.

Here are the results:

Jignesh Malavia
Author
Ranch Hand
Posts: 81
Michael, the code looks pretty good. I haven't tried running it yet (will do it later and let you know) but it seems you have put in lots of efforts there. Just for the sake of discussion, here's a minor thing I noticed. In the complementValueArray() method, why are you using the decrement() method and its 'borrow' logic for negative numbers? Whether it is positive or negative, the same complimenting code should work. Or am I missing something there? You can also get rid of the increment() method because incrementing by 1 is same as "Start from right most bit and go left flipping over all the 1 bits to 0 till you find the first 0 bit. Flip that 0 bit to 1 and stop there." Combining this increment logic with the 1's complment, we can do the 2's compliment as
Step1: Start from right most bit and go left "without" flipping over all the 0 bits till you find the first 1 bit.
Step2: Keep that first 1 bit as it is. Do not flip it.
Step3: Continue going left and flip the rest of the bits.

Rest of the code is done really well- the matching decimal point, adding, padding, and matching the bits, etc. Two Thumbs Up!
David: If we're assuming a single method where the input and output are both strings in base 10, than converting them to a more space efficient form doesn't change the fact that you still needed the memory for the char array/String.
True. :-) I forgot that part.

Michael Morris
Ranch Hand
Posts: 3451
Thanks for the critique Jignesh. This was my third attempt, as I said I first considered using a char[] array and 9's complement for negatives, then looked into double-packed BCDs in a byte array and then finally decided that a pure binary solution might be easiest. I kludged thru a lot of the code. It's been a while since doing any binary math so I just threw the increment and decrement methods together. If I had taken the time to do a truth table (or remembered a little more from the 70s) I might have come up with a more efficient complement method.
I would like to see someone try using a long[] array to do this. It wouldn't be a whole lot differnt my code but might be more efficient since there would be fewer iterations for complementing, etc.

Jim Yingst
Wanderer
Sheriff
Posts: 18671
Congratulations, M^2. Yeah, the more I thought about it the more work it seemed like the whole thing would take. Gald someone else did it. One thing I note is that a boolean[] array is actually quite inefficient - on my machine (Win XP using SDK 1.4.2) it basically uses one byte per bit. Other array types use about what you'd expect, but booleans are not at all efficient. I'd suggest using a BitSet instead (I think). Or an int[] or long[] as you mentioned - I was leaning towards int[] because that made it a bit easier to work with overflows, as you could capture intermediate results in longs and then decide what to do with them. I doubt I'll actually get around to implementing it though. Again, well done.

Michael Morris
Ranch Hand
Posts: 3451
Another possibility for this would be to extend the IEEE 754 FP standard. That might be the simplest soloution of all. That really didn't occur to me until I read the chapter on that subject in Java Number Cruncher by Ronald Mak. I might look into it if I can find some time.

Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
Can I just use REXX?
I wonder how NetRexx does it. NetRexx translates REXX to Java then compiles, lets you mix syntax from both languages.

Ranch Hand
Posts: 925
just to go off in a tanget.
I was playing around with turning 10^n into binary using the BigInteger class. you get patterns appearing in the sequence of 0/1 as a function of n & power of 2.
10 = 1010
100 = 1100100
1000 = 1111101000
etc, the pattern starts of something like:
last n digits are zero,
n+1 digit = 1
n+2 digit = 0
n+3 digit = n >> 2 % 2
n+4 digit = n >> 3 % 2 ^ (n % 8) == 1 (???)
after that there is repetition, so for n+5 this pattern is repeated every 16n, n+6 - 32n, n+7 64n, n+8 128n, n+9 256n.
It's really made me start scratching my head to come up with a formula for the "mth binary digit of 10^n (base10)", and I'm determined to figure it out for myself (rather than google it).
I know someone came up with a similar formula for pi some years ago, & am starting to realise how hard it must have been!!

Jim Yingst
Wanderer
Sheriff
Posts: 18671
NetRexx: dunno, but I'm going to guess they just use BigDecimal.
IEEE 754: wouldn't that lead to roundoff errors? I think you're going to have a hard time getting answers exactly like Jignesh originally specified. The alternative of using a binary integer combined with a base-10 exponent (i.e. what M^2 did, and what I had planned to do, and what BigDecimal does internally) seems like the most memory-efficient choice that also gives exact answers for all decimal numbers of finite length. Methinks there must be an IEEE standard somewhere which describes this sort of format, as it seems a natural choice for exact decimal calculations.

Michael Morris
Ranch Hand
Posts: 3451
Originally posted by Jim Yingst:
NetRexx: dunno, but I'm going to guess they just use BigDecimal.
IEEE 754: wouldn't that lead to roundoff errors? I think you're going to have a hard time getting answers exactly like Jignesh originally specified. The alternative of using a binary integer combined with a base-10 exponent (i.e. what M^2 did, and what I had planned to do, and what BigDecimal does internally) seems like the most memory-efficient choice that also gives exact answers for all decimal numbers of finite length. Methinks there must be an IEEE standard somewhere which describes this sort of format, as it seems a natural choice for exact decimal calculations.

Would it be possible to extend IEEE 754 such that you calculate the number of bits required for both significand and exponent to get an exact result? Or would you always have numbers that fall between representations regardless of bit size? I really haven't done any serious calcultaions on it, just popped into my mind.

Jim Yingst
Wanderer
Sheriff
Posts: 18671
There will always be some decimal numbers that fall between representions, for IEEE 754-based representations. I suppose we could increase the precision dynamically to ensure that the calculations are done with more accuracy than is required for a given problem - e.g. if 8 digits past the decimal are required, then make sure that any error is in the 9th or better yet 10th digit. Then make sure you round the final answer accordingly so the user doesn't see that 9th or 10th digit. Offhand though I think it's more work to make sure that you're doing it just right that way, and you probably end up using a little more memory too.

Michael Dunn
Ranch Hand
Posts: 4632
Here's one, just for a laugh, using strings - so it's probably very inefficient.
It's not fully tested, needs error handling and additional code if negative numbers are entered, but it does seem to work OK with the samples given.
(also needs a bit of polish - e.g. I wasn't sure what the regex code for a . (dot) was, so I used a workaround)

Yahya Elyasse
Ranch Hand
Posts: 510
Hi guys,
I found the code written by Micheal very helpfull..
In fact i'm trying to solve this same problem (very large numbers arithmethics) but only for binary numbers.
one two important methods i want to write is shiftLeft() & shiftRight() for very large binaries.
can some one give me a clue on how to implement this ?
I would appreciate a lot some code .

thanks for helping.