This week's book giveaway is in the OO, Patterns, UML and Refactoring forum. We're giving away four copies of Refactoring for Software Design Smells: Managing Technical Debt and have Girish Suryanarayana, Ganesh Samarthyam & Tushar Sharma on-line! See this thread for details.

Hi,
I am developing a application, which has some int manipulations.
Thing is i am getting input values as 31958379002001000, 31958379002001000, 31898379002001000, 31828379002002000, 31828379002002000 etc.
they are not fit in int. but i must use int only.

so, I want to generate a distinct unique number for each of the big input number shown above and display them in response in some how. and the resultant number should again be fit in int size.

sai srinivas jonnalagadda wrote:that is right. but I need a good approach of generating unique numbers.
i don't want to increment values from 0 blindly.

Why not! They are numbers and they are unique. There is obviously more to your requirement then you have said.

Why?? As long as you increment every time you add you will always have a unique id. Database' often use this scheme for IDs

Hunter

"If the facts don't fit the theory, get new facts" --Albert Einstein

sai srinivas jonnalagadda
Greenhorn

Joined: Jul 27, 2011
Posts: 9

posted

0

problem is i am doing some database operations with those numbers. the maximum limit for that column is 2power32 -1.. i cannot change the data type for that column
so, I must convert them uniquely into int sized values, so that i can insert them in tables.
I want a algorithm kind of thing which generates random unique number based on the digits of the input number.

sai srinivas jonnalagadda wrote:
problem is i am doing some database operations with those numbers. the maximum limit for that column is 2power32 -1.. i cannot change the data type for that column
so, I must convert them uniquely into int sized values, so that i can insert them in tables.
I want a algorithm kind of thing which generates random unique number based on the digits of the input number.

There are an infinite number of large numbers and only 2^32 ints so you cannot possibly make the mapping unique just by simple manipulation. Since you already have a database then why not just use a database table mapping the big integers to the small integers. You can use auto-increment when inserting the big numbers.

sai srinivas jonnalagadda
Greenhorn

Joined: Jul 27, 2011
Posts: 9

posted

0

no james.
because this logic is needed at many places where DB operations are not there.
I found one solution.
divide the input number by some big number (assumed to be maximum value of int) and then concat both quotient and reminder which is definitely be a unique number.
but the problem is that the new generated number itself is might be a big number larger that int.
so i stuck over here.
i want similar kind of algorthims

sai srinivas jonnalagadda wrote:no james.
because this logic is needed at many places where DB operations are not there.
I found one solution.
divide the input number by some big number (assumed to be maximum value of int) and then concat both quotient and reminder which is definitely be a unique number.

How can it be? There are an infinite number of big numbers and only 2^32 ints! You cannot make any simple manipulations of a big number that will create a unique int!

sai srinivas jonnalagadda
Greenhorn

Joined: Jul 27, 2011
Posts: 9

posted

0

you are right..
as long as the input number is same.. that combination remains same.
that is that my problem. the resultant number can be again a big number.

You need to have a mapping from something that potentially has an infinite number of values to something that has a limited (2^32 - 1) number of values. There is absolutely no way you can get an algorithm that guarantees uniqueness. Eventually you will get a duplicate.

That said, the auto-increment from 0 (or 1, as databases do) is going to be as good as it gets, as long as you have proper synchronization. Using pessimistic locking it's really easy:
Now you won't run into any problems until you encounter more than 2^32 values*, but given the limited number of allowed IDs you'll run into that problem no matter what algorithm you choose.

* Theoretically that is. In practice you'll run out of memory long before that happens with this solution.

What are the possible input values? are they only in a certain range, or can they be any size/length? with they always be 18 characters long (or however long those values are)?

It is not clear to me on my scanning this thread what the allowable inputs are. if they are limited somehow, it may be possible to come up with a mapping. if they are unlimited, it is not. it's a simple as that.

Also, I'm not sure your "divide by a large number and concat the result and remainder really works. For example, if my input was "12345678901234567890", and I divide by 9876543210, I get 1249999988 remainder 7253086410, which is the same number of digits, saving me nothing.

There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors

sai srinivas jonnalagadda wrote:the input number contains maximum of 17 digits .

And what is the minimum? Do leading zeroes matter?

Basically, you need to know:

how many possible input values are there? then it's easy. If the possible inputs are more than 2^32, it is impossible to GUARANTEE a unique mapping. If there are less, it can be done.

For the few example input values you've given I see a small pattern - they all end with 002001000 or 002002000. Is my guess correct that they end with 0020XX000, with 20XX being a year? If so, you can already get rid of the 0020 and 000. The resulting values will have 10 characters, just as 2^31 - 1 (Integer.MAX_VALUE). The input values also all start with 318 or 319 - perhaps you can also strip off the leading 31.

I'm assuming the 8379 inside the numbers cannot be removed in a similar way. I fear my thinking is a bit too naive, and the actual values will be a lot different.

Oh, and:

sai srinivas jonnalagadda wrote:Hi,
I need it very urgent.

Please EaseUp. There is no such thing as "urgent" around here.

sai srinivas jonnalagadda
Greenhorn

Joined: Jul 27, 2011
Posts: 9

posted

0

the possible input values are definitely less that 2^31, that is for sure.
and it is not case that every number contains 0020XX000,.
minimum and maximum digits in the input number is 17. so the input is a 17 digit number.

if i get duplicate number(by making some comparisons like putting in the hash set something), i will run the algorithm until i get new number.

Can you please explain again why you cant just assign ascending numbers from 1 to 2^32?
I still dont understand why you cant use this most obvious solution.

At first i thought you want an algorithm that lets you calculate the unique ID from the input, so you dont
have to store the ID somewhere. All you would need is the input and you could always calculate the ID directly
from that, without having to access any more stored values.

But if you - as mentioned in you last post - run the algorithm and check if the calculated ID is already taken, it makes no sense, because for this check, you need to compare you calculated ID with stored
ones from other inputs, right?

sai srinivas jonnalagadda wrote:the possible input values are definitely less that 2^31, that is for sure.

Unless you can tell us about a pattern in the input values, how can you be sure?

minimum and maximum digits in the input number is 17. so the input is a 17 digit number.

Even if only numbers from 1-9 were allowed, without a pattern there would be 9^17 possible values. That's still over 3.8 million times more than 2^32. Allowing 0 only increases this number.

You said that there are definitely less than 2^31 values. That means that you know what the possible values are going to be. That means that there must be a pattern in there. We need to know that pattern if we can make the mapping fit.

sai srinivas jonnalagadda
Greenhorn

Joined: Jul 27, 2011
Posts: 9

posted

0

actually, I can use the logic of generating sequence numbers 0,1....
but the same logic needs to implemented at some other place.
if i run this logic there, i cannot retain the generated sequence, there is no guarantee that the input numbers will map to the same generated number unless the input number sequence is consistent.

my intention is for a given input number, the outcome should be same irrespective of wherever we run the logic and also irrespective of the sequence of input.

the reason for why i mentioned about comparison of numbers is because, once we get the logic, we can overcome the shortcomings.

Ok so you have 2 different places that both have the same input-numbers, but they cant share data.
And now you need an unique ID to identify the same input in both places (but the places as mentioned cant
share the ID, each place has to generate the same id for the same input by itself).

The most obvious but maybe time/memory/CPU consuming way to do this would be to sort he inputs and give
the smallest input the ID '0', the second smallest input the ID '1' and so on. That way both places have the same ID
for the same input and every ID is unique.

sai srinivas jonnalagadda wrote:if i get duplicate number(by making some comparisons like putting in the hash set something), i will run the algorithm until i get new number.

If you have an algorithm, wouldn't running the same number through again just give you the same result you got the first time?

If you are doing a random number generator, how are you going to get the same number in 'that other place'? Further, using a RNG works great at first, but the more values you use, the slower it gets. towards the end, it may take a LOOOONNNGGGG time to find a unique value. Now, granted, with 2^32 inputs, you may never get there, but it is still a design consideration.

. . . but if you really have 9^17 inputs, you will eventually reach 2^32. But your program will grind to a halt before that while the random generator desperately seeks unused numbers.

Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 43950

33

posted

0

You would have more success with longs, which use 64 bits. But remember it can take a long time to check whether a "random" number is unique, once you have lots of records in your database.

As people have told you, 1, 2, 3, 4, etc is a very effective and efficient way to generate unique numbers.

I’ve looked at a lot of different solutions, and in my humble opinion Aspose is the way to go. Here’s the link: http://aspose.com

subject: how to generate a unique number from a series of numbers