How to Create a Unique 16 Character String

 
Matt McDonald
Greenhorn
Posts: 8
Hi All,

I'd like to write some code to generate a unique 16 character string (not UUID). The string must be very hard for someone to guess...

Any suggestions?

Thanks
 
Jesper de Jong
Java Cowboy
Saloon Keeper
Posts: 15281
39
Android IntelliJ IDE Java Scala Spring
You could pick 16 random characters and put them in a string. But it would be hard to make sure that the string is unique. What exactly do you mean by "unique" - must it be globally unique (like a UUID), which means that there is an astronomically small chance that the same string will ever be generated twice, or should it be different from all strings in some dataset that you have (in a database, for example)?
 
Matt McDonald
Greenhorn
Posts: 8
Thanks for replying.

It must be Globally Unique within the organisation like a UUID but only 16 characters long.
 
Jeff Verdegan
Bartender
Posts: 6109
6
Android IntelliJ IDE Java
Matt McDonald wrote:Thanks for replying.

It must be Globally Unique within the organisation like a UUID but only 16 characters long.


If it must be unique (that is, even the astronomically small chance of duplicates is unacceptable), and it can't be guessable, then your only real options are:

1) Randomly generate 16 characters, and then compare against a list of already generated IDs, and if it's a duplicate, try again.

2) Use a decent pseudorandom algorithm, like the one java.util.Random uses, to pick the next ID in the sequence based on the most recently generated one, and seed it with something random. In this case, if somebody knows the algorithm and knows what's been generated so far, he can predict what comes next. I don't really consider this a viable alternative, but I figured I'd throw it out there for grins.

But I have to ask:

1) Does it really have to be unique? Do you understand what it means for there to be a 1 in 2^128 chance of collision between any two, and how many you can produce before you have even a 1 in a million chance?

2) Why can't it be UUID?
 
Matt McDonald
Greenhorn
Posts: 8
Thanks, you've given me something to think about, I didn't think about checking whether the ID had been created previously. The process of issuing an ID needs to be very quick so checking the existence of a previous id may have performance issues for me.

In answer to your questions:

1) It needs to be unique because it will represent a particular event in a workflow system. We can't have two events with the same ID.
2) I've been told to implement 16 characters, that decision is out of my hands.

Thanks
 
Jeff Verdegan
Bartender
Posts: 6109
6
Android IntelliJ IDE Java
Matt McDonald wrote:Thanks, you've given me something to think about, I didn't think about checking whether the ID had been created previously. The process of issuing an ID needs to be very quick so checking the existence of a previous id may have performance issues for me.


If you store previously issued IDs in a HashSet, it will be very quick. Even if you have to go to a DB every time, I can't imagine that would be a problem. You're not generating thousands of these things per second, sustained, are you?
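
A minimal sketch of the check-and-retry idea, assuming a single JVM issues all the IDs. `IssuedIds` and its methods are hypothetical names, and the candidate generator is passed in as a `Supplier<String>` so any source of random strings can be plugged in. `Set.add` returns `false` for a duplicate, so one call both tests and records the ID.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical helper: remembers every ID it has handed out and retries on duplicates.
final class IssuedIds {
    private final Set<String> issued = new HashSet<>();
    private final Supplier<String> candidates;

    IssuedIds(Supplier<String> candidates) {
        this.candidates = candidates;
    }

    // synchronized so two concurrent callers cannot both claim the same ID.
    synchronized String issue() {
        while (true) {
            String id = candidates.get();
            // add() returns false if the set already contained this ID.
            if (issued.add(id)) {
                return id;
            }
        }
    }

    synchronized int count() {
        return issued.size();
    }
}
```

If the IDs have to survive a restart or be shared between servers, a unique constraint on a database column could play the same role as the in-memory set.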


1) It needs to be unique because it will represent a particular event in a workflow system. We can't have two events with the same ID.


Using UUID, the odds of a collision are ridiculously small. I forget the exact values, but it's something like generating a thousand a second for a thousand years gives less than a 1 in a million chance of collision. See http://en.wikipedia.org/wiki/Birthday_problem for details.
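
For a rough feel for the numbers, the birthday approximation p ≈ n² / (2N) can be evaluated directly. This is only a sketch: the approximation holds while p is small, and it assumes a random (version 4) UUID, which has 122 random bits rather than the full 128. The class and method names are made up for illustration.

```java
// Birthday-problem approximation: p ≈ n^2 / (2 * N) for n IDs drawn from N possibilities.
final class CollisionOdds {
    static double approxCollisionProbability(double n, double space) {
        return (n * n) / (2.0 * space);
    }

    public static void main(String[] args) {
        // 1000 IDs per second, sustained for 1000 years.
        double n = 1000.0 * 60 * 60 * 24 * 365 * 1000;
        // Assumption: version 4 UUID with 122 random bits.
        double uuidSpace = Math.pow(2, 122);
        System.out.printf("n = %.3e, p ≈ %.3e%n", n, approxCollisionProbability(n, uuidSpace));
    }
}
```

Under those assumptions the result comes out far below one in a million, which is consistent with the claim above.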

Basically, the odds of a collision are much smaller than the odds of a serious failure due to a bug in your code or a hard disk crash or some other mundane catastrophe.

2) I've been told to implement 16 characters, that decision is out of my hands.


Which characters are you allowed to use? That is, how many bits of entropy will you have?




 
Matt McDonald
Greenhorn
Posts: 8
Any and all characters can be used :-)
 
Jeff Verdegan
Bartender
Posts: 6109
6
Android IntelliJ IDE Java
Matt McDonald wrote:Any and all characters can be used :-)


So, the entire Unicode set then. Trimming down to the older, smaller Unicode, we have 16 characters at 16 bits per character--256 bits. UUID has 32 hex characters at 4 bits/char, so 128 bits. Therefore, you can easily encode a UUID into your character set.
 
Matt McDonald
Greenhorn
Posts: 8
Ok thanks, I'll look into it.
 
Jeff Verdegan
Bartender
Posts: 6109
6
Android IntelliJ IDE Java
To be clear: You're saying this would be a valid ID:
語% औئ¾Ю⟰◪∴ᦌᛯᏌᄫആʘ
 
Matt McDonald
Greenhorn
Posts: 8
Ok, alpha/numeric characters only....
 
Jeff Verdegan
Bartender
Posts: 6109
6
Android IntelliJ IDE Java
Matt McDonald wrote:Ok, alpha/numeric characters only....

Okay, so... A-Z a-z 0-9, yes?

I'm trying to get precise requirements from you, since one can't really choose an approach without understanding the requirements.

If this is your requirement, then that's 62^16, which we can round up to 64^16, which is (2^6)^16, which is 2^96, so an upper bound of 96 bits, which means we cannot fit a UUID (128 bits) without loss. So, barring further constraints, I'd just go with generating 16 random characters. It's probably not cryptographically secure, but at this point I'll assume it's good enough for your needs. Only you can answer that for sure though.
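
If predictability does turn out to matter, `java.security.SecureRandom` can be used as the source of randomness instead of `java.util.Random`. A minimal sketch, assuming the alphabet really is A-Z, a-z, 0-9; the class name `RandomId` is hypothetical. Each character carries log2(62), roughly 5.95 bits, so 16 of them give about 95 bits.

```java
import java.security.SecureRandom;

final class RandomId {
    // Assumed alphabet: 26 upper-case + 26 lower-case letters + 10 digits = 62 symbols.
    private static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    private static final SecureRandom RANDOM = new SecureRandom();
    static final int LENGTH = 16;

    // Picks each character independently and uniformly from the alphabet.
    static String next() {
        StringBuilder sb = new StringBuilder(LENGTH);
        for (int i = 0; i < LENGTH; i++) {
            sb.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }
}
```

On its own this only makes duplicates very unlikely. If duplicates are truly unacceptable, the result still needs to be checked against the IDs already issued.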
 
Campbell Ritchie
Sheriff
Pie
Posts: 48954
60
2 ^ 128 = 340282366920938463463374607431768211456 which has 39 digits
62 ^ 16 = 47672401706823533450263330816 which has 29 digits
256 ^ 16 is the same as 2 ^ 128.
So you would have to look for 256 different characters in your 16-character String. Of course you can use accented letters (as in French, Spanish etc), Greek and Russian letters if you wish. Beware: things like the Greek capital Α (α) and Russian capital Ah look exactly like A.
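
These figures can be reproduced with `java.math.BigInteger`. The class name `IdSpaceSizes` and its `size` method are just illustrative.

```java
import java.math.BigInteger;

final class IdSpaceSizes {
    // Number of distinct strings of the given length over an alphabet of the given size.
    static BigInteger size(int alphabetSize, int length) {
        return BigInteger.valueOf(alphabetSize).pow(length);
    }

    public static void main(String[] args) {
        BigInteger uuid = size(2, 128);
        BigInteger alnum = size(62, 16);
        System.out.println("2^128  = " + uuid + " (" + uuid.toString().length() + " digits)");
        System.out.println("62^16  = " + alnum + " (" + alnum.toString().length() + " digits)");
        System.out.println("256^16 == 2^128: " + size(256, 16).equals(uuid));
        // bitLength() gives the number of bits needed to hold the value.
        System.out.println("bits needed for 62^16: " + alnum.bitLength());
    }
}
```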
 
Jeff Verdegan
Bartender
Posts: 6109
6
Android IntelliJ IDE Java
Campbell Ritchie wrote:2 ^ 128 = 340282366920938463463374607431768211456 which has 39 digits
62 ^ 16 = 47672401706823533450263330816 which has 29 digits
256 ^ 16 is the same as 2 ^ 128.
So you would have to look for 256 different characters in your 16-character String.


@Matt: To be clear, this is what you would have to do if you wanted your 16-char String to have as much entropy as a 128-bit UUID. If your requirement is simply 16 alphanumerics with whatever entropy that gives you, then these numbers are only of academic interest.

@Campbell: I don't think that's actually his requirement. He started off asking for 16 random characters, and said he couldn't use UUID. I'm the one that delved further down the UUID/128-bit path.
 
Campbell Ritchie
Sheriff
Pie
Posts: 48954
60
In which case you have already given the correct answer, yesterday, ie create 16 chars and test your String against a set.
 
Winston Gutkowski
Bartender
Pie
Posts: 10417
63
Eclipse IDE Hibernate Ubuntu
Jeff Verdegan wrote:@Matt: To be clear, this is what you would have to do if you wanted your 16-char String to have as much entropy as a 128-bit UUID.

Actually, a UUID is directly convertible to an array of 16 bytes, which could then be used to construct a String.
Providing Matt takes some care to ensure that a consistent character set (UTF-16?) is used to encode, it should be reasonably easy to make sure that (a) the result is exactly 16 characters long, and (b) it can be decoded precisely if required.
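
A sketch of that conversion, with one assumption worth flagging: it uses ISO-8859-1 rather than UTF-16. ISO-8859-1 maps every byte value to exactly one character, so 16 bytes always become 16 characters and decode back exactly, whereas UTF-16 would pair the bytes up into 8 code units. The class and method names are hypothetical. The resulting string contains control and other non-printable characters, so it only fits a requirement that allows any character.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

final class UuidAsString {
    // Packs the two 64-bit halves of the UUID into 16 bytes, then maps each byte to one char.
    static String encode(UUID uuid) {
        byte[] bytes = ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Reverses encode(): chars back to bytes, bytes back to the two 64-bit halves.
    static UUID decode(String s) {
        ByteBuffer buf = ByteBuffer.wrap(s.getBytes(StandardCharsets.ISO_8859_1));
        return new UUID(buf.getLong(), buf.getLong());
    }
}
```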

Winston
 
Jeff Verdegan
Bartender
Posts: 6109
6
Android IntelliJ IDE Java
Winston Gutkowski wrote:
Jeff Verdegan wrote:@Matt: To be clear, this is what you would have to do if you wanted your 16-char String to have as much entropy as a 128-bit UUID.

Actually, a UUID is directly convertible to an array of 16 bytes, which could then be used to construct a String.


Only if he has 256 different characters he can use. He stated he's restricted to alphanumerics though, so presumably [A-Za-z0-9], which is somewhat less than 256.
 
Jeff Verdegan
Bartender
Posts: 6109
6
Android IntelliJ IDE Java
Campbell Ritchie wrote:In which case you have already given the correct answer, yesterday, ie create 16 chars and test your String against a set.


Actually, Jesper suggested it in the first reply. I merely reiterated it in the event that UUID was well and truly out.
 
Winston Gutkowski
Bartender
Pie
Posts: 10417
63
Eclipse IDE Hibernate Ubuntu
Jeff Verdegan wrote:Only if he has 256 different characters he can use. He stated he's restricted to alphanumerics though, so presumably [A-Za-z0-9], which is somewhat less than 256.

Ah. Missed that. Must learn to read.

If you don't want to write a lot of code, it seems to me that two lots of (pseudocode, with ^ meaning "to the power of")
Long.toString(36^7 + (Random.nextLong() % ((36^8)-(36^7))), 36);
would do the trick, unless he must use both upper and lower case.
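
A runnable take on that pseudocode, as a sketch only. In real Java `^` is XOR rather than exponentiation, and `nextLong() % n` can be negative, so the powers are computed explicitly and `Math.floorMod` is used. `SecureRandom` is swapped in for `Random` here because the ID is meant to be hard to guess; the class name `Base36Id` is hypothetical.

```java
import java.security.SecureRandom;

final class Base36Id {
    private static final long LOW = pow36(7);   // smallest value with 8 base-36 digits
    private static final long HIGH = pow36(8);  // first value with 9 base-36 digits
    private static final SecureRandom RANDOM = new SecureRandom();

    private static long pow36(int exp) {
        long result = 1;
        for (int i = 0; i < exp; i++) {
            result *= 36;
        }
        return result;
    }

    // One 8-character block: a value in [36^7, 36^8) rendered in base 36 (digits 0-9, a-z).
    static String block() {
        long offset = Math.floorMod(RANDOM.nextLong(), HIGH - LOW);
        return Long.toString(LOW + offset, 36);
    }

    // Two blocks give the 16 characters.
    static String next() {
        return block() + block();
    }
}
```

Starting each block at 36^7 keeps it exactly 8 characters long, at the cost that the first character of a block is never `0`. The modulo step also introduces a very slight bias, which is probably negligible for this purpose.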

Winston
 
Campbell Ritchie
Sheriff
Pie
Posts: 48954
60
Jeff Verdegan wrote: . . . . Actually, Jesper suggested it in the first reply. . . .
Apologies to both of you; like WG I ought to learn to read!
 
Matt McDonald
Greenhorn
Posts: 8
Thanks everyone for your comments, you've given me lots of ideas to investigate, better go and build something now....
 