I have a site with a cluster of web servers. Previous developers have these web servers generating a UID that is used for the primary keys of nearly every table in the database. The code for it is atrocious; collisions are common and will only get worse as time goes on. I need to replace it.
Here's the catch: I only have sixteen bytes to work with, encoded as ISO-8859-1. Note that this catch stops UUID, or even java.rmi.server.UID, dead in its tracks. There's no space for either. Changing the size is not an option.
Here's the good news: it doesn't need to be a UUID or global at all, and it doesn't need to be particularly efficient. It can be presumed to be called at most a few hundred times a second per server. All servers have, and always will have, an IP that is unique amongst those servers. Finally, the lifetime is short (less than a decade), and server time will never move backwards.
Here's what I came up with at first blush, keeping the above in mind.
Any holes (you see in the example) or suggestions are welcomed.
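For anyone following along, here is a minimal sketch of one layout that fits the constraints listed above. It is not the poster's original code. It packs 96 bits (42 bits of epoch milliseconds, 32 bits of IPv4 address, 22 bits of per-millisecond sequence) into 16 printable ASCII characters at 6 bits each. The class name `CompactId`, the bit split, and the 64-character alphabet are all assumptions made for illustration, and the sketch relies on the stated guarantees that server IPs are unique and the clock never moves backwards.

```java
final class CompactId {
    // Hypothetical 64-symbol alphabet, all printable ASCII, in ascending
    // code-point order so that string order follows numeric order.
    private static final char[] ALPHABET =
        "-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz".toCharArray();
    private static final int MAX_SEQUENCE = (1 << 22) - 1;

    private final int serverIp;   // IPv4 address as 32 bits, assumed unique per server
    private long lastMillis = -1;
    private int sequence = 0;

    CompactId(int serverIp) {
        this.serverIp = serverIp;
    }

    /** Returns the next ID. Assumes the system clock never moves backwards. */
    synchronized String next() {
        long now = System.currentTimeMillis();
        if (now == lastMillis) {
            if (++sequence > MAX_SEQUENCE) {
                // Sequence exhausted for this millisecond: wait for the clock to tick.
                while ((now = System.currentTimeMillis()) == lastMillis) {
                    Thread.yield();
                }
                sequence = 0;
            }
        } else {
            sequence = 0;
        }
        lastMillis = now;
        return encode(now, serverIp, sequence);
    }

    /** Packs 42 + 32 + 22 = 96 bits into 16 characters of 6 bits each. */
    static String encode(long millis, int ip, int seq) {
        long ip32 = ip & 0xFFFFFFFFL;
        // High 48 bits: 42-bit timestamp followed by the top 6 bits of the IP.
        long hi = ((millis & ((1L << 42) - 1)) << 6) | (ip32 >>> 26);
        // Low 48 bits: remaining 26 bits of the IP followed by the 22-bit sequence.
        long lo = ((ip32 & ((1L << 26) - 1)) << 22) | (seq & MAX_SEQUENCE);
        char[] out = new char[16];
        for (int i = 7; i >= 0; i--) {
            out[i] = ALPHABET[(int) (hi & 63)];
            out[8 + i] = ALPHABET[(int) (lo & 63)];
            hi >>>= 6;
            lo >>>= 6;
        }
        return new String(out);
    }
}
```

A caller would create one `CompactId` per JVM with that server's address and call `next()` for each key. A 42-bit millisecond timestamp counted from 1970 does not wrap until roughly the year 2109, comfortably beyond the decade-long lifetime mentioned above.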
How could there be a lot of collisions with 16 bytes of 0-255? That's a really big quantity of possibilities, like 3.4 x 10^38 or something ridiculous like that. And how does ISO-8859-1 preclude the built-in UUID?
Ken Blair
Ranch Hand
Joined: Jul 15, 2003
Posts: 1078
David Newton wrote:How could there be a lot of collisions with 16 bytes of 0-255? That's a really big quantity of possibilities, like 3.4 x 10^38 or something ridiculous like that.
You're assuming it's random. It's not even close; the code is really bad.
David Newton wrote:And how does ISO-8859-1 preclude the built-in UUID?
It was the size, not the encoding, that I was referring to. Looks like I misread the UUID documentation. I can use UUID, but only if I encode the raw bytes myself.
Those *are* the encoded raw bytes. Why can't you just write the UUID, which is 0-255 (which is what ISO-8859-1 defines)?
Ken Blair
David Newton wrote:Those *are* the encoded raw bytes. Why can't you just write the UUID, which is 0-255 (which is what ISO-8859-1 defines)?
What happens when you write "c46f6ae1-7b2e-4baa-8df8-05fbdd91c075" into a 16-character field? You can't. Every single one of those characters is going to get encoded into ISO-8859-1, one byte each, and at 36 characters it's more than double the length of the field.
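The size mismatch is easy to check. The sketch below uses only `java.util.UUID` and `java.nio.ByteBuffer`; the helper class `UuidSizes` is a made-up name for illustration. It shows that the text form is 36 characters while the underlying value is 16 bytes.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

final class UuidSizes {
    /** Returns the 16 raw bytes (128 bits) of a UUID, most significant byte first. */
    static byte[] rawBytes(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    /** Length of the canonical hex-and-hyphen text form (32 hex digits + 4 hyphens). */
    static int textLength(UUID id) {
        return id.toString().length();
    }
}
```

For the UUID quoted above, `textLength` gives 36 and `rawBytes` gives 16 bytes starting with 0xC4, so only the raw form fits the field by size.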
Furthermore, what happens when I grab the raw 16 bytes and encode each byte into ISO-8859-1 as one character per byte (as opposed to hexadecimal, which requires two characters per byte), and a byte happens to be 7F? I'm betting the DB is going to puke at me, and even if it doesn't, I'm not sure what will happen when it gets to some place like SF and a salesperson is trying to copy & paste an ID with a control character in it.
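To make that concern concrete, here is a small sketch, assuming Java's built-in ISO-8859-1 charset, which maps every byte value 0x00-0xFF to the code point of the same number. The class name `Latin1Check` is hypothetical. It turns raw bytes into a one-character-per-byte string and reports whether any character is a control code (0x00-0x1F or 0x7F-0x9F, which is what `Character.isISOControl` tests).

```java
import java.nio.charset.StandardCharsets;

final class Latin1Check {
    /** Decodes raw bytes as ISO-8859-1: one byte becomes one char, losslessly in Java. */
    static String toLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    /** True if any character falls in the C0/C1 control ranges or is DEL (0x7F). */
    static boolean hasControlChar(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.isISOControl(s.charAt(i))) {
                return true;
            }
        }
        return false;
    }
}
```

So the round trip works inside Java, but 65 of the 256 byte values come out as control characters. Whether the database or a copy & paste survives those is a separate question this sketch does not answer.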
You said the field was encoded in 8859-1. The bytes are all valid 8859-1 characters. *You* were the one that was talking about encoding the raw bytes and using toString, not me. You never said anything about needing a printable representation--not sure how I would be expected to guess that. In any case, good luck.
Ken Blair
You said the field was encoded in 8859-1
Yes. The field is a char(16) and the encoding is 8859-1.
The bytes are all valid 8859-1 characters.
8859-1 doesn't cover all 256 possible values, does it? Aren't quite a few ranges "unused"?
*You* were the one that was talking about encoding the raw bytes and using toString, not me.
Yes, I'm thinking on paper. toString() is unsuitable because its hex output is too long. I don't see anything in the UUID class that will give me a String, or otherwise encode the bytes into something 16 characters in length. Hence I would have to do it myself. I don't see how that's possible with 8859-1 because of the above concerns.
You never said anything about needing a printable representation--not sure how I would be expected to guess that.
It didn't occur to me until I was typing that. So realistically, there's no way to get a UUID into a 16-character string, or am I missing something?
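A quick counting argument bears on that question. Assuming "printable" means the 95 characters 0x20-0x7E plus the 96 characters 0xA0-0xFF (191 in total, which is generous since it includes space, non-breaking space and soft hyphen), 16 characters can distinguish 191^16 values, about 2^121.2. A UUID has 128 bits, and even the 122 random bits of a version 4 UUID exceed that. The sketch below checks the comparison with `BigInteger`; the class name `Latin1Capacity` and the choice of 191 are assumptions for illustration.

```java
import java.math.BigInteger;

final class Latin1Capacity {
    // Assumed count of printable ISO-8859-1 characters: 0x20-0x7E and 0xA0-0xFF.
    static final int PRINTABLE = 95 + 96;

    /** Number of distinct strings of the given length over an alphabet of the given size. */
    static BigInteger capacity(int alphabetSize, int length) {
        return BigInteger.valueOf(alphabetSize).pow(length);
    }

    /** True if every value of the given bit width can get its own string. */
    static boolean fits(int bits, int alphabetSize, int length) {
        return capacity(alphabetSize, length).compareTo(BigInteger.ONE.shiftLeft(bits)) >= 0;
    }
}
```

Under these assumptions `fits(128, Latin1Capacity.PRINTABLE, 16)` is false, while `fits(128, 256, 16)` is true: a full UUID fits only if all 256 byte values are allowed, control codes included. A smaller ID, for example 96 bits over a 64-character alphabet, does fit in 16 printable characters.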