We have a situation where we are running the same code in separate JVMs. Part of what the code does is generate a file with a name based on the date and time to the second and, just to make sure, a random number.
Unfortunately, we've just had a situation where two processes generated exactly the same datetime and, I assume because the dates were the same, the same random number. Both processes were running the same code but were in separate JVMs.
My question is, do I have to use some third shared party (such as referring to our database) to generate a sequence number, or is there some other reference that would make each instance unique?
I was hoping that hashcode would be the answer. Could someone tell me if two objects from different JVMs could have exactly the same hashcode?
Originally posted by Barry Harvey: Could someone tell me if two objects from different JVMs could have exactly the same hashcode?
Actually the hashcode isn't even guaranteed to be unique inside a single JVM.
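To illustrate the point, here is a minimal sketch showing two unequal strings with the same hash code in a single JVM. The class name HashCollisionDemo is just made up for the example; String.hashCode() is specified by the Java API, so this particular collision is reproducible.

```java
final class HashCollisionDemo {
    // True when two objects report the same hash code, equal or not.
    static boolean sameHash(Object a, Object b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        String a = "Aa";
        String b = "BB";
        // Different strings...
        System.out.println("equal:     " + a.equals(b));
        // ...but 65*31 + 97 and 66*31 + 66 both come to 2112.
        System.out.println("same hash: " + sameHash(a, b));
    }
}
```

So a hash code on its own can't serve as a unique identifier, within one JVM or across several.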
We use a multi-part key where we get the first part from a database sequence number and increment a simple counter for the second part. The part sizes are configurable; right now the counter is 3 digits so I go to the database for a new sequence number every 1000 keys. A cluster of six servers shares the database, and the database manages concurrency in getting the first parts.
If you don't want to use a database, you could designate one server in your cluster as the high-order-part vendor. It could use some persistence scheme or any other technique to make sure it vends unique parts.
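A rough sketch of the multi-part key idea, in case it helps. This is not our production code: the class name BlockKeyGenerator is invented, and the LongSupplier stands in for whatever vends the high-order part (a database sequence, a designated server, and so on). The AtomicLong in main is only a stand-in so the example runs by itself.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

final class BlockKeyGenerator {
    private final LongSupplier highPartSource; // hypothetical source of unique high parts
    private final int blockSize;               // keys handed out per high part, e.g. 1000
    private long high;
    private int low;

    BlockKeyGenerator(LongSupplier highPartSource, int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.highPartSource = highPartSource;
        this.blockSize = blockSize;
        this.low = blockSize; // forces a fetch of the high part on the first call
    }

    // Returns the next key; goes back to the shared source only once per block.
    synchronized long nextKey() {
        if (low == blockSize) {
            high = highPartSource.getAsLong();
            low = 0;
        }
        return high * blockSize + low++;
    }

    public static void main(String[] args) {
        AtomicLong fakeSequence = new AtomicLong(1); // stand-in for a database sequence
        BlockKeyGenerator gen = new BlockKeyGenerator(fakeSequence::getAndIncrement, 1000);
        System.out.println(gen.nextKey());
        System.out.println(gen.nextKey());
    }
}
```

As long as the source never hands the same high part to two processes, the keys can't collide, and each process only contacts the shared source once per block.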
We're going to try the unix process id idea.
The class involved is purely used to run unix commands by writing them out to scripts and running them. Dropping out to unix to call ps fits in nicely with the workings of the class.
Thanks everyone for the advice.
Just curious cause I don't know Unix ... if you run PS and get back a list of processes, how do you know which one to use? Could there be a bunch called "java" and maybe some called "ps" ?
the command-line options to "ps" can be tuned almost indefinitely; for example excluding processes owned by other users and processes executing in the background or on other TTYs might help. with a bit of ingenuity and a lot of poring through the man page for the ps command, one can usually narrow the field sufficiently.
but really, the best way to do this would be for the Java API to have a method somewhere to return the PID of the current JVM. i can't imagine why one doesn't seem to exist - even if the JVM has been ported to run on platforms that have no equivalent concept to the "process ID", the sensible thing for them to do would be to just throw a checked exception indicating the method is not applicable.
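For anyone reading this on a newer JDK: Java 9 and later do have such a method, ProcessHandle.current().pid(). A minimal sketch follows, assuming Java 9 or later is available (it won't compile on older releases); the class name CurrentPid is made up for the example.

```java
final class CurrentPid {
    // Returns the native process ID of the running JVM (Java 9+).
    // The API documents that pid() may throw UnsupportedOperationException
    // on implementations that cannot supply one.
    static long currentPid() {
        return ProcessHandle.current().pid();
    }

    public static void main(String[] args) {
        System.out.println("JVM process ID: " + currentPid());
    }
}
```

This avoids shelling out to ps and guessing which "java" entry is the right one.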
Back to mathematics and random: random means that there's no guarantee about which will turn up. Getting the same number 10 times in a row can therefore happen in a random sequence of numbers, but the probability of it happening is low.
Far better to use a timestamp down to millisecond level, maybe in combination with a random number.
If I understand the requirement correctly, you want to have files named differently across a number of different machines. Presumably the files are going to some shared network storage or something. I also presume that these files are aggregated to the shared storage by some external process, such that the name can't be checked for uniqueness at creation time. (Note that you could write your aggregation process such that it renames a file if it finds a duplicate name during the process.)
If this is the case, I'd say that ProcessId is a poor choice. It seems like this is one of the more likely candidates for a collision between servers.
I think Stan's suggestion is probably the best. The servers are requesting a unique id (or block of) from a master process which manages them.
If you really want independently generated (very likely to be) unique numbers, something like a UUID generator would probably be a good solution.
There are various implementations around: jakarta commons (sandboxed), jini, one called 'JUG'. It would be worth reading the docs for each and seeing which is suitable for you. Some use the MAC address from the network card and this requires some native code.
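As a sketch of what that might look like: since Java 5 the standard library also includes java.util.UUID, whose randomUUID() produces a random (version 4) UUID with no native code. The class name UuidFileName and the prefix and extension values below are just placeholders for the example.

```java
import java.util.UUID;

final class UuidFileName {
    // Builds a name such as "report-<36-character uuid>.txt".
    static String newFileName(String prefix, String extension) {
        return prefix + "-" + UUID.randomUUID() + extension;
    }

    public static void main(String[] args) {
        System.out.println(newFileName("report", ".txt"));
        System.out.println(newFileName("report", ".txt"));
    }
}
```

A random UUID is still only "very likely" unique, as noted above, but it doesn't depend on the clock or on a time-based seed.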
Hope this helps.
a random number appended to a string identifying which process, on which machine, created it, might also be unique. along the lines of some USENET message-IDs, maybe; "machine.name.com : process-id-of-some-kind : date-time-stamp : random-number". it'd be long-ish, but filenames don't have to be short, do they? [ April 07, 2005: Message edited by: M Beck ]
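A rough sketch of that kind of name, with several assumptions: it needs Java 9 or later for ProcessHandle, it uses "_" as the separator because ":" is awkward in filenames on some systems, and the class name CompositeFileName, the timestamp pattern, and the "unknown-host" fallback are all invented for the example.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.security.SecureRandom;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class CompositeFileName {
    // SecureRandom is not seeded from the clock alone, unlike a time-seeded Random.
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss-SSS");

    static String hostName() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "unknown-host"; // placeholder fallback, an assumption of this sketch
        }
    }

    // machine _ process id _ date-time stamp _ random number
    static String next() {
        return String.join("_",
                hostName(),
                Long.toString(ProcessHandle.current().pid()),
                LocalDateTime.now().format(STAMP),
                Integer.toString(RANDOM.nextInt(1_000_000)));
    }

    public static void main(String[] args) {
        System.out.println(next());
    }
}
```

Two processes on the same machine differ in the process ID part, and two machines differ in the host part, so only names generated inside one JVM in the same millisecond fall back on the random number.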