
Duplicated random numbers

Barry Harvey
Greenhorn

Joined: Aug 26, 2004
Posts: 4
We have a situation where we are running the same code in separate JVMs. Part of what the code does is generate a file with a name based on the datetime to the second and, just to make sure, a random number.

Unfortunately, we've just had a situation where two processes generated exactly the same datetime and, I assume because the dates were the same, the same random number. Both processes were running the same code but were in separate JVMs.
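
A small sketch of the suspected cause: two java.util.Random instances created with the same seed produce the same sequence. The no-argument constructor in JDK 1.4 was documented as seeding from System.currentTimeMillis(), so two JVMs starting their generators in the same millisecond could plausibly end up in this state. The class name and the seed value here are made up for illustration.

```java
import java.util.Random;

class SameSeedDemo {
    // Returns the first int drawn from a generator seeded with the given value.
    static int firstDraw(long seed) {
        return new Random(seed).nextInt();
    }

    public static void main(String[] args) {
        long seed = 1112868000000L; // hypothetical clock reading shared by two processes
        // Same seed, same algorithm, so both "processes" draw the same number.
        System.out.println(firstDraw(seed) == firstDraw(seed)); // prints "true"
    }
}
```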

My question is, do I have to use some third shared party (such as referring to our database) to generate a sequence number, or is there some other reference that would make each instance unique?

I was hoping that hashcode would be the answer. Could someone tell me if two objects from different JVMs could have exactly the same hashcode?
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24187
    

Yes they can and do, absolutely.

Can you just use java.io.File.createTempFile() ?
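
For reference, a minimal sketch of that call. File.createTempFile picks a name that did not exist before the call and creates the empty file, so the uniqueness check is left to the filesystem rather than to a random number. The prefix "job-", the suffix ".sh", and the class name are placeholders, and the files land in the default temp directory unless the three-argument overload is given a directory.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

class TempFileDemo {
    // Creates two empty files with generated names and returns them.
    static File[] createTwo() {
        try {
            File a = File.createTempFile("job-", ".sh");
            File b = File.createTempFile("job-", ".sh");
            a.deleteOnExit(); // clean up when the JVM exits
            b.deleteOnExit();
            return new File[] { a, b };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File[] files = createTwo();
        System.out.println(files[0].getName());
        System.out.println(files[1].getName());
    }
}
```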


[Jess in Action][AskingGoodQuestions]
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Barry Harvey:
Could someone tell me if two objects from different JVMs could have exactly the same hashcode?


Actually the hashcode isn't even guaranteed to be unique inside a single JVM.
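
A quick illustration, using String rather than plain Object because its hash function is specified and the collision is easy to reproduce: two different strings with the same hashCode() in one JVM. The general hashCode contract only requires equal objects to have equal hash codes; it does not require unequal objects to differ.

```java
class HashCollisionDemo {
    // True when two distinct strings share a hash code.
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // 'A'*31 + 'a' = 65*31 + 97 = 2112 and 'B'*31 + 'B' = 66*31 + 66 = 2112
        System.out.println("Aa".hashCode());     // prints "2112"
        System.out.println("BB".hashCode());     // prints "2112"
        System.out.println(collide("Aa", "BB")); // prints "true"
    }
}
```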


The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus
Dave Wingate
Ranch Hand

Joined: Mar 26, 2002
Posts: 262
Perhaps you could use Process ID or Thread ID to ensure that the file names are unique. I've never had to do this in Java, but it works well in Bash scripting.


Fun programming etcetera!
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
We use a multi-part key where we get the first part from a database sequence number and increment a simple counter for the second part. The part sizes are configurable; right now the counter is 3 digits so I go to the database for a new sequence number every 1000 keys. A cluster of six servers shares the database, and the database manages concurrency in getting the first parts.

If you don't want to use a database you could designate one server in your cluster as the high-order-part vendor. It could use some persistence scheme or any other technique to make sure it vends unique parts.
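
A rough sketch of the multi-part key idea described above, not the actual code from that system. The high part comes from a shared source and the low part is a local counter; a new high part is fetched only when the counter is used up. The class name is invented, and the AtomicLong in main is a stand-in for the database sequence (in a real cluster that source has to be shared by all servers).

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

class HiLoKeyGenerator {
    private final LongSupplier highSource; // stand-in for the shared database sequence
    private final int blockSize;           // 1000 corresponds to a 3-digit counter
    private long high;
    private int low;

    HiLoKeyGenerator(LongSupplier highSource, int blockSize) {
        this.highSource = highSource;
        this.blockSize = blockSize;
        this.low = blockSize; // forces a fetch of the high part on first use
    }

    // Returns high * blockSize + low, fetching a new high part when the block is exhausted.
    synchronized long nextKey() {
        if (low == blockSize) {
            high = highSource.getAsLong();
            low = 0;
        }
        return high * blockSize + low++;
    }

    public static void main(String[] args) {
        AtomicLong fakeSequence = new AtomicLong(); // hypothetical substitute for a real DB sequence
        HiLoKeyGenerator serverA = new HiLoKeyGenerator(fakeSequence::getAndIncrement, 1000);
        HiLoKeyGenerator serverB = new HiLoKeyGenerator(fakeSequence::getAndIncrement, 1000);
        System.out.println(serverA.nextKey()); // prints "0"
        System.out.println(serverB.nextKey()); // prints "1000"
        System.out.println(serverA.nextKey()); // prints "1"
    }
}
```

Keys handed out by different servers never overlap as long as every high part is vended exactly once.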


A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Barry Harvey
Greenhorn

Joined: Aug 26, 2004
Posts: 4
We're going to try the unix process id idea.

The class involved is purely used to run unix commands by writing them out to scripts and running them. Dropping out to unix to call ps fits in nicely with the workings of the class.

Thanks everyone for the advice.
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
Just curious cause I don't know Unix ... if you run PS and get back a list of processes, how do you know which one to use? Could there be a bunch called "java" and maybe some called "ps" ?
M Beck
Ranch Hand

Joined: Jan 14, 2005
Posts: 323
the command-line options to "ps" can be tuned almost indefinitely; for example excluding processes owned by other users and processes executing in the background or on other TTYs might help. with a bit of ingenuity and a lot of poring through the man page for the ps command, one can usually narrow the field sufficiently.

but really, the best way to do this would be for the Java API to have a method somewhere to return the PID of the current JVM. i can't imagine why one doesn't seem to exist - even if the JVM has been ported to run on platforms that have no equivalent concept to the "process ID", the sensible thing for them to do would be to just throw a checked exception indicating the method is not applicable.
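
For readers on a newer JDK: such a method was eventually added. Java 9 introduced ProcessHandle, which was not available when this thread was written, so the sketch below assumes Java 9 or later. The class name is a placeholder.

```java
class PidDemo {
    // Returns the operating-system process ID of the running JVM (Java 9+).
    static long currentPid() {
        return ProcessHandle.current().pid();
    }

    public static void main(String[] args) {
        System.out.println("JVM pid: " + currentPid());
    }
}
```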
Jeroen Wenting
Ranch Hand

Joined: Oct 12, 2000
Posts: 5093
Back to mathematics and random: random means that there's no guarantee about which will turn up.
Getting the same number 10 times in a row can therefore happen in a random sequence of numbers, but the probability of it happening is low.

Far better to use a timestamp down to millisecond level, maybe in combination with a random number.


42
Horatio Westock
Ranch Hand

Joined: Feb 23, 2005
Posts: 221
If I understand the requirement correctly, you want files to be named differently across a number of different machines; presumably the files are going to some shared network storage or something. I also presume that these files are aggregated to the shared storage by some external process, such that the name can't be checked for uniqueness at creation time. (Note that you could write your aggregation process so that it renames a file if it finds a duplicate name.)

If this is the case, I'd say that ProcessId is a poor choice. It seems like this is one of the more likely candidates for a collision between servers.

I think Stan's suggestion is probably the best. The servers are requesting a unique id (or block of) from a master process which manages them.

If you really want independently generated (very likely to be) unique numbers, something like a UUID generator would probably be a good solution.

There are various implementations around: jakarta commons (sandboxed), jini, one called 'JUG'. It would be worth reading the docs for each and seeing which is suitable for you. Some use the MAC address from the network card and this requires some native code.
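
As one more option alongside those libraries, Java 5 and later ship java.util.UUID in the standard library. A minimal sketch, assuming a random (version 4) UUID is acceptable; the "job-" prefix, the ".sh" suffix, and the class name are made up for the example.

```java
import java.util.UUID;

class UuidNameDemo {
    // Builds a file name around a randomly generated (version 4) UUID.
    static String newFileName() {
        return "job-" + UUID.randomUUID() + ".sh";
    }

    public static void main(String[] args) {
        System.out.println(newFileName());
        System.out.println(newFileName());
    }
}
```

A random UUID needs no MAC address and no native code; uniqueness is probabilistic rather than guaranteed, which fits the "very likely to be unique" wording above.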

Hope this helps.
M Beck
Ranch Hand

Joined: Jan 14, 2005
Posts: 323
a random number appended to a string identifying which process, on which machine, created it, might also be unique. down the lines of some USENET message-id's, maybe; "machine.name.com : process-id-of-some-kind : date-time-stamp : random-number". it'd be long-ish, but filenames don't have to be short, do they?
[ April 07, 2005: Message edited by: M Beck ]
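
The message-id style name suggested above could be sketched roughly as follows. This is only an illustration: '-' is used as the separator instead of ':' to keep the name filesystem-friendly, the process ID call assumes Java 9 or later, and the class name and the "unknown-host" fallback are invented here.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.security.SecureRandom;

class CompositeName {
    // Joins host, process ID, timestamp, and random part into one name.
    static String compose(String host, long pid, long millis, int random) {
        return host + "-" + pid + "-" + millis + "-" + Integer.toHexString(random);
    }

    // Looks up the local host name, falling back to a placeholder if it cannot be resolved.
    static String localHostName() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "unknown-host"; // placeholder; the host part no longer distinguishes machines
        }
    }

    public static void main(String[] args) {
        String name = compose(localHostName(),
                ProcessHandle.current().pid(), // Java 9+
                System.currentTimeMillis(),
                new SecureRandom().nextInt());
        System.out.println(name);
    }
}
```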
 