Lots of duplicates doesn't necessarily mean it's not random, although it certainly sounds suspicious. How, exactly, are you generating the numbers?
How are you storing them in a file, meaning, what makes the output a CSV file? If there's a single number on each line then it's not really a CSV.
Thanks for your response. First I generate the random numbers and store them in a CSV file. Each line of the file has 254 random numbers separated by ',' (with no comma after the last number on each line). I need to open the file in Microsoft Excel, which only supports 256 columns, and that is why I am doing it this way. The code I have written for this is below:
I tried to work out the chances of your never having any duplicates, and I may have got it wrong, but it was too small a number to display on my calculator. It simply showed "0". That was assuming 2^32 possibilities for SecureRandom#next() which returns an int, not a long.
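For anyone who wants to check that estimate, here is a rough sketch of the calculation. It assumes ten million draws (the figure mentioned elsewhere in this thread) from 2^32 equally likely values; the class and method names are made up for illustration. It sums logarithms rather than multiplying probabilities, because the product itself is far too small for a double.

```java
class BirthdayEstimate {

    /**
     * Natural log of the probability that n uniform draws from `space`
     * equally likely values contain no duplicates:
     * ln P = sum over i = 1..n-1 of ln(1 - i / space).
     */
    static double logProbNoDuplicates(long n, double space) {
        double logP = 0.0;
        for (long i = 1; i < n; i++) {
            // log1p(x) computes ln(1 + x) accurately for small x
            logP += Math.log1p(-i / space);
        }
        return logP;
    }

    public static void main(String[] args) {
        double space = Math.pow(2, 32);   // assumption: 32-bit values
        long draws = 10_000_000L;         // assumption: ten million numbers
        double logP = logProbNoDuplicates(draws, space);
        System.out.println("ln P(no duplicates) = " + logP);
        System.out.println("P(no duplicates)    = " + Math.exp(logP));
    }
}
```

Under those assumptions the logarithm comes out at roughly -11,650, and Math.exp of that underflows to 0.0, which fits with the calculator simply showing "0".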
I won't verify whether there are duplicates. ;) But try storing your numbers in a set, and leave the loop once the size of the set tells you that you have as many numbers as you want.
Just my 2 cents ..
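A minimal sketch of the set idea, in case it helps. The original generation code is not shown here, so this assumes the numbers are ints drawn with SecureRandom.nextInt(); the class and method names are hypothetical.

```java
import java.security.SecureRandom;
import java.util.LinkedHashSet;
import java.util.Set;

class UniqueRandoms {

    /**
     * Draws random ints until `count` distinct values have been collected.
     * Set.add() silently ignores a value that is already present, so the
     * loop just keeps drawing until the set is large enough.
     */
    static Set<Integer> generate(int count, SecureRandom rng) {
        // LinkedHashSet keeps the values in the order they were drawn
        Set<Integer> seen = new LinkedHashSet<>();
        while (seen.size() < count) {
            seen.add(rng.nextInt());
        }
        return seen;
    }

    public static void main(String[] args) {
        Set<Integer> numbers = generate(1_000, new SecureRandom());
        System.out.println("distinct values: " + numbers.size());
    }
}
```

For ten million values the set of boxed Integers can take a few hundred megabytes of heap, so a larger -Xmx setting may be needed. That trade-off is the price of catching duplicates before anything is written to the file.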
Leander Kirstein-Heine wrote: I won't verify whether there are duplicates. ;) But try storing your numbers in a set, and leave the loop once the size of the set tells you that you have as many numbers as you want.
I replied in Shree's previous thread, because that thread seemed to have more info about what I believe the main difficulty here is - ensuring that the numbers are unique. Trying to delete duplicates after the fact is still going to require some way of detecting duplicates. And for ten million numbers, this may be nontrivial. The problem here is comparable to the one in the original post, so I figured as long as it has to be solved, it's better to eliminate duplicates before they are written to the file.
Having said that though, I note that the code above has a simple bug which ensures that the number at the end of each line is duplicated at the beginning of the next line. Removing that bug may be enough to generate files that look, to the casual eye, like they have no duplicates. If you need to ensure this, well, see the other thread for more discussion.
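Since the original code is not reproduced here, the following is not a patch for it, only a sketch of one way to split values into lines so that each value is written exactly once and no line ends with a trailing comma. The class name, method name, and the perLine parameter are hypothetical; 254 would be the per-line count described earlier in the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CsvLines {

    /** Joins values into comma-separated lines of at most perLine values each. */
    static List<String> toLines(List<Integer> values, int perLine) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        int inLine = 0;
        for (int value : values) {
            if (inLine > 0) {
                line.append(',');      // separator goes before every value except the first
            }
            line.append(value);
            inLine++;
            if (inLine == perLine) {   // line is full: emit it and start a fresh one
                lines.add(line.toString());
                line.setLength(0);
                inLine = 0;
            }
        }
        if (inLine > 0) {              // emit the final, partially filled line
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Small demonstration with 2 values per line; prints "1,2", "3,4", "5"
        for (String l : toLines(Arrays.asList(1, 2, 3, 4, 5), 2)) {
            System.out.println(l);
        }
    }
}
```

Each value is consumed once by the single loop, so nothing can carry over from the end of one line to the start of the next.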
subject: Deleting duplicate numbers from a .csv file.