This week's book giveaway is in the OCAJP 8 forum. We're giving away four copies of OCA Java SE 8 Programmer I Study Guide and have Edward Finegan & Robert Liguori on-line! See this thread for details.
Is HashSet generally known to be the fastest collection available in J2SE? I've setup a HashSet without about 1000 different numbers in it which I am just using to do a comparison to see if the number is in the collection. This works pretty well, but is it going to slow down significantly as the set gets larger? Is there a better way to do this? Also, I'm storing the numbers in the HashSet as Strings since this collection requires Objects. Would another object type be better for performance? Thanks, Drew
You might try numeric wrappers, e.g. Integer. Your experiments will tell you whether it's faster or not. If your number distribution is "tight," using an array of booleans would be much faster. If using hash tables, the best thing you can do is pay careful attention to the capacity and load. The hashCode itself is significant, but since you're using either String or a numeric wrapper, it's already taken care of and is probably fine (but might not be depending on your data distribution). You're best off making the capacity a prime number, since a slot is determined by hashCode % capacity. Otherwise, if you use a number with a lot of factors, you have the potential for a high number of collisions (when two objects hash to the same slot in the table). Collisions are the most significant cause of degraded hash table performance. See this page for more information. If you know how many elements you'll need to hold, pre-allocate the hash table so that you have about a 75% load (the default load factor, the point at which the table must grow). This will avoid the need to grow the table. Making it too large can adversely impact iteration. -Jeff- [ March 02, 2004: Message edited by: Jeff Langr ]
Looks like I'm in a bit of a tight spot if I want to follow the rules. Since I have 1000 numbers, according the the link the closest primes are: 1031 2053 I think 1031 would be too small, and maybe 2053 too big (wasteful?) Should I just use the no arg constructor and let the API do it's thing? Drew
Joined: May 14, 2003
Try something from here, perhaps around 1300: http://www.utm.edu/research/primes/lists/small/1000.txt In any case, experiment. Run a timing test using the no-arg constructor and see if it makes a difference. In fact, it probably doesn't matter; 1000 numbers is a pretty small amount to insert. Don't sweat it until it's a problem, and be confident knowing that you have the knowledge to quickly improve upon the performance if necessary. -Jeff-