| Author |
Generics and Collections: Duplicates in HashSet allowed?
|
Thomas Loder
Greenhorn
Joined: Oct 07, 2007
Posts: 2
|
|
Hello, I am confused. According to the API, a Set is "a collection that contains no duplicate elements. More formally, sets contain no pair of elements e1 and e2 such that e1.equals(e2)...". In my code, however, I am able to add duplicates even though equals() keeps telling me that the objects in question (p1 and p2) are equal. In my experience, duplicates are refused only if they are Strings or Integers (without using generic types). What am I missing here? I appreciate any help. Cheers, Thomas [ October 07, 2007: Message edited by: Thomas Loder ]
|
 |
Jeanne Boyarsky
internet detective
Marshal
Joined: May 26, 2003
Posts: 26216
|
|
Thomas, welcome to JavaRanch! There are two reasons this code will not work as expected:
1) Strings should be compared using equals(), not ==. You are lucky that == works in this particular example, since the Strings are defined as constants. If one of them had been created through substring(), the == comparison inside your equals() method would make it return false.
2) According to the Java specification, whenever you override equals() you need to override hashCode() too. HashSet uses hashCode() rather than the equals() to check for equality. It does this because it needs the hash code (a number representing the object) to determine where in the set to store that object. You can verify this by setting a breakpoint in equals() or adding a System.out.println there.
A barebones hashCode() method based only on name will work if name is guaranteed not to be null. (If you are doing more than just testing behavior, it is advisable to add the null check.) JavaDoc snippet on this:
Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.
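The code from the original posts was not preserved, so the following is only a minimal sketch of the two points above. The class names BrokenPerson and FixedPerson and the helper methods are hypothetical; only the field name comes from the thread. It contrasts a class that overrides equals() alone with one that also overrides hashCode(), and shows the == versus equals() difference for a String built with substring().

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical reconstruction: class and method names are illustrative only.
class HashSetDuplicatesDemo {

    // Overrides equals() but inherits the identity-based hashCode() from Object.
    static class BrokenPerson {
        final String name;

        BrokenPerson(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            // Compare the names with equals(), not ==.
            return o instanceof BrokenPerson
                    && Objects.equals(name, ((BrokenPerson) o).name);
        }
    }

    // Overrides both methods, so equal objects report equal hash codes.
    static class FixedPerson {
        final String name;

        FixedPerson(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            return o instanceof FixedPerson
                    && Objects.equals(name, ((FixedPerson) o).name);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(name); // null-safe; returns 0 for a null name
        }
    }

    // Size of a HashSet after adding two equal BrokenPerson objects.
    // In practice this is 2, because the inherited hash codes almost always differ.
    static int brokenSetSize() {
        Set<BrokenPerson> set = new HashSet<>();
        set.add(new BrokenPerson("Thomas"));
        set.add(new BrokenPerson("Thomas"));
        return set.size();
    }

    // Size of a HashSet after adding two equal FixedPerson objects.
    static int fixedSetSize() {
        Set<FixedPerson> set = new HashSet<>();
        set.add(new FixedPerson("Thomas"));
        set.add(new FixedPerson("Thomas"));
        return set.size();
    }

    public static void main(String[] args) {
        String built = "xThomas".substring(1);      // a new String object created at run time
        System.out.println("Thomas" == built);      // reference comparison
        System.out.println("Thomas".equals(built)); // content comparison
        System.out.println(brokenSetSize());
        System.out.println(fixedSetSize());
    }
}
```

With hashCode() overridden consistently, the second add() finds the first element in the same bucket, equals() confirms the match, and the duplicate is rejected.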
|
|
 |
Ilja Preuss
author
Sheriff
Joined: Jul 11, 2001
Posts: 14112
|
|
Originally posted by Jeanne Boyarsky: HashSet uses hashCode() rather than the equals() to check for equality.
Small correction: it uses *both*. hashCode() comes first, to find the right bucket, but then it still needs to use equals(), because the hash code is allowed to be ambiguous: two objects that are not equal may share the same hash code.
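A small sketch of why equals() is still needed after the bucket lookup. The Strings "Aa" and "BB" are a well-known pair with the same hash code but different contents; the class and method names here are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class: two unequal Strings that share one hash code.
class HashCollisionDemo {

    // "Aa" and "BB" both hash to 2112 under String.hashCode().
    static boolean sameHashCode() {
        return "Aa".hashCode() == "BB".hashCode();
    }

    // Both Strings land in the same bucket, but equals() tells them apart,
    // so the set keeps both.
    static int sizeAfterAddingBoth() {
        Set<String> set = new HashSet<>();
        set.add("Aa");
        set.add("BB");
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sameHashCode());        // true
        System.out.println("Aa".equals("BB"));     // false
        System.out.println(sizeAfterAddingBoth()); // 2
    }
}
```

If HashSet relied on hashCode() alone, the second String would be wrongly rejected as a duplicate.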
|
|
 |
Jeanne Boyarsky
internet detective
Marshal
Joined: May 26, 2003
Posts: 26216
|
|
Originally posted by Ilja Preuss: Small correction: it uses *both* - hashCode() first, to find the right bucket, but then it still needs to use equals(), because the hashcode is allowed to be ambiguous.
Thanks, Ilja. I had tried it with values that didn't even have the same hash code, so the lookup never got as far as calling equals().
|
 |
Thomas Loder
Greenhorn
Joined: Oct 07, 2007
Posts: 2
|
|
Thanks for shedding some light
|
 |
Bob Ruth
Ranch Hand
Joined: Jun 04, 2007
Posts: 318
|
|
I've just been over that in the K&B. If you do not override hashCode(), then the hashCode() that you inherit from Object is, from what they say, just about guaranteed to be unique to that specific instance. In other words, you could instantiate two objects using the exact same initialization data and they would still have different hash codes. That would probably account for your ability to add (seemingly) duplicate objects.
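A brief sketch of that inherited behavior, using a hypothetical Label class that overrides equals() but not hashCode(). System.identityHashCode() returns the value that Object's default hashCode() would return, so comparing the two shows that the hash code is tied to the instance and not to its data.

```java
// Hypothetical class names, for illustration only.
class DefaultHashCodeDemo {

    // equals() compares data, but hashCode() is still the one inherited from Object.
    static class Label {
        final String text;

        Label(String text) { this.text = text; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Label && text.equals(((Label) o).text);
        }
    }

    public static void main(String[] args) {
        Label a = new Label("same");
        Label b = new Label("same");
        System.out.println(a.equals(b));                                // true
        System.out.println(a.hashCode() == System.identityHashCode(a)); // true
        // Almost always false, although the JVM does not strictly guarantee it.
        System.out.println(a.hashCode() == b.hashCode());
    }
}
```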
|
|
 |
Jay Sinha
Greenhorn
Joined: Jun 03, 2004
Posts: 2
|
|
One additional point: hashCode() matters only for HashSet and LinkedHashSet, i.e. where hashing is used. A TreeSet, by contrast, performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of such a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.
Summary:
- HashSet and LinkedHashSet first use hashCode() to find the bucket and then use equals() within that bucket.
- TreeSet depends on the return value of compareTo (or compare). If it returns zero, the element is treated as a duplicate and is not inserted; otherwise it is inserted.
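The difference in the summary above can be sketched with a case-insensitive ordering. The class and method names are hypothetical; String.CASE_INSENSITIVE_ORDER is the standard JDK comparator, chosen here only because it is inconsistent with equals().

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class comparing duplicate detection in TreeSet and HashSet.
class TreeSetDuplicatesDemo {

    // The comparator returns 0 for "java" vs "JAVA", so the TreeSet treats
    // the second String as a duplicate and does not insert it.
    static int treeSetSize() {
        Set<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        set.add("java");
        set.add("JAVA");
        return set.size();
    }

    // HashSet relies on hashCode() and equals(); the two Strings are not
    // equal, so both are kept.
    static int hashSetSize() {
        Set<String> set = new HashSet<>();
        set.add("java");
        set.add("JAVA");
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(treeSetSize()); // 1
        System.out.println(hashSetSize()); // 2
    }
}
```

The TreeSet never calls equals() or hashCode() on its elements here; only the comparator decides what counts as a duplicate.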
Cheers !!!
Jay Sinha
|
|
 |
 |
|
|
|
|
|