I've seen that question before somewhere as well. It's not phrased very well, but basically this is what the statement means: given the hashcodes of two objects, you can sometimes draw a definite conclusion about whether the objects are different, but you can never draw a definite conclusion about whether they are equal.
My thoughts on this: unequal objects can have the same hashcode. All a hashcode does is determine which "bucket" an item belongs in. But if two objects are equal, then their hashcodes must be the same.
SCJP - 86% - June 11, 2009
This is what the JDK documentation states for the hashCode method of the class Object:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
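To make the second rule concrete, here is a small sketch using java.lang.String. The class name HashCollisionDemo and the helper sameHash are made up for this illustration; the two strings are a well-known pair that collide under String's hash function.

```java
class HashCollisionDemo {
    // Hypothetical helper: true when both strings report the same hash code.
    static boolean sameHash(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        String a = "Aa";
        String b = "BB";
        // 'A'*31 + 'a' = 65*31 + 97 = 2112 and 'B'*31 + 'B' = 66*31 + 66 = 2112
        System.out.println(a.hashCode()); // prints 2112
        System.out.println(b.hashCode()); // prints 2112
        System.out.println(a.equals(b));  // prints false
    }
}
```

So equal hash codes tell you nothing definite about equality, while different hash codes (for a class that honours the contract) do tell you the objects are unequal.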
When objects are very complex, the equals method usually has to compare many of the fields the object owns. This can be too time-consuming.
Hash codes are used as a shortcut. A hashCode() implementation typically reads only some of the fields of an object and calculates an int value out of them, frequently by doing some XOR operations. Because hashCode() looks at only some essential fields while equals() compares all of them, the hashCode() method should be faster than equals().
If a class is designed that way, hash codes save valuable time, for example in the hashing process when using a set.
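As a rough sketch of that design, the hypothetical Customer class below (all names invented for this example) compares every field in equals() but derives its hash code from only two of them with an XOR. Equal objects still get equal hash codes, because every field used by hashCode() is also used by equals().

```java
final class Customer {
    private final String name;
    private final int birthYear;
    private final String address;

    Customer(String name, int birthYear, String address) {
        this.name = name;
        this.birthYear = birthYear;
        this.address = address;
    }

    // equals() compares all fields.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Customer)) return false;
        Customer c = (Customer) o;
        return birthYear == c.birthYear
                && name.equals(c.name)
                && address.equals(c.address);
    }

    // hashCode() uses only name and birthYear, combined with XOR.
    @Override
    public int hashCode() {
        return name.hashCode() ^ birthYear;
    }
}
```

Two customers that differ only in address end up with the same hash code but are not equal, which the contract allows.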
A set must not contain two equal objects. Instead of invoking the equals method against every element each time an object is added to the set, hashCode() is invoked first. When the hash codes returned are different, the objects have to be different as well, so the (more time-consuming) equals() method is not performed at all. Only if the hash codes are the same is the equals method invoked, doing the comparisons of all the fields needed to decide whether the object has the same state as an object already contained in the set.
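The "hash code first, equals only on a match" idea can be sketched like this. It is a simplification and not the actual java.util.HashSet code (which also works with buckets); CountingKey and HashFirstCheck are hypothetical names used only to show that equals() is skipped when the hash codes differ.

```java
class CountingKey {
    static int equalsCalls = 0;   // counts how often equals() runs
    private final int hash;

    CountingKey(int hash) {
        this.hash = hash;
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public boolean equals(Object o) {
        equalsCalls++;
        return o instanceof CountingKey && ((CountingKey) o).hash == hash;
    }
}

class HashFirstCheck {
    // Cheap hash comparison first; && short-circuits, so equals()
    // only runs when the hash codes are the same.
    static boolean sameElement(Object candidate, Object existing) {
        return candidate.hashCode() == existing.hashCode()
                && candidate.equals(existing);
    }
}
```

Calling sameElement with keys whose hash codes differ returns false without ever touching equals(); with matching hash codes, equals() is called exactly once.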
Some demo code:
The first class treats every object as different, in spite of the fact that its equals method returns true in every case. This class violates the hash code contract. As a result, two equal objects are stored in a set that should not contain two equal objects. And you can see that the equals method is not invoked in this case, because nothing is printed from the equals method body.
The second class does not violate the contract (equal objects return the same hash code). Therefore only one of these objects will be stored in the set. And you can see that this time the equals method runs as well.
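The listing itself is not shown in this post, so what follows is only a reconstruction based on the description above and on the class names that appear in the output. The constant hash code 7 and the wrapper class HashContractDemo with its two helper methods are assumptions.

```java
import java.util.HashSet;
import java.util.Set;

// Violates the contract: equals() always returns true, but hashCode()
// is inherited from Object, so two instances almost certainly differ.
class TrueEqualDifferentHash {
    @Override
    public boolean equals(Object other) {
        System.out.println("equals test performed on " + this + " with " + other);
        return true;
    }
}

// Honours the contract: all instances are equal and share one hash code.
class TrueEqualSameHash {
    @Override
    public boolean equals(Object other) {
        System.out.println("equals test performed on " + this + " with " + other);
        return true;
    }

    @Override
    public int hashCode() {
        return 7; // assumed constant, matching the "@7" in the output
    }
}

class HashContractDemo {
    static Set<Object> brokenSet() {
        Set<Object> set = new HashSet<>();
        set.add(new TrueEqualDifferentHash());
        set.add(new TrueEqualDifferentHash());
        return set;
    }

    static Set<Object> correctSet() {
        Set<Object> set = new HashSet<>();
        set.add(new TrueEqualSameHash());
        set.add(new TrueEqualSameHash());
        return set;
    }

    public static void main(String[] args) {
        System.out.println(brokenSet());
        System.out.println("----------");
        System.out.println(correctSet());
    }
}
```

The hex values printed for the first class come from the identity hash code and will vary from run to run.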
Output of the code is:
[TrueEqualDifferentHash@1004901, TrueEqualDifferentHash@1e63e3d]
----------
equals test performed on TrueEqualSameHash@7 with TrueEqualSameHash@7
[TrueEqualSameHash@7]
The two objects in the first set look different because their hash codes are different, but if you invoke equals() directly on them, true is returned. By the way, there is never any output from the equals method when the first object is stored in a set, because there is nothing to compare it with yet.