aspose file tools*
The moose likes Programmer Certification (SCJP/OCPJP) and the fly likes Doubt regarding hashCode() contract Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Certification » Programmer Certification (SCJP/OCPJP)
Bookmark "Doubt regarding hashCode() contract" Watch "Doubt regarding hashCode() contract" New topic
Author

Doubt regarding hashCode() contract

Lalit Bansal
Ranch Hand

Joined: Jan 29, 2007
Posts: 63
The hashCode() contract says:
If two objects are equal according to the equals(Object) method, then calling the hashCode() method on each of the two objects must produce the same integer result.

My question is, "Is it necessary that the above statement should always be true?". Consider the following case:

public class MyClass {
private int aVariable;
private int bVariable;

public MyClass(int aVariable, int bVariable){
this.aVariable = aVariable;
this.bVariable = bVariable;
}

public boolean equals(Object o) {
if((o instanceof MyClass) && (((MyClass)o).aVariable
== this.aVariable)) {
return true;
} else {
return false;
}
}

public int hashCode() {
return bVariable * 31;
}

public static void main(String[] args) {
MyClass obj1 = new MyClass(10,12);
MyClass obj2 = new MyClass(10,13);

System.out.println(obj1.equals(obj2));
}
}

In the above example, the equals method shall return true even if there hashcodes are different.
In K&B book, they have mentioned in a statement that "Your hashCode() method should use the same instance variables (as that in the equals() method).". But this statement is nowhere mentioned in the hashCode() contract.
Please explain this difference.

Thanks in advance.
Burkhard Hassel
Ranch Hand

Joined: Aug 25, 2006
Posts: 1274
Howdy!


In K&B book, they have mentioned in a statement that "Your hashCode() method should use the same instance variables (as that in the equals() method).". But this statement is nowhere mentioned in the hashCode() contract.
Please explain this difference.


The difference is:
It is wise to do take the same instance variables for hash. Because it makes the code clear and the hashing faster.
But it is not absolutely required.

You fulfill the contract also when you return always the same int in the hashCode() method.
Then also hashCodes are the same - with the objects being equal or not.
Doing this way you fulfill the contract. Only the hashing is slowly because you cannot rely on hashing to exclude anything (e.g. for not putting two equal objects into a Set), you will have to check the slower equals() for every add.

Normally, if you put something into a HashSet - therefore the name by the way - first a hashing is made, and when the hash is different from all entries in the list, the object will be treated as unequal without really calling the equals() method.

Your MyClass however does not fulfill the contract, both obj1 and obj2 can be stuffed into one set - despite the fact that due to your equals() method, they are meaningfully equal.
If you change that hashCode in your class to return 42; (always the same int) this could not happen.


Yours,
Bu.


all events occur in real time
Jesper de Jong
Java Cowboy
Saloon Keeper

Joined: Aug 16, 2005
Posts: 14435
    
  23

Note, do not confuse these two situations:

1. Objects that are equal but return different hash codes (what Lalit's question was about)

2. Objects that are not equal but return the same hash code.

The contract for hashCode() is: If two objects are equal, then calling hashCode() on both objects must return the same value.

If you break this contract, as Lalit does in his example, then unpredictable things can happen if you store your objects in a collection that uses hash codes such as HashSet or HashMap. So Lalit, to answer your question: "Is it necessary that the above statement should always be true?" - Yes, otherwise you might get strange problems when storing your objects in certain kinds of collections.

Note that the contract does not say anything about case 2 above. So different objects may have the same hash code. However, collections such as HashSet and HashMap will work inefficiently if different objects have the same hash code.


Java Beginners FAQ - JavaRanch SCJP FAQ - The Java Tutorial - Java SE 8 API documentation
Lalit Bansal
Ranch Hand

Joined: Jan 29, 2007
Posts: 63
Jesper and Burkhard,

Thanks for the explanation.

Still a Bee Sting left in my mind.

When we talk about a contract in java (say interface...which we say that it acts as a contract), we are bound to follow it else compiler gives error (in case of classes implementing interfaces and not implementing one or more of its methods throws compiler error).

The hashCode() is also termed as a contract. So if I'm not binding to it, why doesn't the compiler throw an error?
Rahul Bhattacharjee
Ranch Hand

Joined: Nov 29, 2005
Posts: 2308
Originally posted by Lalit Bansal:

The hashCode() is also termed as a contract. So if I'm not binding to it, why doesn't the compiler throw an error?


Its not an interfacial contract .It is just another method that you are trying to override.This is the way the compiler takes it.


Rahul Bhattacharjee
LinkedIn - Blog
Lalit Bansal
Ranch Hand

Joined: Jan 29, 2007
Posts: 63
Rahul,

Thanks for the response.

I think you didn't get the point I wanted to ask.
The contract is saying:
".....if two objects are equal ......hashCode() method .....MUST produce the same integer result".

So, if I'm not binding to it, why isn't compiler doing something for it?
Rajah Nagur
Ranch Hand

Joined: Nov 06, 2002
Posts: 239
Originally posted by Lalit Bansal:

So, if I'm not binding to it, why isn't compiler doing something for it?


This is what the javadoc of the equals method says:
.
.
.
Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.

So, no-where it mandates that you MUST overide hashcode when equals is overidden. Also the compiler works on static code, it can never enforce this rule (also not possible if it wanted to)

If is a good practice to do this so that the results & behaviour of Objects is predictable in collections which rely on hashing.


You can't wake a person who is <b><i>pretending</i></b> to be asleep.<br />Like what <b>"it"</b> does not like - <i> Gurdjieff </i>
Jesper de Jong
Java Cowboy
Saloon Keeper

Joined: Aug 16, 2005
Posts: 14435
    
  23

The words "contract" and "must" do not necessarily mean that the compiler is enforcing it.
Lalit Bansal
Ranch Hand

Joined: Jan 29, 2007
Posts: 63
Thanks everyone.

The moderator may please close the topic now.

regards,
Rahul Bhattacharjee
Ranch Hand

Joined: Nov 29, 2005
Posts: 2308
Originally posted by Lalit Bansal:

The moderator may please close the topic now.


why to close the topic?
vivek srivastava
Ranch Hand

Joined: Nov 30, 2007
Posts: 39
As told by Jesper “If you break this contract, as Lalit does in his example, then unpredictable things can happen if you store your objects in a collection that uses hash codes such as HashSet or HashMap.”.
I am still not able to understand what may go wrong? Please explain.

-Vivek
Jesper de Jong
Java Cowboy
Saloon Keeper

Joined: Aug 16, 2005
Posts: 14435
    
  23

You dug up an old post (from 2007), but ok...

That is because of how collection classes such as HashMap and HashSet work. Those collection classes depend on the fact that the objects that you put into them obey the hash code contract. If your objects do not obey the contract, then what happens if you try to store them in a HashMap or HashSet is undefined - in other words, strange, unpredictable things can happen.

If you want to know all the details, you'll have to study how for example HashMap works internally. You don't need to know the exact details for the SCJP exam. Here is some background information.

In for example a HashSet, the values contained in the set are stored in a set of "buckets". Each of those buckets has a number to identify it. When you put a value into the HashSet, it gets stored into the bucket which is identified by the hash code of the object that you put into it. For example, if I have an object for which hashCode() returns 25 then it gets stored into bucket 25 in the HashSet.

If you later want to check if the HashSet contains a certain object, by calling contains(someObject) on the HashSet, the following happens: The HashSet gets the hash code for someObject. It then looks into the bucket that corresponds with the hash code. If there is nothing in the bucket, then it's done, the HashSet doesn't contain someObject, so it returns false. If there are one or more objects in the bucket, the HashSet has to go through all of them and compare them with someObject, by calling equals() on each pair of someObject and object in the bucket. If equals() returns true for a pair, contains() returns true, otherwise it returns false.

The reason why it works like this, is because it is efficient, especially compared to for example a List. If all the objects in the HashSet would be stored into one big list, it would mean that if you want to check if an object is in the set, you would have to compare the input with all the objects in the list. With the scheme with buckets, you only have to compare the input with the content of one bucket - and any bucket normally contains only a small fraction of all the elements in the set.

It's legal to implement your hashCode() method to always return a fixed value, for example like this:

This hashCode() method conforms to the hash code contract, but is bad, because it kills the efficiency advantage of the mechanism with the buckets. All objects would be stored in bucket 0, and when you check the set, it always has to check all the elements in bucket 0 - the whole content of the set.

If you would write a hashCode() method that breaks the contract, the HashSet will look into the wrong bucket, which means that contains(someObject) may return false, even though the object you're looking for really is in the set (in a different bucket).

Note also that you should never put mutable objects in a HashSet or HashMap. If you put such an object in a HashSet and then change the state of the object so that the hash code changes, the object will be in the wrong bucket in the HashSet.

See also: http://en.wikipedia.org/wiki/Hash_table
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: Doubt regarding hashCode() contract