aspose file tools*
The moose likes Programmer Certification (SCJP/OCPJP) and the fly likes Hashing Question Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Certification » Programmer Certification (SCJP/OCPJP)
Bookmark "Hashing Question" Watch "Hashing Question" New topic
Author

Hashing Question

Pritish Chakraborty
Ranch Hand

Joined: Jun 12, 2012
Posts: 91

Hello all.

I have a little doubt regarding the efficiency of this hashCode() implementation -:



Yes, I know that it is a legitimate implementation that is bound by the contract.

However, note how equals() declares objects y and a equal because they have the same x-value. If I use y and a to map two different items, will they not be placed in the same bucket?

Does that not make this implementation slightly inefficient? I think we would prefer it if we get different hashcodes as much as possible...I just want to confirm this, because I'm revising and this keeps bothering me.

PS : This was a simple example from the book, of course it can be inefficient, and this is just my own doubt.


OCJP 6
Bharat Kasodariya
Ranch Hand

Joined: Aug 19, 2011
Posts: 36
Normally, hashcode and equal are used to implement own equal rules. unique value field is used in hashcode calculation. hashcode is used in hashing api like hashmap.
As per equal and hashcode contract if two object are equal then their hashcode must be same and they will go in same bucket.

In real time use x should be unique in your case

for your reference
http://www.technofundo.com/tech/java/equalhash.html
Pritish Chakraborty
Ranch Hand

Joined: Jun 12, 2012
Posts: 91

How come I never thought of that?? Of course, a Set or Map implementation with the hashCode always asks for no duplicates...

Sometimes the solutions are right under my nose, yet they never occur to me...lol.

Thanks.
vikky agrawal
Ranch Hand

Joined: Dec 18, 2008
Posts: 65


There is some discrepancy in the code. I think correct implementation should be like this:

Correct me if I am wrong.

Now coming to the point of hashing, according to this implementation the code will always return same hash value, if we supply similar integers inside this function call, hence all those entries will hash to the same bucket.
and based on the specification if two objects are equal according to equals method then hashCode must return true. Although this implementation will separate the objects having different value but still is inefficient as for large values supplied to the method the hashCode will return larger values and either we need large bucket or any other method (eg modulus %) to identify correct bucket. Hence a lot of values will coincide in the same bucket. And finding for equality will be inefficient if hashCode is inefficiently implemented.

Its best to use quadratic probing while implementing hash functions:
source: http://en.wikipedia.org/wiki/Quadratic_probing
Hope this helps to resolve your doubt.

scjp 6
http://algorithmsea.blogspot.com
Anayonkar Shivalkar
Bartender

Joined: Dec 08, 2010
Posts: 1512
    
    5

vikky agrawal wrote:Correct me if I am wrong

If you want to override Object class' hashCode method, then it should not accept any parameter (e.g. int x) - otherwise, it would be overloading.
So, correct way to write hashCode method is:

Going further, it will break the hashCode-equals contract (because, since you've not overriding Object's hashCode method, your hashCode method will never be called by get/put method of Map etc.).

That is why, it is necessary to use correct signatures for hashCode and equals method (e.g. equals method accepts Object as an argument, but lot of time, it happens that people provide current class' object as an argument - i.e. object of HasHash).

I hope this helps.

Regards,
Anayonkar Shivalkar (SCJP, SCWCD, OCMJD, OCEEJBD)
Seetharaman Venkatasamy
Ranch Hand

Joined: Jan 28, 2008
Posts: 5575

vikky agrawal wrote:
There is some discrepancy in the code. I think correct implementation should be like this:


If you use java 1.6
You can always check you are using correct method signature by using @Override annotation
vikky agrawal
Ranch Hand

Joined: Dec 18, 2008
Posts: 65
@Anayonkar Shivalkar and Seetharaman Venkatasamy

Thanks for correcting.. I overlooked the method signature and didn't paid thorough attention, as I was trying to explain mainly Hashing rather than hashCode signature.

Vikky
Don Redd
Ranch Hand

Joined: Jan 05, 2012
Posts: 82

I think, the implementation of hashCode() given above is not performance friendly. the main goal of hashing it to map, lets say m elements to n buckets. where m >>>>>>n;

Above implementation enforces n to be equals to m, also each bucket contains at most of one element. Search, insert, delete functionalists of Hash Table are equals to O(1) normally. Above implementation can take more than constant time for basic operations mentioned before.


Regards,
Don,Redd.
Anayonkar Shivalkar
Bartender

Joined: Dec 08, 2010
Posts: 1512
    
    5

Don Redd wrote:I think, the implementation of hashCode() given above is not performance friendly. the main goal of hashing it to map, lets say m elements to n buckets. where m >>>>>>n;

I don't think so. I would say the best hashing function would be the one which maps m elements to n buckets, where m is approximately equal to n.
That way, equals method would have a very less load.
If m >> n, then one bucket would contain too many elements, and equals method would have to do a lot of work.

I hope this helps.
Don Redd
Ranch Hand

Joined: Jan 05, 2012
Posts: 82

if m==n, then finding the right bucket will take time, even though equals() is O(1). So,What I meant is, we have to implement hash function in a way that 1) finding right bucket and 2)comparing each element in bucket, are load balanced.
Anayonkar Shivalkar
Bartender

Joined: Dec 08, 2010
Posts: 1512
    
    5

Don Redd wrote: if m==n, then finding the right bucket will take time, even though equals() is O(1). So,What I meant is, we have to implement hash function in a way that 1) finding right bucket and 2)comparing each element in bucket, are load balanced.

No, it doesn't have to search for bucket itself.

The normal flow is:
hashCode method returns the hashCode. The hashCode itself is treated as id of the bucket, and that bucket is directly accessed.
Further, if the bucket is containing more than one element, all the elements are checked for equals method.

So, finding the bucket itself is not much of a hassle. The real performance drop happens when one bucket contains too many elements.
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: Hashing Question