GeeCON Prague 2014*
The moose likes Java in General and the fly likes HashMap collisions question Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


JavaRanch » Java Forums » Java » Java in General
Bookmark "HashMap collisions question" Watch "HashMap collisions question" New topic
Author

HashMap collisions question

Nikolay Gudovich
Greenhorn

Joined: Jan 11, 2005
Posts: 12
hi everybody!
I have a quick question about HashMap. Even for the best hashing algorithm we should probably consider the case when for two distinct objects of the same type hashCode() method returns same integer number. So my understanding is that calling MyHashMap.put(String term, Object myLovelyObject) does not guarantee that for two different Strings we will get two different locations in our HashMap (collision may occur). So given that all strings in my program are unique, do we still have to manually check the return value of MyHashMap.put(...) method to be a null value, just to make sure we don't accidentally dump anything? I am a little confused. Please help me.

Thanks a lot.
Jeanne Boyarsky
author & internet detective
Marshal

Joined: May 26, 2003
Posts: 30596
    
154

Nikolay,
Welcome to JavaRanch!

A collision will not cause anything to get dumped. However, storing something with the same key will. Suppost I make the following calls:

Also, assume I have a really bad hashing algorithm that always maps to the same hash. At the end of the code, I will have to values in my map. key1/value2 and key2/value3. They will be stored under the same hash and not give me increased performance (over a list), but they will both be stored.


[Blog] [JavaRanch FAQ] [How To Ask Questions The Smart Way] [Book Promos]
Blogging on Certs: SCEA Part 1, Part 2 & 3, Core Spring 3, OCAJP, OCPJP beta, TOGAF part 1 and part 2
Nikolay Gudovich
Greenhorn

Joined: Jan 11, 2005
Posts: 12
and I am glad to become part of JavaRanch community thank you.

Well, this means we can disregard the collision issue all together. ( I was suspecting that - otherwise that's too much low-level detail for a Java programmer ) However, I am also a little interested how HashMap implements the collision case internally. You may say "go diggin' source code for HashMap" but I still hope somebody could give me a clue.
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24187
    
  34

A hash table (Java's, or any other) is basically an array of lists.



The horizontal lines are the cells of the array. Each cell is called a "bucket". The hashcode of your key is used to choose a bucket by computing an index into the main array (using shifts and/or the modulo operator, generally).

Once the map has chosen a bucket, it stores the [key, value] pair into that bucket (in the case of java.util.Map, as a Map.Entry object). If there's already something in the bucket, that's fine; the new entry is just added to that bucket's list.

To look up an item from the key, you compute the bucket index from the hash code, and then search the list in that bucket for the key. In a lightly loaded map with a good hash function, most buckets will have 0 or 1 item in them, so the search takes constant time. If you have a bad hash function or an overloaded map, this searching can come to take a long time. In the diagram I drew, some buckets are empty, while others have 1, 2, or 3 pairs in them. This is typical.

There are various fancy-pants ways of improving this general scheme, but that's how it works in broad strokes.

Bob Sedgewick's "Algorithms" books are worth owning; you'll learn this kind of fundamental computer science from them.


[Jess in Action][AskingGoodQuestions]
Nikolay Gudovich
Greenhorn

Joined: Jan 11, 2005
Posts: 12
that's me again.
well I did go to source code (HashMap.java). It was scary first but after about 10 mins I knew all the internals of HashMap. It uses an array of Entry references and each Entry is actually a linked list node. So actual Hash table contains reference to head node and every node has a link to next node. When we have a collision it stores new Object in a new node of the Linked List. When we call MyHashMap.get(Object key) it does not just go to cell of the table but it goes through the linked list and compares passed key Object and stored Object using .equals(key) method. Well, this again proves I shouldn't be scared of source code

Anyway, thanks for help and I will definitely continue with JavaRanch and Head First books!!!
 
GeeCON Prague 2014
 
subject: HashMap collisions question