hi everybody! I have a quick question about HashMap. Even for the best hashing algorithm we should probably consider the case when for two distinct objects of the same type hashCode() method returns same integer number. So my understanding is that calling MyHashMap.put(String term, Object myLovelyObject) does not guarantee that for two different Strings we will get two different locations in our HashMap (collision may occur). So given that all strings in my program are unique, do we still have to manually check the return value of MyHashMap.put(...) method to be a null value, just to make sure we don't accidentally dump anything? I am a little confused. Please help me.
A collision will not cause anything to get dumped. However, storing something with the same key will. Suppost I make the following calls:
Also, assume I have a really bad hashing algorithm that always maps to the same hash. At the end of the code, I will have to values in my map. key1/value2 and key2/value3. They will be stored under the same hash and not give me increased performance (over a list), but they will both be stored.
and I am glad to become part of JavaRanch community thank you.
Well, this means we can disregard the collision issue all together. ( I was suspecting that - otherwise that's too much low-level detail for a Java programmer ) However, I am also a little interested how HashMap implements the collision case internally. You may say "go diggin' source code for HashMap" but I still hope somebody could give me a clue.
A hash table (Java's, or any other) is basically an array of lists.
The horizontal lines are the cells of the array. Each cell is called a "bucket". The hashcode of your key is used to choose a bucket by computing an index into the main array (using shifts and/or the modulo operator, generally).
Once the map has chosen a bucket, it stores the [key, value] pair into that bucket (in the case of java.util.Map, as a Map.Entry object). If there's already something in the bucket, that's fine; the new entry is just added to that bucket's list.
To look up an item from the key, you compute the bucket index from the hash code, and then search the list in that bucket for the key. In a lightly loaded map with a good hash function, most buckets will have 0 or 1 item in them, so the search takes constant time. If you have a bad hash function or an overloaded map, this searching can come to take a long time. In the diagram I drew, some buckets are empty, while others have 1, 2, or 3 pairs in them. This is typical.
There are various fancy-pants ways of improving this general scheme, but that's how it works in broad strokes.
Bob Sedgewick's "Algorithms" books are worth owning; you'll learn this kind of fundamental computer science from them.
that's me again. well I did go to source code (HashMap.java). It was scary first but after about 10 mins I knew all the internals of HashMap. It uses an array of Entry references and each Entry is actually a linked list node. So actual Hash table contains reference to head node and every node has a link to next node. When we have a collision it stores new Object in a new node of the Linked List. When we call MyHashMap.get(Object key) it does not just go to cell of the table but it goes through the linked list and compares passed key Object and stored Object using .equals(key) method. Well, this again proves I shouldn't be scared of source code
Anyway, thanks for help and I will definitely continue with JavaRanch and Head First books!!!