This week's book giveaway is in the OO, Patterns, UML and Refactoring forum. We're giving away four copies of Refactoring for Software Design Smells: Managing Technical Debt and have Girish Suryanarayana, Ganesh Samarthyam & Tushar Sharma on-line! See this thread for details.
What exactly is a HashSet? I understand it is an unordered Set, but what does it have to do with the hashCode() method? In K&B p. 542 it says "the more efficient your hashCode() implementation the better access performance you'll get". So the hashcode is used somehow for storage and retrieval.
I guess a corollary to this question is how is a HashSet stored? Is it stored as a tree sorted by hashcode?
Go back a few pages ( 529-525 ). I think that will help answer a lot of your questions.
Joined: Nov 15, 2006
I looked at the section you mentioned in K&B. It talks about hashcodes. I understand hashcodes and I understand sets. I just don't understand what hashcodes have to do with sets. This is not explained explicitly in K&B.
But anyway back to the question. I understand how hashcodes and maps work together. This is the obvious use of hashcodes to store and retrieve information with good performance.
The not so obvious one is sets. Sets are collections of unique objects. A TreeSet makes sense to me. Since the tree is maintained in order, there can be a search for an element and if it matches then it isn't added to the TreeSet.
A HashSet doesn't say how the set is stored which seems important. Is it an array or linked list or a tree? I don't know. It just feels very fuzzy to me. Does anyone have any more concrete information regarding HashSets?
So your saying HashSet is really using HashMap under the covers. So does this mean that the [key, value] pair is a same to same relationship. For example in the case of a set of Strings:
Key Value "apple" "apple" "grape" "grape" "cherry" "cherry"
Is that the way it works?
HashSet internally uses HashMap. To be more precise this is how it is maintained inside an HashMap object:
So when you add an element to HashMap using add(E o), this method internally calls map.put(o, PRESENT) method. And here PRESENT is an Object reference. Which goes like this:
So when you store "apple", "grape", "cherry" in your HashSet it gets stored in HashMap as
KEY VALUE -------------- "apple" PRESENT "grape" PRESENT "cherry" PRESENT --------------
SCJP 5.0 77%
Joined: Nov 15, 2006
Thank you very much. That's very interesting. So PRESENT is a constant object. I assume at retrieval time if Object == PRESENT, then the Key is returned as the Value.
Is that right? Anyway, you've answered my original question. The rest is just implementation details. Still it's very interesting to see how it's implemented. It's great that Sun releases the source code unlike Microsoft. Actually if you've ever seen any MS source code (they do release some), it's unbelievably convoluted. Sun sources seem to be so much clearer and easier to read. Just a different culture I guess. Go Java!