Statement (1) adds the s1 object to the hash set. When (2) is executed, the set checks whether it already holds an object that is equal to s2. Since s1 and s2 are equal, the call at (2) has no effect on hs: no new object is added. This is why the last two calls to hs.size() return the same value, i.e. 3. [ August 11, 2008: Message edited by: Ankit Garg ]
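The code under discussion is not quoted in this thread, so here is a minimal reconstruction of the situation, not the original program. The class name Item, its name field, and the "a" / "b" / "c" values are all assumptions made up for illustration; only hs, s1, s2 and the markers (1) and (2) come from the post above.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class DuplicateAddDemo {

    // Hypothetical element type; equality is based on the name only.
    static final class Item {
        private final String name;

        Item(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Item)) return false;
            return name.equals(((Item) other).name);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(name);
        }
    }

    // Returns the set size after add (1) and after add (2).
    static int[] sizesAfterAdds() {
        Set<Item> hs = new HashSet<>();
        hs.add(new Item("a"));
        hs.add(new Item("b"));

        Item s1 = new Item("c");
        Item s2 = new Item("c"); // equal to s1, but a different object

        hs.add(s1);              // (1) s1 is added
        int afterFirst = hs.size();
        hs.add(s2);              // (2) no effect: an equal element is already present
        int afterSecond = hs.size();
        return new int[] { afterFirst, afterSecond };
    }

    public static void main(String[] args) {
        int[] sizes = sizesAfterAdds();
        System.out.println(sizes[0] + " " + sizes[1]); // prints "3 3"
    }
}
```

Note that add() also reports what happened: it returns false when the element was already present.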
Hash*** collections use the hashCode() AND equals() methods of objects to determine whether a given object is held within the collection. It is important to know that hashCode() is only a preliminary check. Elements of a hash collection are stored in a kind of 'sub-collection' (bucket), each one selected by a value derived from the hash code. The check whether a given object is held within a collection is performed in two steps:
1) the hashCode of the object is retrieved and the collection checks whether the matching sub-collection holds anything; if not, the object is not in the collection and the check ends
2) otherwise, the matching sub-collection is iterated and equals() is invoked on each member until it returns true (object found) or the end of the sub-collection is reached.
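The two steps above can be sketched with a toy hash set. This is an illustration only, not how java.util.HashSet is actually written (the real class is built on HashMap and resizes itself); the name TinyHashSet and the fixed bucket count are assumptions for this example.

```java
import java.util.ArrayList;
import java.util.List;

class TinyHashSet {
    // Each inner list is one 'sub-collection' (bucket).
    private final List<List<Object>> buckets = new ArrayList<>();

    TinyHashSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Step 1: use hashCode() to pick the bucket.
    private List<Object> bucketFor(Object o) {
        int index = Math.floorMod(o.hashCode(), buckets.size());
        return buckets.get(index);
    }

    // Step 2: walk only that bucket and ask equals() about each member.
    boolean contains(Object o) {
        for (Object candidate : bucketFor(o)) {
            if (candidate.equals(o)) {
                return true;
            }
        }
        return false;
    }

    // Adds the object unless an equal one is already stored.
    boolean add(Object o) {
        if (contains(o)) {
            return false;
        }
        bucketFor(o).add(o);
        return true;
    }
}
```

An empty bucket ends the search immediately, which is the "end of check process" case from step 1.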
I'm going to try my usual thing and break it down to a "real simple" idea of what hashing does for you.
Let's say you have 100 different toys to put away but you want to be able to find them with reasonable speed. If you put all 100 into one bucket then you must search through the entire bucket every time you want a toy. There HAS to be a better way.
So let's say that one day you sit down and decide that the toys are ALL one of 5 colors: red, blue, green, yellow, and orange. So you go and get four more buckets for a total of 5 and you sort all of your toys into the five buckets according to color. NOW when you want a toy you first think, what color is it? Once you have the color you know which bucket to go to, and you only have to search through that one color bucket, which can reduce the time you spend searching for JUST the RIGHT toy.
But note a few things:
a) while many toys MAY have the same color and go in the same color bucket, that does NOT make them the SAME TOY. The toys in that bucket are NOT necessarily the same so SOME searching will be required.
b) as you collect toys and add to your collection it is good to have the toy colors spread out evenly by choosing your colors well. If you do not, and wind up with one blue toy, no yellow, 3 orange toys, 2 green toys, and ALL of the rest red, then you will STILL hit a slowdown at the red bucket, because you STILL have to search for that special toy that you want and there will be ( initially ) 94 red toys, which isn't all that much fewer than the whole 100.
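If it helps to see the toy buckets as code, here is a rough sketch. The class ToyBuckets and its method names are made up for this post. It simply counts how many toys you have to look at inside one color bucket before you find the one you want.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ToyBuckets {
    // One bucket (list of toy names) per color.
    private final Map<String, List<String>> bucketsByColor = new HashMap<>();

    void putAway(String color, String toyName) {
        bucketsByColor.computeIfAbsent(color, c -> new ArrayList<>()).add(toyName);
    }

    // Returns how many toys were examined in the color bucket before the toy
    // was found, or -1 if the toy is not in that bucket at all.
    int toysExamined(String color, String toyName) {
        List<String> bucket = bucketsByColor.getOrDefault(color, Collections.emptyList());
        int index = bucket.indexOf(toyName);
        return index < 0 ? -1 : index + 1;
    }
}
```

With 94 toys in the red bucket, finding the last red toy means examining all 94, while the lone blue toy is found on the first look. That is point b) in numbers.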
If we start dragging this towards Java hashcodes, all that the hashcode winds up being is an integer value (not necessarily a unique one) that determines which BUCKET this particular object goes into.
Since all that is required is an integer, it is actually legal to return the same integer from every hashCode() call. But it is not the best idea, because it means that all of your objects end up in one bucket.
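A quick sketch of the "same integer for everything" case, just to show that it is legal and still gives correct answers. The Toy class here is hypothetical. The set has to fall back on equals() for every lookup because every element shares one bucket.

```java
import java.util.HashSet;
import java.util.Set;

class ConstantHashDemo {

    // Hypothetical class whose hashCode() is the same for every instance.
    static final class Toy {
        private final String name;

        Toy(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Toy && name.equals(((Toy) other).name);
        }

        @Override
        public int hashCode() {
            return 1; // legal, but every Toy lands in the same bucket
        }
    }

    // Adds two distinct toys plus one duplicate and returns the set size.
    static int distinctToys() {
        Set<Toy> toys = new HashSet<>();
        toys.add(new Toy("car"));
        toys.add(new Toy("ball"));
        toys.add(new Toy("car")); // duplicate, detected by equals()
        return toys.size();
    }
}
```

The result is still right (two distinct toys); only the speed suffers as the single bucket grows.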
At the same time, you do not have to work at making hashCodes TOTALLY unique. The collection squeezes each hashCode down to one of its own buckets anyway, so objects with different hashCodes can still share a bucket; a good, even spread matters more than perfect uniqueness.
The hashCode should depend on the attributes of the particular object that you are storing, meaning the variables that you choose to use in your hashCode method. You can make it a simple calculation such as adding the variables or adding the hashCodes of the variables involved. If you would like to see 10 buckets, take the "remainder" or "modulo" of the calculated value ( calculatedValue % 10 ), though Java's hash collections pick their own number of buckets and do this reduction for you.
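As a rough sketch of that calculation (the class and method names are invented for this example): add up the hashCodes of the variables, then reduce the result to a bucket number. One assumption worth flagging is the use of Math.floorMod instead of the plain % operator, because % can give a negative result when the hash is negative.

```java
class HashFromFields {

    // Adds the hash codes of the fields involved, as described above.
    static int simpleHash(String name, String color) {
        return name.hashCode() + color.hashCode();
    }

    // Maps a hash code to one of bucketCount buckets.
    // floorMod keeps the result in the range 0 .. bucketCount - 1,
    // even when the hash code is negative.
    static int bucketIndex(int hash, int bucketCount) {
        return Math.floorMod(hash, bucketCount);
    }
}
```

Plain addition is about the simplest thing that works; it does mean that swapping the two values gives the same hash, which is one reason people often multiply by a small prime between fields.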
I understand that, if you are intending to override equals(), the hashCode should be calculated from the SAME variables that are used to determine the equality of the objects. That way the same data controls which hash bucket each object falls into as well as the result of the search itself.
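Here is a minimal sketch of equals() and hashCode() driven by the same variables. The class ColoredToy and its fields are hypothetical; timesPlayed is deliberately left out of both methods to show that a variable ignored by equals() should be ignored by hashCode() as well.

```java
import java.util.Objects;

class ColoredToy {
    private final String name;
    private final String color;
    private final int timesPlayed; // not part of equality, so not part of the hash

    ColoredToy(String name, String color, int timesPlayed) {
        this.name = name;
        this.color = color;
        this.timesPlayed = timesPlayed;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof ColoredToy)) return false;
        ColoredToy that = (ColoredToy) other;
        return name.equals(that.name) && color.equals(that.color);
    }

    @Override
    public int hashCode() {
        // Same two variables as equals(), so equal toys share a bucket.
        return Objects.hash(name, color);
    }
}
```

Two toys with the same name and color then compare equal and hash alike, whatever their timesPlayed value, so a HashSet holding one of them will find the other.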
I hope that this approaches what you wanted to know.
SCJP - 86% - June 11, 2009