kri shan wrote:Can I avoid collisions with the default load factor of 0.75 by setting the right initial capacity (number of slots)?
No, you cannot. A collision is a collision is a collision. If there are more possible instances than possible hash code values, there MUST be collisions (if there are 367 people in one room, at least two of them must share a birthday). HashMap can handle them because it first checks hashCode() and, if there is a collision, then checks equals().
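For illustration, here is a minimal sketch (the CollidingKey class is hypothetical, written just for this example) of two distinct keys that deliberately share a hash code; the HashMap still keeps them apart because it falls back on equals():

import java.util.HashMap;
import java.util.Map;

// Hypothetical key class whose hashCode() deliberately collides for different
// values, to show that HashMap still keeps the entries separate via equals().
public final class CollidingKey {
    private final String name;

    CollidingKey(String name) {
        this.name = name;
    }

    @Override
    public int hashCode() {
        return 42;   // every instance collides on purpose
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CollidingKey && ((CollidingKey) o).name.equals(name);
    }

    public static void main(String[] args) {
        Map<CollidingKey, String> map = new HashMap<>();
        map.put(new CollidingKey("a"), "first");
        map.put(new CollidingKey("b"), "second");

        // Both keys land in the same bucket, but equals() keeps them apart.
        System.out.println(map.size());                       // prints 2
        System.out.println(map.get(new CollidingKey("a")));   // prints first
        System.out.println(map.get(new CollidingKey("b")));   // prints second
    }
}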
I think what Christian means to say is that in general it's not possible to completely avoid collisions. For any data type with more than four bytes of data, collisions will be possible. However, it's still possible to avoid them in practice, meaning minimize their chance of occurrence, by using a well-chosen hash function.
Stephan van Hulst wrote:Load factor and capacity have nothing to do with collisions. Collisions are purely affected by the quality of the hashCode() method.
I disagree. There are two types of collisions:
1. Two objects have the same hash code, and the HashMap needs to check the equals() method to see if they are the same.
2. Two objects have different hash codes, but end up in the same bucket anyway (because the current capacity is less than Integer.MAX_VALUE). In this case, the HashMap just needs to check the actual hashCode() of each object (which was saved when the object was put in the Map) to see if they are the same.
The first type of collision is determined only by hash code, sure. But the second certainly is affected by the capacity, which is affected by the load factor.
In general, setting a lower load factor will decrease collisions (of the second type, which is much more common) at the cost of using more memory. It's a trade-off.
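To make that concrete, here is a rough simulation (not HashMap's actual code) that assumes a power-of-two table indexed with hash & (capacity - 1): with the same set of distinct hash codes, a larger table, which is what a lower load factor effectively buys you, leaves fewer entries sharing a bucket.

import java.util.Random;

// A small simulation of the second type of collision: distinct hash codes
// that land in the same bucket. The bucket index is computed the way a
// power-of-two table would compute it: hash & (capacity - 1).
public class BucketCollisionSimulation {

    static long countSharedBucketEntries(int[] hashes, int capacity) {
        int[] bucketCounts = new int[capacity];
        long shared = 0;
        for (int h : hashes) {
            int index = h & (capacity - 1);
            if (bucketCounts[index]++ > 0) {
                shared++;   // this entry shares a bucket with an earlier one
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        Random random = new Random(42);
        int[] hashes = new int[1_000];
        for (int i = 0; i < hashes.length; i++) {
            hashes[i] = random.nextInt();   // effectively distinct hash codes
        }

        // A bigger table (lower effective load factor) means fewer shared
        // buckets, at the cost of a larger backing array.
        System.out.println("capacity  1024: " + countSharedBucketEntries(hashes, 1024));
        System.out.println("capacity  4096: " + countSharedBucketEntries(hashes, 4096));
        System.out.println("capacity 16384: " + countSharedBucketEntries(hashes, 16384));
    }
}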
kri shan wrote:if two objects have different hash codes, but end up in the same bucket anyway (because the current capacity is less than Integer.MAX_VALUE).
I do not think so.
If two objects have different hash codes, then they should be placed in different hash buckets.
Well, no. There are 2^32 different hash codes possible. If you require different hash codes to go into different buckets then you would have to have 2^32 buckets. Obviously that doesn't make any sense at all.
If you read the code, you find the original hash code is rehashed, and then AND-ed with capacity - 1. That means in a capacity 16 Map, you execute rehash & 0x000f; in an array-based implementation, you use that result to get the index in the array where the bucket is. That means any rehashes whose 4 least significant bits are the same will suffer a hash collision. When you get to 12 members (if the load factor is 0.75), the capacity is doubled to 32, and the entries are redistributed using rehash & 0x001f instead. Now a hash collision occurs only if the 5 least significant bits are the same.
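As a small sketch of that mask step, here is what the index calculation looks like. The spread() method below only mirrors the idea of mixing the high bits of the hash code into the low bits; the exact rehash function differs between JDK versions.

public class BucketIndexDemo {

    // Mix the high bits of the hash code into the low bits, so the
    // "& (capacity - 1)" step doesn't ignore them entirely. (The real
    // supplemental hash differs between JDK versions; this is just the idea.)
    static int spread(int hashCode) {
        return hashCode ^ (hashCode >>> 16);
    }

    // For a power-of-two capacity this is equivalent to (hash mod capacity),
    // but done with a cheap bitwise AND against the mask capacity - 1.
    static int bucketIndex(int hashCode, int capacity) {
        return spread(hashCode) & (capacity - 1);
    }

    public static void main(String[] args) {
        // 17 (binary 10001) and 33 (binary 100001) share their 4 least
        // significant bits, so at capacity 16 they land in the same bucket...
        System.out.println("capacity 16: " + bucketIndex(17, 16) + " and " + bucketIndex(33, 16));
        // ...but at capacity 32 the fifth bit separates them.
        System.out.println("capacity 32: " + bucketIndex(17, 32) + " and " + bucketIndex(33, 32));
    }
}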
You mustn't worry about hash collisions. They are not some sort of collision on the roads; they don't do any harm.
Well, they can do "harm" in the sense that certain use cases can have very poor performance. (This was particularly true before they introduced the rehash.) Having a few hash collisions is no big deal, but having a lot of them can be bad.
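As one rough illustration (the timings are illustrative, not a proper benchmark, and the exact degradation depends on the JDK version): a hypothetical key class with a constant hashCode() forces every entry into a single bucket, and inserts slow down noticeably compared with well-spread keys.

import java.util.HashMap;
import java.util.Map;

public class WorstCaseHashDemo {

    // Hypothetical key whose hashCode() is constant, so every instance
    // collides and every entry ends up in the same bucket.
    static final class BadKey {
        private final int id;
        BadKey(int id) { this.id = id; }

        @Override public int hashCode() { return 1; }
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
    }

    public static void main(String[] args) {
        int n = 20_000;

        long start = System.nanoTime();
        Map<Integer, Integer> good = new HashMap<>();
        for (int i = 0; i < n; i++) {
            good.put(i, i);            // Integer has a well-spread hashCode()
        }
        System.out.printf("well-spread keys: %d ms%n", (System.nanoTime() - start) / 1_000_000);

        start = System.nanoTime();
        Map<BadKey, Integer> bad = new HashMap<>();
        for (int i = 0; i < n; i++) {
            bad.put(new BadKey(i), i); // every put has to search one overcrowded bucket
        }
        System.out.printf("colliding keys:   %d ms%n", (System.nanoTime() - start) / 1_000_000);
    }
}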