Duplicate elements in HashSet

Tom Griffith
Ranch Hand

Joined: Aug 06, 2004
Posts: 272
Hi. I am trying to populate a HashSet to be used as a keyword list. I chose HashSet because I thought it would weed out the dupes, but I am getting back a set with all the dupes included. Does anybody know why this would be happening? I really appreciate your consideration. Thank you.
Stefan Krompass
Ranch Hand

Joined: Apr 29, 2004
Posts: 75
Hi,

can you show us some code that lets us understand the problem? Which elements do you add to the Set? According to the API, elements are only added to HashSets when they are not already present.

Stefan
Dun Dagda
Ranch Hand

Joined: Oct 12, 2004
Posts: 54
Hi Tom,
HashSet creates a set of key/value pairs. So you will have a set of unique keys (no duplicates allowed), but any particular unique key can point to more than one value, so the values can be duplicates (the set part only works on the keys).
Second, what objects are you using as keys to populate your hashset? Only objects whose classes have properly overridden the equals and hashCode methods of class Object, such as class String and the Wrapper classes, make useful hashing keys. Others like StringBuffer that have not overridden the equals method do not make good keys. So if you are using your own class-defined objects as keys, they will not work in HashSet unless you also override equals and hashCode.
If you just want a list of unique keywords, maybe SortedSet would be a better collection to use?


SCJP 1.4 | SCWCD (in progress)
Tom Griffith
Ranch Hand

Joined: Aug 06, 2004
Posts: 272
Thank you guys. What I'm doing is querying an Oracle db to obtain the keyword values...and populating the hash set...

HashSet keywordList = new HashSet();

while (rs.next()) {
    Nrow currentresult = new Nrow();
    currentresult.setDbvalue(rs.getString("DBVALUE"));
    keywordList.add(currentresult);
}

So I guess that even though the values are equal, the objects have different hash codes? Thanks again...
Dun Dagda
Ranch Hand

Joined: Oct 12, 2004
Posts: 54
So if I understand your code correctly, you are adding an Nrow object to your HashSet as the element? I think your problem may stem from this: unless Nrow overrides equals and hashCode, it will probably not work as a hashing key. Can't you get your DBVALUE as a String and add that instead? Strings make good hashing keys.
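To illustrate the point about missing overrides, here is a minimal sketch. PlainRow is a hypothetical stand-in for a class like Nrow that inherits equals and hashCode from Object (the real Nrow source is not shown in this thread), and the code assumes a Java version with generics.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a row class that does NOT override equals/hashCode.
class PlainRow {
    private final String dbvalue;

    PlainRow(String dbvalue) {
        this.dbvalue = dbvalue;
    }

    String getDbvalue() {
        return dbvalue;
    }
}

class IdentityEqualityDemo {
    // Adds two distinct objects carrying the same value and reports the set size.
    static int sizeAfterAddingTwice(String value) {
        Set<PlainRow> rows = new HashSet<>();
        rows.add(new PlainRow(value));
        rows.add(new PlainRow(value)); // a different object, so Object.equals says "not equal"
        return rows.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterAddingTwice("java")); // prints 2
    }
}
```

Both objects are kept because the inherited equals only compares object identity, which matches the duplicates described in the original question.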
Dun Dagda
Ranch Hand

Joined: Oct 12, 2004
Posts: 54
Would this work for you?
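A minimal sketch of the String-based approach suggested above. This is an illustration under assumptions, not the snippet originally posted: the list of sample values is hypothetical and stands in for what rs.getString("DBVALUE") would return row by row.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class KeywordSetSketch {
    // Collects keyword strings into a set. String already defines equals and
    // hashCode by content, so repeated values are stored only once.
    static Set<String> uniqueKeywords(List<String> dbValues) {
        Set<String> keywordList = new HashSet<>();
        for (String value : dbValues) {
            keywordList.add(value);
        }
        return keywordList;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the DBVALUE column.
        List<String> rows = Arrays.asList("java", "oracle", "java", "hashset", "oracle");
        System.out.println(uniqueKeywords(rows).size()); // prints 3
    }
}
```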


Dirk Schreckmann
Sheriff

Joined: Dec 10, 2001
Posts: 7023
Moving this to the Intermediate forum...


[How To Ask Good Questions] [JavaRanch FAQ Wiki] [JavaRanch Radio]
Tom Griffith
Ranch Hand

Joined: Aug 06, 2004
Posts: 272
Hi Dun. I'll give it a try but does a String object override hashCode?
Pradeep bhatt
Ranch Hand

Joined: Feb 27, 2002
Posts: 8919

Originally posted by Tom Griffith:
Hi Dun. I'll give it a try but does a String object override hashCode?


Yes, it does.


Groovy
Pradeep bhatt
Ranch Hand

Joined: Feb 27, 2002
Posts: 8919

Originally posted by Tom Griffith:
Hi Dun. I'll give it a try but does a String object override hashCode?


Check this
http://java.sun.com/j2se/1.4.2/docs/api/java/lang/String.html
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Tom, your Nrow class needs to implement equals and hashCode correctly.
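One way this might look, as a sketch only. The real Nrow class is not shown in the thread, so the single dbvalue field and its getter are assumptions based on the setDbvalue call in the earlier code; java.util.Objects also requires a newer Java than the 1.4 API linked above.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Assumed shape of Nrow: one String field, compared by content.
class Nrow {
    private String dbvalue;

    public String getDbvalue() {
        return dbvalue;
    }

    public void setDbvalue(String dbvalue) {
        this.dbvalue = dbvalue;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Nrow)) {
            return false;
        }
        // Two rows are equal when their dbvalue strings are equal (null-safe).
        return Objects.equals(dbvalue, ((Nrow) other).dbvalue);
    }

    @Override
    public int hashCode() {
        // Must be derived from the same field that equals compares.
        return Objects.hashCode(dbvalue);
    }
}

class NrowDemo {
    // Builds a set of Nrow objects from the given values and returns its size.
    static int countDistinct(String... values) {
        Set<Nrow> keywordList = new HashSet<>();
        for (String value : values) {
            Nrow row = new Nrow();
            row.setDbvalue(value);
            keywordList.add(row);
        }
        return keywordList.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinct("java", "java", "oracle")); // prints 2
    }
}
```

One caveat with this design: because hashCode depends on a mutable field, calling setDbvalue on an object that is already in the set can leave it filed under the wrong bucket, so it is safer to set the value before adding.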


The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Dun Dagda:

HashSet creates a set of key/value pairs.


You're confusing HashSet with HashMap.
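A small sketch of the difference, using only standard java.util classes (the sample values are made up for illustration): a HashSet holds single elements and ignores a repeated add, while a HashMap holds key/value pairs and a repeated put on the same key replaces the value.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SetVersusMapSketch {
    // A set stores elements only; adding an equal element again changes nothing.
    static Set<String> buildSet() {
        Set<String> set = new HashSet<>();
        set.add("java");
        set.add("java"); // returns false, the element is already present
        return set;
    }

    // A map stores key/value pairs; putting an existing key replaces its value.
    static Map<String, Integer> buildMap() {
        Map<String, Integer> map = new HashMap<>();
        map.put("java", 1);
        map.put("java", 2); // same key, so the old value 1 is overwritten
        return map;
    }

    public static void main(String[] args) {
        System.out.println(buildSet().size());      // prints 1
        System.out.println(buildMap().get("java")); // prints 2
    }
}
```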
Tom Griffith
Ranch Hand

Joined: Aug 06, 2004
Posts: 272
Isn't a value/hash code pairing essentially the same as the pairings found in maps?
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Tom Griffith:
Isn't a value/hash code pairing essentially the same as the pairings found in maps?


In which way?
Tom Griffith
Ranch Hand

Joined: Aug 06, 2004
Posts: 272
I don't know, I guess a hashcode is an indexer...a unique key attached to each object in a collection, etc...which seems similar to the value/key coupling in maps...

I guess I have to mess more with maps. I use a lot of sets in beans, which I iterate and convert to lists for use in JSPs. I haven't tinkered with maps enough to really understand the difference.
[ October 14, 2004: Message edited by: Tom Griffith ]
Dun Dagda
Ranch Hand

Joined: Oct 12, 2004
Posts: 54

Originally posted by Ilja Preuss
You're confusing HashSet with HashMap.

I was just going by the API documentation, which says that HashSet is backed by a HashMap instance, but I take your point.
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Tom Griffith:
I don't know, I guess a hashcode is an indexer


Sort of, yes.

...a unique key attached to each object in a collection


No, not unique - that would hardly be possible for Strings, for example.
David Harkness
Ranch Hand

Joined: Aug 07, 2003
Posts: 1646
Originally posted by Ilja Preuss:
No, not unique - that would hardly be possible for Strings, for example.

Exactly. Hashing works by using the hash value as a first-pass efficient comparator of equality. If two objects have the same hash value, they might be equal, and equals() is used to make the determination. If the hash values are different, they cannot be equal, and thus a possibly-expensive call to equals() can be avoided.
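A short sketch of that two-step check. The strings "Aa" and "BB" are a well-known pair that produce the same String.hashCode() value without being equal, so a HashSet has to fall back on equals() to tell them apart.

```java
import java.util.HashSet;
import java.util.Set;

class HashCollisionSketch {
    // True when two strings share a hash code even though equals() says they differ.
    static boolean collideButDiffer(String first, String second) {
        return first.hashCode() == second.hashCode() && !first.equals(second);
    }

    // Adds both strings to a set; equals() decides whether they count as duplicates.
    static int sizeOfSetWith(String first, String second) {
        Set<String> set = new HashSet<>();
        set.add(first);
        set.add(second);
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(collideButDiffer("Aa", "BB")); // prints true
        System.out.println(sizeOfSetWith("Aa", "BB"));    // prints 2
        System.out.println(sizeOfSetWith("Aa", "Aa"));    // prints 1
    }
}
```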

That is why you must override both hashCode() and equals() for your own classes. String already does this, which makes Strings good hashing keys. The goal is a hashing function that distributes values uniformly across the full range of int values, so that even slightly different objects usually get different hash values.

On a side note, since you're selecting these from Oracle, why not select distinct values and let Oracle handle it? Are you worried about pounding the database? If you expect a lot of duplicates, then you need to weigh the cost of doing DISTINCT in Oracle versus bringing back more data over the wire than you need.
 