| Author |
Return same hashCode value for CaseInsensitive strings
|
Vijay Kumar koganti
Ranch Hand
Joined: Jan 23, 2006
Posts: 53
|
|
Hi,
I have two strings, for example vijay and VIjay, and their hash codes are different. I would like to know how to make their hashCode values equal, i.e. case should not have any effect on the hash code.
The requirement arises because a HashSet accepts what I consider duplicate values: it treats the two strings (for example vijay and VIjay) as different, and I need to override this behaviour.
Any idea how I can achieve this?
regards,
vijay
|
 |
Jesper de Jong
Java Cowboy
Bartender
Joined: Aug 16, 2005
Posts: 12953
|
|
Note that only changing the hashCode() method would not be sufficient; you would also need to change the equals() method. HashSet uses both hashCode() and equals() to check if two objects contain the same value.
You can't change the hashCode() and equals() methods of class String. Instead of putting strings in your HashSet, create your own class that contains the names, and put that in your HashSet:
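A minimal sketch of what such a wrapper class might look like. It is not the poster's original code. The class name Name and its name field are taken from the expression quoted later in the thread; the use of Locale.ROOT, the getName() accessor and the NameDemo class are assumptions added for illustration.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Objects;
import java.util.Set;

// Wraps a String so that equality and hashing ignore case.
final class Name {
    private final String name;

    Name(String name) {
        this.name = Objects.requireNonNull(name);
    }

    String getName() {
        return name;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Name)) return false;
        // Locale.ROOT is an assumption: it keeps the folding independent of the default locale.
        return name.toLowerCase(Locale.ROOT).equals(((Name) obj).name.toLowerCase(Locale.ROOT));
    }

    @Override
    public int hashCode() {
        // Must fold case exactly the way equals() does, so equal objects share a hash code.
        return name.toLowerCase(Locale.ROOT).hashCode();
    }

    @Override
    public String toString() {
        return name;
    }
}

// Hypothetical demo class, for illustration only.
class NameDemo {
    public static void main(String[] args) {
        Set<Name> names = new HashSet<>();
        names.add(new Name("vijay"));
        names.add(new Name("VIjay")); // rejected: equal to the first entry when case is ignored
        System.out.println(names.size()); // prints 1
    }
}
```

The important point is that equals() and hashCode() fold case in the same way; if they disagreed, the HashSet could still store both values.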
|
 |
Vijay Kumar koganti
Ranch Hand
Joined: Jan 23, 2006
Posts: 53
|
|
Hi Jesper,
Thanks for your response. I am aware that equals must be overridden together with hashCode, and I was using equalsIgnoreCase(obj.getEmpname()) for that purpose instead of the one you mentioned:
return name.toLowerCase().equals(((Name) obj).name.toLowerCase()).
But now I was surprised to see that with a TreeSet it works just fine if you override compareTo and simply compare the two strings with name.compareToIgnoreCase(obj.getName()). My understanding was that compareTo is used only for sorting the set, and since hashCode is not overridden, both strings should have been added to the set. Please find the code below.
Expected result: TreeSet Collection : [Geeta, geeta, sita]
Actual result: TreeSet Collection : [geeta, sita]
Can someone explain?
vijay
|
 |
Henry Wong
author
Sheriff
Joined: Sep 28, 2004
Posts: 16815
|
|
Vijay Kumar koganti wrote: Actual Result is : TreeSet Collection : [geeta, sita]
can some one explain ?
What is there to explain? TreeSet uses the Comparable (or Comparator) to sort, and also to determine equality. You implemented compareTo to ignore case, hence it works that way. Did you expect it not to behave as overridden?
Henry
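The behaviour described above can be reproduced with a small sketch. The original code from the question did not survive, so this is a reconstruction under assumptions: the Book class, its name field and the insertion order are hypothetical, chosen only to match the names used in the thread.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical Book class whose natural ordering ignores case.
class Book implements Comparable<Book> {
    private final String name;

    Book(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    @Override
    public int compareTo(Book other) {
        // A result of 0 tells TreeSet the two books are the same element.
        return name.compareToIgnoreCase(other.getName());
    }

    @Override
    public String toString() {
        return name;
    }
}

class TreeSetEqualityDemo {
    static Set<Book> build() {
        Set<Book> booktree = new TreeSet<>();
        booktree.add(new Book("geeta"));
        booktree.add(new Book("Geeta")); // compareTo returns 0 against "geeta", so it is not added
        booktree.add(new Book("sita"));
        return booktree;
    }

    public static void main(String[] args) {
        System.out.println("TreeSet Collection : " + build()); // TreeSet Collection : [geeta, sita]
    }
}
```

Note that Book overrides neither equals() nor hashCode(); TreeSet never consults them, which is why the duplicate is rejected anyway.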
|
 |
Campbell Ritchie
Sheriff
Joined: Oct 13, 2005
Posts: 32833
|
|
Have you read the HashSet and TreeSet documentation? It gives a brief hint how the two Set implementations work, and, once you understand that, it should explain about compareTo and equals/hashCode.
By the way: declare your TreeSet as Set<Book> booktree = new TreeSet<Book>();
|
 |
Vijay Kumar koganti
Ranch Hand
Joined: Jan 23, 2006
Posts: 53
|
|
Hi Campbell/Henry,
Thanks for your reply. I was under the wrong impression that compareTo/compare are only for sorting, and that hashCode and equals determine object equality regardless of the type of collection. After reading the TreeSet documentation I realized that in a TreeSet, compareTo/compare decides both the sorting and the object equality.
regards,
vijay
|
 |
Vijay Kumar koganti
Ranch Hand
Joined: Jan 23, 2006
Posts: 53
|
|
Hi Campbell/Henry,
One last question. As discussed in the posts above, TreeSet decides both equality and sorting using the Comparable/Comparator. In that case, if I have books named geeta, sita and Sita and I want them ordered as [geeta, sita, Sita], how would I achieve this with a TreeSet? Making compareTo ignore case would leave my set with only two elements.
|
 |
Campbell Ritchie
Sheriff
Joined: Oct 13, 2005
Posts: 32833
|
|
You would have to design a Comparator which gives true alphabetical ordering. Do a search for "asciibetical" and you will find some useful information. Two threads which I just found doing such a search: No 1 No 2.
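One possible way to get the ordering asked about, sketched under assumptions: compare case-insensitively first, and only when that yields a tie fall back to a case-sensitive comparison, so the Comparator never returns 0 for strings that differ only in case. The class name CaseAwareOrder and the choice to put lower case before upper case are illustrative, not something taken from the linked threads.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

final class CaseAwareOrder {
    private CaseAwareOrder() {}

    // Primary key: alphabetical order ignoring case.
    // Tie-break: reversed natural order, which places "sita" before "Sita".
    static final Comparator<String> ALPHABETICAL =
            String.CASE_INSENSITIVE_ORDER.thenComparing(Comparator.<String>reverseOrder());

    static Set<String> sorted(String... values) {
        Set<String> set = new TreeSet<>(ALPHABETICAL);
        for (String v : values) {
            set.add(v);
        }
        return set;
    }

    public static void main(String[] args) {
        System.out.println(sorted("Sita", "geeta", "sita")); // [geeta, sita, Sita]
    }
}
```

Because the tie-break distinguishes "sita" from "Sita", the TreeSet keeps all three elements while still sorting them alphabetically rather than "asciibetically".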
|
 |
Rob Spoor
Sheriff
Joined: Oct 27, 2005
Posts: 19232
|
|
Jesper Young wrote:Note that only changing the hashCode() method would not be sufficient; you would also need to change the equals() method. HashSet uses both hashCode() and equals() to check if two objects contain the same value.
You can't change the hashCode() and equals() methods of class String. Instead of putting strings in your HashSet, create your own class that contains the names, and put that in your HashSet:
I don't really like toLowerCase() or toUpperCase() as it potentially creates a new String. Instead, I use String.equalsIgnoreCase and my custom hashCodeIgnoreCase:
This is exactly how String.hashCode works, with the exception of the Character.toUpperCase (which does not create a new object like s.toUpperCase).
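A sketch of what a hashCodeIgnoreCase method along these lines might look like; it is a reconstruction from the description above, not the poster's code. It uses the same 31-based accumulation as String.hashCode and folds each character individually. One deliberate deviation, labeled here as an assumption: each char is folded with Character.toLowerCase(Character.toUpperCase(c)) rather than toUpperCase alone, to mirror the char-by-char comparison that String.equalsIgnoreCase performs. The holder class name CaseInsensitiveHash is hypothetical.

```java
final class CaseInsensitiveHash {
    private CaseInsensitiveHash() {}

    // Same recurrence as String.hashCode (h = 31 * h + c), applied to case-folded chars.
    // No intermediate String is created.
    static int hashCodeIgnoreCase(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            // Assumption: fold upper then lower, matching how equalsIgnoreCase compares chars.
            char c = Character.toLowerCase(Character.toUpperCase(s.charAt(i)));
            h = 31 * h + c;
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(hashCodeIgnoreCase("vijay") == hashCodeIgnoreCase("VIjay")); // true
    }
}
```

Paired with equals() implemented via String.equalsIgnoreCase, this keeps the equals/hashCode contract without allocating a folded copy of the string.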
|
 |
Vijay Kumar koganti
Ranch Hand
Joined: Jan 23, 2006
Posts: 53
|
|
Hi Campbell,
That's a really knowledgeable post, and thanks for the links. Good to learn something new before I go to sleep.
Good night guys.
vijay
|
 |
Campbell Ritchie
Sheriff
Joined: Oct 13, 2005
Posts: 32833
|
|
Thank you, and you're welcome. But I learned lots from Rob's last post.
|
 |
Jesper de Jong
Java Cowboy
Bartender
Joined: Aug 16, 2005
Posts: 12953
|
|
Rob Prime wrote:I don't really like toLowerCase() or toUpperCase() as it potentially creates a new String.
OK, but Java's garbage collector is very efficient at cleaning up short-lived objects, such as String objects that are only used to compare strings or to calculate a hash code. I've been to a presentation where the presenter explained that people sometimes make their programs overcomplicated because they are afraid that creating a new object costs too much, while it really isn't a problem. Java's garbage collector deals with short-lived and long-lived objects differently, and short-lived objects are cleaned up very quickly.
If your container object is immutable (as my class Name above is), then you could cache the hash code, so that you'd only have to calculate it once. Here's another version of class Name:
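A minimal sketch of the caching idea, assuming an immutable wrapper; it is not the poster's original class. To avoid clashing with the earlier Name, the hypothetical class here is called CachedName. Caching the folded key alongside the hash code, and using Locale.ROOT, are additional assumptions made for illustration.

```java
import java.util.Locale;
import java.util.Objects;

// Immutable wrapper: the case-folded key and its hash code are computed once.
final class CachedName {
    private final String name;
    private final String key;  // lower-cased form, used for equality
    private final int hash;    // cached hash code of the key

    CachedName(String name) {
        this.name = Objects.requireNonNull(name);
        this.key = name.toLowerCase(Locale.ROOT);
        this.hash = key.hashCode();
    }

    String getName() {
        return name;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof CachedName)) return false;
        return key.equals(((CachedName) obj).key);
    }

    @Override
    public int hashCode() {
        return hash; // no recomputation on repeated lookups
    }

    @Override
    public String toString() {
        return name;
    }
}
```

Caching is only safe because the fields are final; if the name could change after construction, the stored hash code would go stale.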
|
 |