
Case-Sensitive sorting using Collator

 
Ravindranath Chowdary
Ranch Hand
Posts: 71
Hi Ranchers,
This is regarding a sorting issue I am facing with the java.text.Collator class.
I am using java.text.Collator to sort Strings that can contain English and non-English characters.
If I pass a list with the two values {'a', 'A'}, the output should be {'A', 'a'}.

I tried setting the strength to PRIMARY, SECONDARY, TERTIARY and IDENTICAL, but none of them worked for me.
I have also tried RuleBasedCollator, where I can define which character should come after another, like A < B < ... Z < a < b < c ... But this option forces me to provide the rules for all the languages that we support, so I cannot use it.

Sample code:
---------------
import java.util.*;
import java.text.Collator;

class Test {
    public static void main(String[] args) {
        ArrayList<String> list = new ArrayList<String>();
        list.add("a");
        list.add("A");

        // Sort with the collator for the default locale
        Collections.sort(list, Collator.getInstance());

        System.out.println(list);
    }
}

Output: [a, A]
Expected Output: [A, a]

Could someone suggest how to get case-sensitive sorting using Collator?

Thanks,
Ravindra
 
Christophe Verré
Sheriff
Posts: 14691
16
Eclipse IDE Ubuntu VI Editor
Please UseCodeTags the next time you post some code.
 
David Newton
Author
Rancher
Posts: 12617
IntelliJ IDE Ruby
I'm not sure there's an easy way to do this.
 
Mike Simmons
Ranch Hand
Posts: 3028
10
For the Collator I get from Collator.getInstance(), the order is case-sensitive. It just doesn't match the order you seem to want - it puts uppercase after lowercase, rather than before. However I'm not sure about that, since you don't really specify enough to tell us what you actually want. 'A' should be before 'a', OK. What about 'B' vs 'a'? Or 'Ä' vs. 'a'? I can't tell from your example what you think the order should be.
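A quick way to check what the default ordering does is to print the sign of a few comparisons. This is only a sketch: the class name CollatorOrderCheck and the helper sign are made up for illustration, and Locale.ENGLISH is an assumption (other locales may use different rules).

```java
import java.text.Collator;
import java.util.Locale;

class CollatorOrderCheck {
    // Hypothetical helper: sign of collator.compare(a, b) at the given strength.
    // Assumes the English locale; the result may differ for other locales.
    static int sign(String a, String b, int strength) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        collator.setStrength(strength);
        return Integer.signum(collator.compare(a, b));
    }

    public static void main(String[] args) {
        // TERTIARY takes case into account: "a" sorts before "A".
        System.out.println(sign("a", "A", Collator.TERTIARY));
        // PRIMARY ignores case: "a" and "A" compare as equal.
        System.out.println(sign("a", "A", Collator.PRIMARY));
        // A different base letter outweighs case: "A" sorts before "b".
        System.out.println(sign("A", "b", Collator.TERTIARY));
    }
}
```

So the default order interleaves the cases (a, A, b, B, ...) rather than putting all uppercase letters first, which is why the 'B' vs 'a' question matters.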

One possible solution is to simply reverse the case of all the characters before you sort the list, then reverse again after you sort. E.g.
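A minimal sketch of that idea follows. The names SwapCaseSort, swapCase and sortUpperFirst are hypothetical, and Locale.ENGLISH is only an assumption for the demo; pass whatever Collator you actually use.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class SwapCaseSort {
    // Flips the case of every cased char; all other chars are copied unchanged.
    static String swapCase(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isUpperCase(c)) {
                sb.append(Character.toLowerCase(c));
            } else if (Character.isLowerCase(c)) {
                sb.append(Character.toUpperCase(c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Swaps case, sorts with the given collator, then swaps case back.
    static List<String> sortUpperFirst(List<String> input, Collator collator) {
        List<String> swapped = new ArrayList<>();
        for (String s : input) {
            swapped.add(swapCase(s));
        }
        Collections.sort(swapped, collator);
        List<String> result = new ArrayList<>();
        for (String s : swapped) {
            result.add(swapCase(s));
        }
        return result;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ENGLISH); // assumed locale
        System.out.println(sortUpperFirst(Arrays.asList("a", "A"), collator)); // prints [A, a]
    }
}
```

One caveat: swapping case twice does not restore every character (the Turkish dotless ı, for example, comes back as a plain i), so the round trip is only safe for text where the case mapping is reversible.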

If that doesn't work for you, you probably need to give us more info on how the results are different from what you expect.
 
Mike Simmons
Ranch Hand
Posts: 3028
10
It would probably be more correct to use code points rather than chars, given the limitations of 16-bit chars for Unicode. But this should be close enough to give the idea.
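A code-point-based take on the case-swapping idea might look like the sketch below. It assumes Java 8 or later (for String.codePoints() and lambdas). The names CodePointSwapCase, swapCase and upperFirst are hypothetical. As a variation, it compares case-swapped keys inside a Comparator instead of modifying the strings, so the original values are never altered.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

class CodePointSwapCase {
    // Flips the case of every cased code point, including supplementary
    // characters that need two chars (a surrogate pair) in a String.
    static String swapCase(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        s.codePoints().forEach(cp -> {
            if (Character.isUpperCase(cp)) {
                sb.appendCodePoint(Character.toLowerCase(cp));
            } else if (Character.isLowerCase(cp)) {
                sb.appendCodePoint(Character.toUpperCase(cp));
            } else {
                sb.appendCodePoint(cp);
            }
        });
        return sb.toString();
    }

    // Orders strings by collating their case-swapped forms.
    static Comparator<String> upperFirst(Collator collator) {
        return (x, y) -> collator.compare(swapCase(x), swapCase(y));
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ENGLISH); // assumed locale
        List<String> list = new ArrayList<>(Arrays.asList("a", "A", "b", "B"));
        list.sort(upperFirst(collator));
        System.out.println(list); // prints [A, a, B, b]
    }
}
```

Swapping case on every comparison is slower than swapping once up front, so for large lists it may be worth caching the swapped keys.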
 