
Ignoring Unicode directional characters when comparing strings

 
Owais Zahid
Greenhorn
Posts: 21
I want to sort a list of strings. Some of the strings can have a Unicode directional character at the start and end (such as the right-to-left embedding character \u202B). I tried plain string comparison, but it doesn't give me the result I want. For example:

["A", "B", "C", "\u202BA"]

I want the sort method to sort like

A
\u202BA
B
C

Question: I've heard about RuleBasedCollator. What rule do I need to add to a RuleBasedCollator to get this ordering? Is there a rule that ignores a given list of characters altogether?
 
Paul Clapham
Sheriff
Posts: 21117
So basically you want to ignore the U+202B character entirely, because it's not really a "character", more like an instruction for how bidirectional text should be laid out? Well, I had a quick look at the API docs for RuleBasedCollator and I did see a section on "ignorable characters". So yes, RuleBasedCollator looks like it might work for your requirement.
 
Owais Zahid
Greenhorn
Posts: 21
Thanks for the reply. Can you tell me what rule I can add so that this character is ignored when sorting and comparing?
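For anyone landing on this thread: one way to get the ordering asked for above, without writing a RuleBasedCollator rule at all, is to strip the directional control characters before comparing. The sketch below is a different technique from the one being asked about, not a RuleBasedCollator rule. The class and method names (BidiInsensitiveSort, stripBidiControls, sortIgnoringBidiControls) are made up for illustration, and the set of characters removed (U+200E, U+200F, U+202A to U+202E, U+2066 to U+2069) is an assumption about what "directional" should cover; adjust it to your data.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

class BidiInsensitiveSort {
    // Assumed set of directional controls: LRM/RLM marks,
    // embeddings and overrides (U+202A..U+202E), isolates (U+2066..U+2069).
    private static final Pattern BIDI_CONTROLS =
            Pattern.compile("[\\u200E\\u200F\\u202A-\\u202E\\u2066-\\u2069]");

    // Returns the string with every directional control character removed.
    static String stripBidiControls(String s) {
        return BIDI_CONTROLS.matcher(s).replaceAll("");
    }

    // Wraps a Collator so that it only ever sees the stripped strings.
    static Comparator<String> bidiInsensitive(Collator collator) {
        return (a, b) -> collator.compare(stripBidiControls(a), stripBidiControls(b));
    }

    // Sorts a copy of the input; the original strings are kept unchanged.
    static List<String> sortIgnoringBidiControls(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        // List.sort is stable, so strings that compare equal keep their input order.
        copy.sort(bidiInsensitive(Collator.getInstance(Locale.ROOT)));
        return copy;
    }

    public static void main(String[] args) {
        List<String> sorted =
                sortIgnoringBidiControls(Arrays.asList("A", "B", "C", "\u202BA"));
        for (String s : sorted) {
            // Make the invisible control character visible when printing.
            System.out.println(s.replace("\u202B", "\\u202B"));
        }
    }
}
```

Because the sort is stable and "A" and "\u202BA" compare as equal once stripped, they stay in their input order, which matches the A, \u202BA, B, C ordering from the first post. If you need a deterministic order between such pairs regardless of input order, you could chain a tie-breaker such as thenComparing(Comparator.naturalOrder()) onto the comparator.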
 