
How do I handle string matching with these Puerto Rico cities (Spanish)?

 
Steve Mutanson
Ranch Hand
Posts: 67
There are a couple of areas in Puerto Rico: Mayagüez (where the 'u' is Spanish, so it has two dots on top) and San Juan-Bayamón (where the 'o' is Spanish, so it has an accent mark on top). The database is the English version, so these were stored as plain 'u' and 'o' respectively.
Now, after retrieving the data from the database, I need to match it against a file in which the characters are stored in Spanish (they keep the Spanish spelling just for these TWO names, actually just for these TWO characters). So the comparison fails. How do I make it work?
Thanks,
Steve
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Well, I haven't done this myself, nor even read all of the tutorial on this, but I believe you want the Collator class for a flexible approach to lexical comparisons.
Alternatively, you might write some sort of custom converter that replaces any ü with u, etc., just before performing other comparisons. But I tend to think the Collator is the "right" way to approach this.
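
A minimal sketch of both suggestions, assuming the database holds the unaccented spelling and the file holds the accented one. The class and method names are made up for illustration. At PRIMARY strength a Collator treats the accent (and case) differences on these two characters as insignificant; the second method uses java.text.Normalizer as one possible stand-in for the custom converter.

```java
import java.text.Collator;
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper; the names are illustrative, not from the thread.
class AccentInsensitiveMatch {

    // Collator approach: at PRIMARY strength only base letters are compared,
    // so "u" and "u with diaeresis" count as the same letter.
    static boolean sameName(String a, String b) {
        Collator collator = Collator.getInstance(Locale.forLanguageTag("es"));
        collator.setStrength(Collator.PRIMARY);
        return collator.compare(a, b) == 0;
    }

    // Converter approach: decompose each accented letter into base letter
    // plus combining mark, then delete the combining marks.
    static String stripAccents(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    public static void main(String[] args) {
        String fromDb = "Mayaguez";
        String fromFile = "Mayag\u00fcez";
        System.out.println(sameName(fromDb, fromFile));
        System.out.println(stripAccents(fromFile).equals(fromDb));
    }
}
```

Note that PRIMARY strength also ignores case, which may or may not be what you want for city names.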
 
Steve Mutanson
Ranch Hand
Posts: 67
Actually, let's discuss a much, much simpler question. Suppose I want to create a HashMap that uses the English word as the key and the Spanish/French word as the value, so I can easily look them up. The confusing thing is: how do I store that Spanish or French value in the hashtable, since I can't type them in?
 
Richard Jensen
Ranch Hand
Posts: 67
Originally posted by Steve Mutanson:
How do I store that Spanish or French value in the hashtable, since I can't type them in?

Use the appropriate Unicode escapes (\uXXXX) in your string literals. The compiler converts them to the actual characters for you, since Java chars are 16-bit Unicode values.
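
The example originally posted here did not survive, so what follows is only a sketch of the idea, assuming the two city names from the first post. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup table: plain ASCII spelling -> accented spelling.
class CityNames {

    static Map<String, String> build() {
        Map<String, String> names = new HashMap<>();
        // 00fc is u with diaeresis, 00f3 is o with acute accent.
        names.put("Mayaguez", "Mayag\u00fcez");
        names.put("San Juan-Bayamon", "San Juan-Bayam\u00f3n");
        return names;
    }

    public static void main(String[] args) {
        Map<String, String> names = build();
        // Each escape becomes a single char, so the length stays the same.
        System.out.println(names.get("Mayaguez").length());
    }
}
```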

(I'm not sure if I got the exact characters you mentioned in your first post, but you get the idea).
 
Steve Mutanson
Ranch Hand
Posts: 67
Originally posted by Richard Jensen:

(I'm not sure if I got the exact characters you mentioned in your first post, but you get the idea).

From
http://gsu.linux.org.tr/oreilly/Java%20Enterprise/servlet/appd_01.htm
I found that what I want is "\u00fc" and "\u00f3". However, when I simply do System.out.println("\u00f3" + ", \u00fc"); the output does not look like what I want. For example, \u00fc represents the "u" with two dots on top, but it shows up with a little "n" on top and nothing on the bottom. Have you tried any example yourself?
 
Thomas Paul
mister krabs
Ranch Hand
Posts: 13974
You have to realize that odd characters aren't going to print correctly when you send them to the console. Write them in a JOptionPane and make sure you have the right fonts installed to see what they look like.
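
One possible reason for the wrong glyph is a mismatch between the encoding Java writes and the encoding the console expects. A minimal sketch, assuming the terminal is set to UTF-8 (an assumption; the thread does not say which encoding the console uses). The class name is hypothetical.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper that wraps a stream so text is written as UTF-8.
class Utf8Console {

    static PrintStream utf8(OutputStream out) {
        try {
            // Second argument turns on autoflush; third names the charset.
            return new PrintStream(out, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8(System.out);
        out.println("\u00f3, \u00fc");
    }
}
```

If the terminal uses a different encoding, such as ISO-8859-1, pass that charset name instead.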
 
Steve Mutanson
Ranch Hand
Posts: 67
Folks,
thanks for the input. Now, when reading and outputting such strings, my xterm window works as follows: known characters are output as they are, and each unknown character is output as "?". This prevents me from knowing what the character is and which Unicode value I should use to replace it. So I want to know: instead of outputting "?", how can I make the xterm window output the Unicode value for that character? If I can do that, I will be able to tell which special character it is.
Thanks,
steve
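
One way to see the code points regardless of what the terminal can display is to escape every non-ASCII character before printing. A minimal sketch; the class and method names are hypothetical.

```java
// Hypothetical helper that makes non-ASCII characters visible as escapes.
class UnicodeEscaper {

    // Leaves ASCII untouched and rewrites every other char as a backslash,
    // the letter u, and four lowercase hex digits.
    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The output contains only ASCII, so any terminal can show it.
        System.out.println(escapeNonAscii("Mayag\u00fcez"));
    }
}
```

The escaped form can be pasted straight back into a Java string literal.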
 
Siddharth Mehrotra
Ranch Hand
Posts: 185
Hi, I had the same problem on AIX. It displayed all Chinese characters as ??. All I did was set the locale of the session before running the program so that it understood the language; for example, I used to set LANG to zh_tw.Big5.
There must be something similar for your choice of language.
 