force users to use english character set

Alex Hank
Greenhorn

Joined: Apr 27, 2005
Posts: 16
The site that I work on uses several forms to collect various information. When users enter text in non-English character sets, it sometimes causes trouble with our systems.

How can I force users to use the English character set, or how do I convert their input to the English character set before I put it in the database?

I am not sure if this is done on the client side or the server side.

Can someone please shed some light on this?


thanks

Alex
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42286
    
You can't stop users from submitting in whatever character set they please, unless you filter all data in JavaScript before it is submitted and remove every non-US-ASCII value. I would seriously question what kind of system gets confused by non-US-ASCII characters in this day and age. Does it not allow users to have accented names, for instance? If it must be done, you should filter the data on the server.
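
A minimal sketch of a server-side check along these lines, in case it helps. The class and method names (AsciiCheck, isUsAscii) are made up for illustration and are not from this thread; it only detects non-US-ASCII input and leaves the decision of what to do with it to the caller.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original thread.
final class AsciiCheck {

    // Returns true when every character of the input can be encoded as US-ASCII.
    static boolean isUsAscii(String input) {
        // Encoders are stateful and not thread-safe, so create one per call.
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder();
        return encoder.canEncode(input);
    }

    public static void main(String[] args) {
        System.out.println(isUsAscii("Alex Hank"));  // true
        System.out.println(isUsAscii("Ren\u00e9e")); // false (contains e-acute)
    }
}
```

A form handler could call isUsAscii on each submitted field and either reject the form with a message or pass the value on to a clean-up step.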


Alex Hank
Greenhorn

Joined: Apr 27, 2005
Posts: 16
Our database guy (my boss) says that when he moves the data between a Linux and a Windows machine, the records with non-English characters can cause trouble. The transfer stops at the first record that contains non-English characters.

Our database guy has a method to do the conversion in the database; however, I guess he would rather not have to do it. We use a RedBack database (I wish we used MySQL).

Our database guy also says that in the past we have had problems printing badges that have non-English characters.

I am just following my boss's orders.

So this should be done on the client side with JavaScript before submit?

Is there a method to do it with JSP?

thanks

alex
Paul Bourdeaux
Ranch Hand

Joined: May 24, 2004
Posts: 783
You could filter it on the server in whatever action you are submitting the form to. If you are submitting to a servlet, just filter the parameters in the doPost method and remove or replace the non-English characters.
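
Something like the following could do the filtering. This is only a sketch: FormFieldFilter and toPrintableAscii are hypothetical names, and the choice of "printable US-ASCII" (0x20 to 0x7E) as the allowed range is an assumption about what the downstream systems accept. In a servlet it would be applied to each request.getParameter(...) value inside doPost before the data is stored.

```java
// Hypothetical helper, not part of the original thread.
final class FormFieldFilter {

    // Replaces every character outside printable US-ASCII (0x20..0x7E)
    // with the given replacement; pass "" to simply drop such characters.
    static String toPrintableAscii(String value, String replacement) {
        if (value == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c >= 0x20 && c <= 0x7E) {
                out.append(c);
            } else {
                out.append(replacement);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toPrintableAscii("Jos\u00e9 Mu\u00f1oz", "?")); // Jos? Mu?oz
        System.out.println(toPrintableAscii("Jos\u00e9 Mu\u00f1oz", ""));  // Jos Muoz
    }
}
```

Note that this also removes control characters such as tabs and line breaks, which may or may not be what you want for multi-line fields.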


Ben Souther
Sheriff

Joined: Dec 11, 2004
Posts: 13410

RedBack is middleware for U2 databases (Universe and Unidata).
Both databases treat certain characters as control characters (mostly those with character codes in the 250 to 255 range). Somewhere in your code, you will need to check for those characters and either escape them with your own sequence, change them to another character, or blow up when a user tries to enter them.
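
As a rough illustration of the "escape them with your own sequence" option: the sketch below turns each character in the 250 to 255 range into a tilde followed by two hex digits, and can reverse the change later. The class name, the choice of '~' as the escape character, and the exact reserved range are all assumptions made for the example, not anything RedBack or U2 prescribes.

```java
// Hypothetical escaper, not part of the original thread.
final class MarkCharEscaper {

    private static final char ESCAPE = '~';

    // Assumed reserved range, taken from the 250..255 figure mentioned above.
    static boolean isReserved(char c) {
        return c >= 250 && c <= 255;
    }

    // Replaces reserved characters (and the escape character itself) with ~XX.
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == ESCAPE || isReserved(c)) {
                out.append(ESCAPE).append(String.format("%02X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Reverses escape(); rejects input with an incomplete escape sequence.
    static String unescape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == ESCAPE) {
                if (i + 3 > s.length()) {
                    throw new IllegalArgumentException("Truncated escape at index " + i);
                }
                out.append((char) Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2; // skip the two hex digits just consumed
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String stored = escape("a\u00feb");   // character 254 between 'a' and 'b'
        System.out.println(stored);           // a~FEb
        System.out.println(unescape(stored).equals("a\u00feb")); // true
    }
}
```

Escaping the tilde itself is what keeps the scheme reversible; without it, a user who typed "~FE" literally would get a different character back.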


Alex Hank
Greenhorn

Joined: Apr 27, 2005
Posts: 16
Thanks for the advice.

I figured it out by creating the method charFix.

<%!
String replace(String s, String one, String another) {
// In a string replace one substring with another
if (s.equals("")) return "";
String res = "";
int i = s.indexOf(one,0);
int lastpos = 0;
while (i != -1) {
res += s.substring(lastpos,i) + another;
lastpos = i + one.length();
i = s.indexOf(one,lastpos);
}
res += s.substring(lastpos); // the rest
return res;
}

String charFix(String s){
// Replace accented (non-English) characters with their closest English letters
// and strip a few characters that must not reach the database.
if (s.equals("")) return "";
String res = s;
String badChar = "";

badChar = new Character((char)13).toString();res = replace(res,badChar, ""); // carriage return (decimal; a leading zero would make the literal octal)
badChar = new Character((char)34).toString();res = replace(res,badChar, ""); // double quote "
badChar = new Character((char)0).toString();res = replace(res,badChar, ""); // NUL (blank)

badChar = new Character((char)192).toString();res = replace(res,badChar, "A");
badChar = new Character((char)193).toString();res = replace(res,badChar, "A");
badChar = new Character((char)194).toString();res = replace(res,badChar, "A");
badChar = new Character((char)195).toString();res = replace(res,badChar, "A");
badChar = new Character((char)196).toString();res = replace(res,badChar, "A");
badChar = new Character((char)197).toString();res = replace(res,badChar, "A");
badChar = new Character((char)198).toString();res = replace(res,badChar, "A");

badChar = new Character((char)200).toString();res = replace(res,badChar, "E");
badChar = new Character((char)201).toString();res = replace(res,badChar, "E");
badChar = new Character((char)202).toString();res = replace(res,badChar, "E");
badChar = new Character((char)203).toString();res = replace(res,badChar, "E");

badChar = new Character((char)204).toString();res = replace(res,badChar, "I");
badChar = new Character((char)205).toString();res = replace(res,badChar, "I");
badChar = new Character((char)206).toString();res = replace(res,badChar, "I");
badChar = new Character((char)207).toString();res = replace(res,badChar, "I");

badChar = new Character((char)210).toString();res = replace(res,badChar, "O");
badChar = new Character((char)211).toString();res = replace(res,badChar, "O");
badChar = new Character((char)212).toString();res = replace(res,badChar, "O");
badChar = new Character((char)213).toString();res = replace(res,badChar, "O");
badChar = new Character((char)214).toString();res = replace(res,badChar, "O");

badChar = new Character((char)217).toString();res = replace(res,badChar, "O");
badChar = new Character((char)218).toString();res = replace(res,badChar, "O");
badChar = new Character((char)219).toString();res = replace(res,badChar, "O");
badChar = new Character((char)220).toString();res = replace(res,badChar, "O");

badChar = new Character((char)224).toString();res = replace(res,badChar, "a");
badChar = new Character((char)225).toString();res = replace(res,badChar, "a");
badChar = new Character((char)226).toString();res = replace(res,badChar, "a");
badChar = new Character((char)227).toString();res = replace(res,badChar, "a");
badChar = new Character((char)228).toString();res = replace(res,badChar, "a");
badChar = new Character((char)229).toString();res = replace(res,badChar, "a");
badChar = new Character((char)230).toString();res = replace(res,badChar, "a");

badChar = new Character((char)232).toString();res = replace(res,badChar, "e");
badChar = new Character((char)233).toString();res = replace(res,badChar, "e");
badChar = new Character((char)234).toString();res = replace(res,badChar, "e");
badChar = new Character((char)235).toString();res = replace(res,badChar, "e");

badChar = new Character((char)236).toString();res = replace(res,badChar, "i");
badChar = new Character((char)237).toString();res = replace(res,badChar, "i");
badChar = new Character((char)238).toString();res = replace(res,badChar, "i");
badChar = new Character((char)239).toString();res = replace(res,badChar, "i");

badChar = new Character((char)241).toString();res = replace(res,badChar, "n");

badChar = new Character((char)242).toString();res = replace(res,badChar, "o");
badChar = new Character((char)243).toString();res = replace(res,badChar, "o");
badChar = new Character((char)244).toString();res = replace(res,badChar, "o");
badChar = new Character((char)245).toString();res = replace(res,badChar, "o");
badChar = new Character((char)246).toString();res = replace(res,badChar, "o");

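badChar = new Character((char)248).toString();res = replace(res,badChar, "o"); // o with stroke

badChar = new Character((char)249).toString();res = replace(res,badChar, "u");
badChar = new Character((char)250).toString();res = replace(res,badChar, "u");
badChar = new Character((char)251).toString();res = replace(res,badChar, "u");
badChar = new Character((char)252).toString();res = replace(res,badChar, "u");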

return res;
}
%>
Ben Souther
Sheriff

Joined: Dec 11, 2004
Posts: 13410

If your business model allows you to change characters that way, this is an acceptable solution. If your customers expect to see their legal names rendered properly, you might consider creating some escape sequences for the non-English characters.

If you do go with this approach, you might want to stop by the Performance forum for some tips on doing this more efficiently.

I'm guessing that looping through the string once and comparing each character in a switch statement will be better than re-reading the entire string for every possible character you could encounter.
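
A sketch of what that single pass might look like. The class name CharFixSwitch is hypothetical, and the mapping table is an assumption based on the Latin-1 code points (for example, 217 to 220 go to 'U' and 249 to 252 to 'u' here), so adjust it to whatever your business rules require.

```java
// Hypothetical single-pass variant of the charFix idea; one loop, one switch.
final class CharFixSwitch {

    static String charFix(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case 0: case 13: case 34:
                    break; // dropped: NUL, carriage return, double quote
                case 192: case 193: case 194: case 195: case 196: case 197: case 198:
                    out.append('A'); break;
                case 200: case 201: case 202: case 203:
                    out.append('E'); break;
                case 204: case 205: case 206: case 207:
                    out.append('I'); break;
                case 210: case 211: case 212: case 213: case 214:
                    out.append('O'); break;
                case 217: case 218: case 219: case 220:
                    out.append('U'); break;
                case 224: case 225: case 226: case 227: case 228: case 229: case 230:
                    out.append('a'); break;
                case 232: case 233: case 234: case 235:
                    out.append('e'); break;
                case 236: case 237: case 238: case 239:
                    out.append('i'); break;
                case 241:
                    out.append('n'); break;
                case 242: case 243: case 244: case 245: case 246:
                    out.append('o'); break;
                case 249: case 250: case 251: case 252:
                    out.append('u'); break;
                default:
                    out.append(c); // everything else passes through unchanged
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(charFix("\u00c0\u00e9\u00fc\u00f1x")); // Aeunx
    }
}
```

Each input character is examined exactly once, and the StringBuilder avoids the repeated string concatenation that a chain of replace calls performs.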
Yuriy Zilbergleyt
Ranch Hand

Joined: Dec 13, 2004
Posts: 429


Should the replacements for characters 217 through 220 be 'U' instead of 'O'? I didn't check the codes; I'm just going by the pattern.

-Yuriy
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42286
    
Of course, the ones you're replacing are just a select few. These days, people use Unicode fonts that may have thousands of characters. In particular, you might want to handle characters that need two bytes.
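
One way to cover far more accented letters than a hand-written table, sketched under the assumption that Java 6 or later is available: decompose the text with java.text.Normalizer and drop the combining marks. AccentStripper and stripAccents are names invented for this example.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of the original thread.
final class AccentStripper {

    // Decomposes accented letters into base letter + combining mark (NFD),
    // then removes the combining marks (Unicode category M).
    static String stripAccents(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripAccents("Cr\u00e8me Br\u00fbl\u00e9e")); // Creme Brulee
    }
}
```

Letters that have no decomposition (for example the o with stroke, character 248) come through unchanged, and text in non-Latin scripts is not transliterated at all, so a final check or replacement step for anything still outside US-ASCII would still be needed.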
 