Hi, My users are submitting text via a jsp web form, and I write the text to an sql server database, column type text. Sometimes certain characters, which look like they should be apostrophes, are written to the db as the character '?'. (At least if I use Query Analyzer on the database, an SQL query returns text containing '?'. Perhaps it is stored in the db as the original character?) Other apostrophes seem to be stored OK, and anyway I escape apostrophes before submitting the text. I think users are pasting text to the form from Microsoft Word, so perhaps some unusual characters are being submitted that way.
Question 1: For future submissions, will using a prepared statement help to solve this problem? If not, what other solutions are there?
Question 2: For existing submissions, can I retrieve the original unusual chars from the db, or have they been replaced in the db by '?'.
The problem is likely one of character encodings. What encoding are you using in the web app, and which one is used in the database? If it's not the same everywhere, you're bound to run into trouble, unless you convert appropriately.
#1) The kind of Statement you use makes no difference.
Hi, The meta tag in my web page is this: <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"> I have no control over this setting as my application is a plugin in another web application.
I don't know how to find out my SQL Server encoding settings.
I am using the Free JTDS driver, which looks as though it should be doing the character translating for me. Any ideas? Mike
Joined: Mar 22, 2005
You specify the encoding of your response, but the important thing is the encoding of the request; I'm not sure if there isn't a client setting to change that. You can test it using the HttpServletRequest.getCharacterEncoding method. I'm also not sure if the JDBC driver will automatically change encodings; how is it to know what encoding you use, even if it knows what the DB expects? Usually you need to specify the character encoding when you create a DB, and generally it can't be changed later. Try to find out which one yours uses.
Joined: Aug 09, 2005
The request is also ISO-8859-1.
Looking at my MSSQL database, the "Collation name" is SQL_Latin1_General_CP1_CI_AS. Would this be the encoding type?
Would it be easier to just URLEncode all the text in UTF-8 before sending it to the database, then decode it when pulling it out again?
Mike [ November 02, 2005: Message edited by: Mikey Kelly ]
Joined: Mar 22, 2005
Collation specifies the sorting order; it's only marginally related to the encoding.
URL-encoding might work, but seems like a bad hack, because then you don't have the actual data in the database, and can't apply operators like "like". Dig a little deeper into the documentation, and find out which encoding your DB actually uses, and then store it that way.