Help: Inserting non-english(Hebrew) characters in MS SQL SERVER 2005

Rajendra Murthy
Greenhorn

Joined: Mar 30, 2008
Posts: 8
Hi Friends,
I am trying to insert a Hebrew string into my database from my Java-based tool. I am using SQL Server 2005 and the latest MS SQL JDBC driver. After I insert the string, all the Hebrew characters are unreadable (basically junk).
My requirement is to download the rows of a particular table (containing Hebrew) into an Excel sheet, add the corresponding English translation, and upload it back. But since I am getting junk characters in the Excel sheet, I am unable to translate.
I have tried changing the COLLATE parameter of the database and tables while creating the database, but the issue persists.
Please help me.
Thanks in advance!
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41815
Welcome to JavaRanch.

Is the database encoding set up to use Unicode? If it uses US-ASCII, you won't be able to store Unicode characters.
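
To illustrate the point about US-ASCII, here is a minimal Java sketch (not taken from the original poster's tool; the class and method names are made up for illustration). It encodes a Hebrew word into US-ASCII and decodes it back. Because US-ASCII has no Hebrew letters, every letter is replaced by a question mark, which is the same kind of loss a non-Unicode storage encoding would cause.

```java
import java.nio.charset.StandardCharsets;

class AsciiLossDemo {
    /** Encodes the text as US-ASCII bytes and decodes those bytes again. */
    static String roundTripAscii(String text) {
        // String.getBytes(Charset) substitutes the charset's replacement
        // byte ('?' for US-ASCII) for every character it cannot represent.
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Four Hebrew letters, written as escapes so the source file
        // encoding does not matter.
        String hebrew = "\u05E9\u05DC\u05D5\u05DD";
        System.out.println(roundTripAscii(hebrew)); // prints ????
        System.out.println(roundTripAscii("abc"));  // prints abc
    }
}
```

Once the characters have been replaced this way, the original text cannot be recovered from the stored bytes.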

How are you determining that there's junk in the database?


Rajendra Murthy
Greenhorn

Joined: Mar 30, 2008
Posts: 8
Thanks for the quick response.
I am not sure how to set up the database encoding to use Unicode.
As I mentioned before, while creating the database I specified the COLLATE as SQL_Latin1_General_CP1255_CI_AS.
As for determining whether the values coming from the database are junk: I am extracting the data from the database into an MS Excel file and also into an HTML file. In both places I get junk characters.
[ March 31, 2008: Message edited by: Rajendra Murthy ]
Rajendra Murthy
Greenhorn

Joined: Mar 30, 2008
Posts: 8
Here is my database creation script

CREATE DATABASE [JSP_TRANSLATE_DB]
ON (NAME = N'JSP_TRANSLATE_DB_Data',
    FILENAME = N'D:\JSP_TRANSLATE_DB\JSP_TRANSLATE_DB_Data.MDF',
    SIZE = 21, FILEGROWTH = 10%)
LOG ON (NAME = N'JSP_TRANSLATE_DB_Log',
    FILENAME = N'D:\JSP_TRANSLATE_DB\JSP_TRANSLATE_DB_Log.LDF',
    SIZE = 56, FILEGROWTH = 10%)
COLLATE SQL_Latin1_General_CP1255_CI_AS
GO
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41815
Originally posted by Rajendra Murthy:

As for determining whether the values coming from the database are junk: I am extracting the data from the database into an MS Excel file and also into an HTML file. In both places I get junk characters.


Are both files set to use CP-1255 encoding, and a font that supports the corresponding characters? If they're set to ASCII or UTF-8, naturally you'd see junk characters.
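
A small Java sketch of why a mismatched encoding produces junk (an illustration only; the class and method names are hypothetical). It uses UTF-8 and ISO-8859-1 as stand-ins because every JVM is guaranteed to provide them. The same effect is to be expected when bytes written as CP-1255 are read with another charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetMismatchDemo {
    /** Interprets the raw bytes using the given charset. */
    static String decodeAs(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Four Hebrew letters, written as escapes so the source file
        // encoding does not matter.
        String hebrew = "\u05E9\u05DC\u05D5\u05DD";

        // The "file" is written as UTF-8: two bytes per Hebrew letter.
        byte[] stored = hebrew.getBytes(StandardCharsets.UTF_8);

        // Reading with the same charset restores the text.
        String ok = decodeAs(stored, StandardCharsets.UTF_8);
        System.out.println(ok.equals(hebrew)); // prints true

        // Reading with a different charset turns each byte into an
        // unrelated Latin-1 character: 8 junk characters instead of 4 letters.
        String junk = decodeAs(stored, StandardCharsets.ISO_8859_1);
        System.out.println(junk.length()); // prints 8
    }
}
```

Unlike the question-mark case, this kind of junk is usually reversible: the bytes are intact and only the interpretation is wrong.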
Rajendra Murthy
Greenhorn

Joined: Mar 30, 2008
Posts: 8
Originally posted by Ulf Dittmer:


Are both files set to use CP-1255 encoding, and a font that supports the corresponding characters? If they're set to ASCII or UTF-8, naturally you'd see junk characters.

I don't think there is any way to set the encoding of XLS or HTML files to CP-1255.
So, what is the solution?
By the way, I am able to read the original Hebrew characters from the HTML before it's inserted into the database.
[ March 31, 2008: Message edited by: Rajendra Murthy ]
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41815
Originally posted by Rajendra Murthy:

I don't think there is any way to set the encoding of XLS or HTML files to CP-1255.

For HTML pages the charset (which is equivalent to the encoding) is set using a Content-Type meta tag. Check the source of this page to see how that is done.
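
As a rough sketch of what that looks like when the page is generated from Java (the class and method names are hypothetical, and UTF-8 is chosen here only as an example charset): the charset declared in the meta tag has to be the same one used to turn the text into bytes. The body text is assumed to be HTML-safe already.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class HtmlCharsetDemo {
    /** Builds a page whose meta tag declares UTF-8. */
    static String buildPage(String body) {
        return "<html><head>\n"
             + "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">\n"
             + "</head><body>" + body + "</body></html>\n";
    }

    /** Writes the page using the same charset the meta tag declares. */
    static void writePage(Path file, String body) throws IOException {
        Files.write(file, buildPage(body).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempFile("hebrew", ".html");
        // Four Hebrew letters, written as escapes so the source file
        // encoding does not matter.
        writePage(out, "\u05E9\u05DC\u05D5\u05DD");
        System.out.println("Wrote " + out);
    }
}
```

If the tag says one charset and the bytes are written in another (for example via a writer that uses the platform default), the browser will show junk even though the data itself is fine.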

I think for Excel files this needs to be done at file creation time; internally, Excel uses Unicode.
[ March 31, 2008: Message edited by: Ulf Dittmer ]
Rajendra Murthy
Greenhorn

Joined: Mar 30, 2008
Posts: 8
Let me first thank you for your precious time.

I made sure to add
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-8">
while creating the HTML. IE asked for Hebrew fonts to be installed, and I did that. Unfortunately, the Hebrew characters are still not displayed properly (different junk characters are displayed now).
charset=ISO-8859-8 is for Hebrew, but I am not sure whether it corresponds to CP-1255.

EDIT: Just to try, I changed ISO-8859-8 to CP-1255, but I didn't see any change.
[ March 31, 2008: Message edited by: Rajendra Murthy ]
Rajendra Murthy
Greenhorn

Joined: Mar 30, 2008
Posts: 8
I tried defining the column as nvarchar, and now I am able to insert the Hebrew data directly into the database table. So now I am sure that the issue is NOT with SQL Server but with my PreparedStatement.



The above command works fine because I prefix the Hebrew string literal with N when inserting. But in my real code I won't be able to add the N prefix, because the string is a dynamic variable.

Here is my code piece



Please let me know how to ensure that the database recognizes that a Unicode string is being inserted.
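
One possible approach, sketched under assumptions rather than taken from the poster's code: with a JDBC 4.0 capable driver, a bound parameter can be sent as a national character (Unicode) value through PreparedStatement.setNString, so no N prefix is needed in the SQL text. The table name translations and the column source_text are hypothetical, and the column is assumed to be nvarchar. The Microsoft driver also documents a sendStringParametersAsUnicode connection property that affects how plain setString parameters are sent, so that setting is worth checking as well.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class UnicodeInsertSketch {
    // Hypothetical table and column; the column is assumed to be NVARCHAR.
    static final String INSERT_SQL =
        "INSERT INTO translations (source_text) VALUES (?)";

    /**
     * Inserts one row, binding the text as a national character value
     * instead of concatenating it into the SQL string.
     * Returns the row count reported by the driver.
     */
    static int insertText(Connection conn, String text) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            // setNString asks the driver to send the value as NCHAR/NVARCHAR.
            ps.setNString(1, text);
            return ps.executeUpdate();
        }
    }
}
```

A caller would obtain the Connection from DriverManager or a connection pool and pass the dynamic string straight in, for example insertText(conn, hebrewValue). If the string is already junk before it reaches this method (for instance because request parameters were decoded with the wrong charset), binding it correctly will not repair it.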
Rajendra Murthy
Greenhorn

Joined: Mar 30, 2008
Posts: 8
Any inputs, please?
Christopher West
Greenhorn

Joined: May 15, 2008
Posts: 1
Are the columns that you are inserting into defined as Unicode (nvarchar or nchar, as opposed to varchar and char)? Is the insert to SQL Server using the N'xxx' format? If you look at the following scripts, you'll see that if you don't include the N before the string literal, the Unicode character gets stored as a question mark.

create table Unicode_Test_Chars
(Col1 nvarchar(20))

insert Unicode_Test_Chars
values (N'ب')

select * from Unicode_Test_Chars

select unicode(substring(col1,1,1)),
nchar(unicode(substring(col1,1,1)))
from Unicode_Test_Chars


delete Unicode_Test_Chars
--Insert without the N
insert Unicode_Test_Chars
values ('ب')

select * from Unicode_Test_Chars

select unicode(substring(col1,1,1)),
nchar(unicode(substring(col1,1,1)))
from Unicode_Test_Chars
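
The same check can be made on the Java side. The sketch below (class and method names are hypothetical) lists the Unicode code points of a string, much like the unicode() calls in the script above. Running it on a value just before it is bound to the statement, and again on the value read back, shows whether the real letters or question marks are present.

```java
import java.util.ArrayList;
import java.util.List;

class CodePointDump {
    /** Returns each code point of the string in U+XXXX notation. */
    static List<String> codePoints(String s) {
        List<String> out = new ArrayList<>();
        s.codePoints().forEach(cp -> out.add(String.format("U+%04X", cp)));
        return out;
    }

    public static void main(String[] args) {
        // The Arabic letter used in the script above, written as an escape.
        System.out.println(codePoints("\u0628")); // prints [U+0628]
        // What a lossy conversion leaves behind: a plain question mark.
        System.out.println(codePoints("?"));      // prints [U+003F]
    }
}
```

Seeing U+003F where a letter was expected means the character was already lost before that point, so the problem lies earlier in the chain than the display font or the file encoding.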
 