
i18n

 
Satish Kumar Kara
Greenhorn
Posts: 8
I am working on i18n for my application. For that I need to load some master data into my database, and I need to write SQL scripts to do it. Please let me know the standard format. So far I have tried two formats.

1. I put the "\uxxxx" format in the database, but it was stored and displayed literally. I use this format in property files, and there the resource bundle works fine.

2. I put the "&#xxxx;" format in the database. It works fine in the UI. I also need to generate PDFs from this data using the iText API.

Data saved from the application itself is stored in a "¿¿¿¿¿ ¿¿¿ ¿¿¿¿¿" format, and it displays fine in the UI.

I am using JSP, servlets and an Oracle 10g database.
 
Ulf Dittmer
Rancher
Pie
Posts: 42967
73
Both the Java-specific \uxxxx and the HTML-specific &#xxxx; are bad choices for storing data in the DB. Why not store the actual Unicode text (as UTF-8) in the DB and convert it to something else only if that becomes necessary? It shouldn't: both web pages and PDFs can handle Unicode just fine.
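If some output channel ever does need the &#xxxx; form, it can be derived from the real Unicode text at display time instead of being stored. A minimal Java sketch of that idea follows; `HtmlEntities` and `escapeNonAscii` are hypothetical names, not part of any library, and the method is not a complete HTML escaper (it leaves `&`, `<` and `>` untouched).

```java
final class HtmlEntities {
    /**
     * Replaces every non-ASCII code point with a decimal numeric
     * character reference. ASCII characters are copied unchanged.
     * Hypothetical helper for illustration only.
     */
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        // Iterate by code point so a surrogate pair becomes one entity.
        text.codePoints().forEach(cp -> {
            if (cp < 0x80) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // The four escapes below spell the Hindi word "patra".
        String stored = "\u092a\u0924\u094d\u0930_Gift";
        System.out.println(escapeNonAscii(stored));
        // prints &#2346;&#2340;&#2381;&#2352;_Gift
    }
}
```

The database keeps the plain text, so other consumers such as a PDF generator never see the entities.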
 
Satish Kumar Kara
Greenhorn
Posts: 8
I am able to store data as Unicode (UTF-8) from the application using JDBC, but I am not able to insert it directly from the command prompt. Can you give me a hint on how to do that?
 
Ulf Dittmer
Rancher
Pie
Posts: 42967
73
I'm not sure what you mean by "insert directly from command prompt" - are you using a command line utility to access the DB? If so, be aware that most consoles do not handle Unicode (or much of anything besides US-ASCII, or ISO-8859 at best, actually). But there are any number of GUI DB clients that can be used instead.
 
Paul Sturrock
Bartender
Posts: 10336
Eclipse IDE Hibernate Java
Command prompts generally don't support Unicode, so you can either use a client that does or manually encode it.
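To see what a non-Unicode channel does to such text, here is a small Java sketch. It is only an illustration under assumptions: `LossyConsoleDemo` and `roundTrip` are hypothetical names, and US-ASCII stands in for whatever code page a given console actually uses.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class LossyConsoleDemo {
    /** Encodes the text in the given charset, then decodes it again. */
    static String roundTrip(String text, Charset charset) {
        // String.getBytes(Charset) substitutes the charset's replacement
        // bytes for characters it cannot encode ('?' for US-ASCII).
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // The four escapes below spell the Hindi word "patra".
        String text = "\u092a\u0924\u094d\u0930_Gift";

        // Through US-ASCII each Hindi character is lost.
        System.out.println(roundTrip(text, StandardCharsets.US_ASCII));
        // prints ????_Gift

        // Through UTF-8 the text survives unchanged.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text));
        // prints true
    }
}
```

Once a character has been replaced by a question mark, nothing downstream can recover it, so the replacement is what gets stored.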
 
Satish Kumar Kara
Greenhorn
Posts: 8
Hi, I am using the following steps to insert into the database:

C:\> set NLS_LANG=.AL32UTF8
C:\> sqlplus username/password@database

SQL>INSERT INTO TESI_CODETABLE ( CTCODETYPE, CTCODEID, CTLANGID, CTCODENAME, CTCODEDESC,
CTRELATEDCODEID, CTCREATEDATETIME, CTOBSOLETEFLAG, CTOBSOLETEDATETIME ) VALUES (
'StampDutyType', 'MH-RG-14', 'hi_IN', 'हिन्दी में_Gift', 'NULL', 'NULL', TO_Date( '02/11/2008 04:43:57 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'N', NULL);

// The characters are displayed as "??? ??_Gift" in the editor.

When I retrieve the data using a SELECT query, it shows the same "??? ??_Gift".

My DB NLS properties:

SQL> select * from nls_database_parameters where parameter like '%SET';

PARAMETER
------------------------------
VALUE
-------------------------------------------------------------------------------
NLS_NCHAR_CHARACTERSET
AL16UTF16

NLS_CHARACTERSET
AL32UTF8

@Paul: a reference would be appreciated. (I prefer manual encoding. Please suggest some tools, so that I can put the encoded values for a particular entry directly in the INSERT statement.)
 
Satish Kumar Kara
Greenhorn
Posts: 8
I found the solution to the problem.

If the database character set is AL32UTF8 (SELECT value FROM nls_database_parameters WHERE parameter='NLS_CHARACTERSET'), then it makes no sense to use NVARCHAR2. You should store data in VARCHAR2.

You can store non-English data in two ways, if you want to use SQL*Plus scripts:

1. Create a script in Notepad, just writing the foreign characters using an appropriate keyboard layout. Store the script in UTF-8 (Save As...->Encoding->UTF-8). Use any hex editor to remove the first three bytes of the file (0xEF 0xBB 0xBF), set the NLS_LANG environment variable to .AL32UTF8, and run the script in SQL*Plus.

2. Create a script in any text editor. Instead of entering character literals directly, pass them as arguments to the UNISTR function. Encode non-ASCII characters using their Unicode code points, e.g. the Hindi word "Patra" = pa + ta + virama + ra (letters) should be written as UNISTR('\092a\0924\094d\0930').
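Both approaches can be automated. The following Java sketch shows one possible way: `stripUtf8Bom` removes the three leading bytes mentioned in option 1, and `toUnistr` builds the UNISTR literal from option 2. The class and method names are hypothetical, and the escaping rules in `toUnistr` (doubling single quotes and backslashes) are my reading of how UNISTR literals are written, so check the output against your Oracle version before relying on it.

```java
import java.util.Arrays;

final class SqlScriptHelper {
    /** Returns the data without a leading UTF-8 BOM (0xEF 0xBB 0xBF), if present. */
    static byte[] stripUtf8Bom(byte[] data) {
        boolean hasBom = data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
        return hasBom ? Arrays.copyOfRange(data, 3, data.length) : data;
    }

    /** Builds a UNISTR('...') expression that needs only ASCII in the script. */
    static String toUnistr(String text) {
        StringBuilder sb = new StringBuilder("UNISTR('");
        // Walk UTF-16 code units: each non-ASCII unit becomes a backslash
        // followed by four hex digits.
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\'') {
                sb.append("''");        // quote doubling inside a SQL literal
            } else if (c == '\\') {
                sb.append("\\\\");      // literal backslash, doubled for UNISTR
            } else if (c >= 0x20 && c < 0x7F) {
                sb.append(c);           // printable ASCII stays as it is
            } else {
                sb.append(String.format("\\%04x", (int) c));
            }
        }
        return sb.append("')").toString();
    }

    public static void main(String[] args) {
        // The four escapes below spell the Hindi word "patra".
        System.out.println(toUnistr("\u092a\u0924\u094d\u0930"));
        // prints UNISTR('\092a\0924\094d\0930')
    }
}
```

The resulting expression can be pasted in place of the quoted string in the INSERT statement, so the script stays pure ASCII and does not depend on the console or editor encoding.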

Anyway, thanks a lot.
 