I am working on i18n for my application. For that I need to load some master data into my database, and I need to write SQL scripts to do it. Please let me know the standard format. So far I have tried two formats.
1. I put the "\uxxxx" format in the database, but it was stored as the literal text. I use this format in property files, where it works fine with the resource bundle.
2. I put the "&#xxxx;" format in the database. It displays fine in the UI, but I also have to generate PDFs from this data using the iText API.
I am also storing data entered through the application. It is stored as "¿¿¿¿¿ ¿¿¿ ¿¿¿¿¿" in the database, yet it displays fine in the UI.
Both the Java-specific \uxxxx and the HTML-specific &#xxxx; are bad choices for storing data in the DB. Why not store the actual Unicode text (as UTF-8) in the DB and convert it to something else only if that becomes necessary? It shouldn't: both web pages and PDFs can handle Unicode just fine.
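If some output channel really does need HTML numeric references, the conversion can happen when the page is rendered rather than in the stored data. A minimal sketch, assuming the text has already been read from the DB into a Java String; the class and method names here are hypothetical, not part of any library:

```java
class HtmlEntities {
    // Replaces every non-ASCII code point with a decimal numeric
    // character reference (&#NNNN;). ASCII is copied unchanged, so
    // characters such as '<' and '&' still need normal HTML escaping.
    static String toNumericEntities(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 0x80) {
                sb.appendCodePoint(cp);
            } else {
                sb.append("&#").append(cp).append(';');
            }
        });
        return sb.toString();
    }
}
```

The database keeps the real characters, so the same row can feed the UI, the PDF, and anything else without decoding an escape format first.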
I'm not sure what you mean by "insert directly from command prompt" - are you using a command line utility to access the DB? If so, be aware that most consoles do not handle Unicode (or much of anything besides US-ASCII, or ISO-8859 at best, actually). But there are any number of GUI DB clients that can be used instead.
If the database character set is AL32UTF8 (SELECT value FROM nls_database_parameters WHERE parameter='NLS_CHARACTERSET'), then it makes no sense to use NVARCHAR2. You should store data in VARCHAR2.
You can store non-English data in two ways, if you want to use SQL*Plus scripts:
1. Create a script in Notepad, typing the foreign characters directly with an appropriate keyboard layout. Save the script as UTF-8 (Save As...->Encoding->UTF-8). Notepad writes a byte order mark at the start of the file, so use any hex editor to remove the first three bytes (0xEF 0xBB 0xBF). Then set the NLS_LANG environment variable to .AL32UTF8 and run the script in SQL*Plus.
2. Create a script in any text editor. Instead of entering character literals directly, put them as arguments to the UNISTR function. Encode non-ASCII characters using their Unicode codes, e.g. the Hindi word "Patra"=pa+ta+virama+ra (letter) should be written as UNISTR('\092a\0924\094d\0930')
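Writing the UNISTR arguments by hand gets tedious for more than a few words. A small helper can generate them; this is only a sketch, and the class and method names are hypothetical. It assumes the usual UNISTR conventions: each non-ASCII UTF-16 code unit becomes a backslash plus four hex digits, a literal backslash is doubled, and a single quote is doubled as in any SQL string literal.

```java
class UnistrLiteral {
    // Builds an Oracle UNISTR('...') expression for the given text.
    static String toUnistr(String text) {
        StringBuilder sb = new StringBuilder("UNISTR('");
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);          // one UTF-16 code unit
            if (c == '\'') {
                sb.append("''");              // SQL quote escaping
            } else if (c == '\\') {
                sb.append("\\\\");            // backslash is UNISTR's escape character
            } else if (c >= 0x20 && c < 0x7F) {
                sb.append(c);                 // printable ASCII stays readable
            } else {
                sb.append(String.format("\\%04x", (int) c));
            }
        }
        return sb.append("')").toString();
    }

    public static void main(String[] args) {
        // The Hindi word from the example above: pa + ta + virama + ra
        System.out.println(toUnistr("\u092a\u0924\u094d\u0930"));
    }
}
```

Because the generated script contains only ASCII, it does not depend on the editor's encoding or on the NLS_LANG setting of the SQL*Plus session.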