GeeCON Prague 2014*
Building generic XML for multilingual data

Mary Cole
Ranch Hand

Joined: Dec 02, 2000
Posts: 362
Hey,
I have a general routine that builds an XML file from data in a DB and displays it in the browser.
My DB is multilingual. When the data in the DB is Chinese, the generated XML contains junk for the Chinese data, and ??? is displayed in the browser for it.
Can somebody help me build generic XML that supports multilingual data (not only Chinese; it could be Spanish, German or Japanese)?
Thanks in advance
Balaji Loganathan
author and deputy
Bartender

Joined: Jul 13, 2001
Posts: 3150
Why do you want to display the XML in the browser?
In any case, either you specify the ? characters as numeric character references (the Unicode code point) in your generated XML file, for example
&#252; => ü
&#228; => ä
or you feed the XML to an XSL stylesheet and wrap the value in a font tag, for example
<font face="chineseArial"><xsl:value-of select="/foo/foofoo"/></font>
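The first option above (numeric character references) can be sketched in Java roughly as follows. This is only an illustration: the class name NumericCharRefs and the method escapeNonAscii are made up for this example, and it only handles non-ASCII characters, so markup characters such as & and < would still need the usual XML escaping.

```java
public class NumericCharRefs {

    // Replace every character outside the ASCII range with a decimal
    // numeric character reference such as &#252; so the result is pure ASCII.
    // Hypothetical helper; it does NOT escape &, < or >.
    public static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (codePoint < 128) {
                out.append((char) codePoint);
            } else {
                out.append("&#").append(codePoint).append(';');
            }
            // Step by one or two chars so surrogate pairs stay together.
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeNonAscii("f\u00fcr"));       // f&#252;r
        System.out.println(escapeNonAscii("\u4e2d\u6587"));   // &#20013;&#25991;
    }
}
```

Because the result is plain ASCII, it should survive any transport or editor that would otherwise mangle the original characters.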


Spritle Software Blogs
Simon Harvey
Ranch Hand

Joined: Jan 26, 2003
Posts: 79
Mary,
This is an area that I am getting to grips with at the moment. The thing that strikes me is that you haven't mentioned character sets. The typical character sets used by many technologies, including Java, only contain characters for the western languages, and even then probably not all of them. To show all characters you'll need to identify a character set that can handle the characters you want, and you will also need to tell the browser what character set you are using to send data to the client.
Look into internationalization and localisation, both within the context of Java and more generally.
Good luck. I hope that's what you need.
Simon
Mary Cole
Ranch Hand

Joined: Dec 02, 2000
Posts: 362
Guys, thanks a lot.

What I need is to generate the XML (which might contain a mixture of English, Spanish and Chinese data) and send that XML file by e-mail.
For testing I want to see it in the browser to verify whether what I have done is correct.
While viewing it in the browser, the Chinese data is displayed as ??? (or some junk characters).
So if I specify the encoding as UTF-8 in the XML, will that solve the problem, or is there something more I need to do?
Please reply
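A minimal sketch of what "specifying UTF-8 in the XML" involves, assuming the document is built as a Java String: the encoding named in the XML declaration has to match the encoding actually used to turn the text into bytes. The class Utf8XmlWriter, the method buildXml and the element name message are invented for this example, and the text is assumed to contain no markup characters.

```java
import java.nio.charset.StandardCharsets;

public class Utf8XmlWriter {

    // Build a tiny XML document and encode it as UTF-8 bytes.
    // The declaration says UTF-8 and getBytes really uses UTF-8;
    // the two must agree or a parser will misread the data.
    // Assumes text contains no &, < or > (no escaping is done here).
    public static byte[] buildXml(String text) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                   + "<message>" + text + "</message>\n";
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = buildXml("Hola \u4e2d\u6587");
        // Decoding with the same charset gives the original text back.
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

The declaration alone does not convert anything; it only tells the reader how the bytes were produced.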
Simon Harvey
Ranch Hand

Joined: Jan 26, 2003
Posts: 79
Mary,
I'm afraid I don't have enough time to solve your problem for you, but I can definitely provide you with some directions to go in.
http://java.sun.com/j2se/1.3/docs/guide/intl/
http://developer.java.sun.com/developer/technicalArticles/Intl/
These two areas will provide you with a lot of information. There will be some superfluous information in there, but just search for the material on character encoding and you'll be fine.
One other thing. I think UTF-8 is the default western encoding, which is exactly what you don't need. I'm not sure, but it might be. Also, I think that even though you may be using Unicode, somewhere along the line, maybe in the browser or in JavaMail, only a certain number of characters get sent. It's a fairly confusing area, but I'm sure what you are asking is perfectly doable and won't be too hard once you actually find what you need.
Post back if you need any more help, and let me know if you hit the jackpot.
Simon
Mary Cole
Ranch Hand

Joined: Dec 02, 2000
Posts: 362
One thing to remember is that I cannot have resource bundle entries for all the data in the DB.
I am setting this
response().setHeader("Content-Type","text/xml; charset=UTF-8");
to convert to UTF-8.
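A Content-Type header only labels the response; the bytes themselves still have to be produced with a charset that can represent the characters. In the Servlet API, setContentType or setCharacterEncoding called before getWriter() is what normally selects the writer's encoding, and whether a raw setHeader call has the same effect may depend on the container. The sketch below leaves servlets out entirely and just shows, with plain Java strings, where the ??? can come from. QuestionMarkDemo and roundTrip are names made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QuestionMarkDemo {

    // Encode text to bytes with the given charset, then decode it again.
    // Characters the charset cannot represent are replaced during encoding.
    public static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587";
        // ISO-8859-1 has no Chinese characters, so each one becomes '?'.
        System.out.println(roundTrip(chinese, StandardCharsets.ISO_8859_1)); // ??
        // UTF-8 can represent them, so the text comes back unchanged.
        System.out.println(roundTrip(chinese, StandardCharsets.UTF_8).equals(chinese)); // true
    }
}
```

If the output already contains literal question marks at this stage, no header or browser setting can bring the original characters back.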
Lasse Koskela
author
Sheriff

Joined: Jan 23, 2002
Posts: 11962
    
I'm not too knowledgeable on character sets and encodings but shouldn't we use UTF-16 for Chinese and other non-western characters?
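For what it's worth, UTF-8 and UTF-16 are both encodings of the full Unicode character set, so either can carry Chinese text; they differ mainly in how many bytes each character takes. A small sketch to check this (the class Utf8VsUtf16 and its two helper methods are invented for illustration, and UTF-16BE is used so that no byte order mark is counted):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf8VsUtf16 {

    // Number of bytes the text occupies in the given encoding.
    public static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // True if encoding and then decoding gives the same text back.
    public static boolean survives(String text, Charset charset) {
        return new String(text.getBytes(charset), charset).equals(text);
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587";
        System.out.println(survives(chinese, StandardCharsets.UTF_8));         // true
        System.out.println(survives(chinese, StandardCharsets.UTF_16BE));      // true
        System.out.println(encodedLength(chinese, StandardCharsets.UTF_8));    // 6
        System.out.println(encodedLength(chinese, StandardCharsets.UTF_16BE)); // 4
        System.out.println(encodedLength("abc", StandardCharsets.UTF_8));      // 3
        System.out.println(encodedLength("abc", StandardCharsets.UTF_16BE));   // 6
    }
}
```

So the choice between the two is about size and compatibility, not about which languages are supported.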


Author of Test Driven (2007) and Effective Unit Testing (2013) [Blog] [HowToAskQuestionsOnJavaRanch]
Mary Cole
Ranch Hand

Joined: Dec 02, 2000
Posts: 362
Guys, any luck?
Balaji Loganathan
author and deputy
Bartender

Joined: Jul 13, 2001
Posts: 3150
Originally posted by Mary, Cole:

So If I specify the encoding as UTF-8 in the XML..will it solve the problem or is there something I need to do more
Pls reply

Please read my previous post again; that is the only way I see for now.
You have to replace the characters with their Unicode numeric references in your XML or in your DB. Then you can view the XML document directly in IE or Mozilla. Netscape does not fully support all Unicode characters.
Once the conversion is done, don't try to open the XML file in a Notepad-like editor; use XMLSpy or any editor that supports Unicode.
Simon Harvey
Ranch Hand

Joined: Jan 26, 2003
Posts: 79
Hi Mary,
A couple of thoughts and questions:
I'm guessing this would be too easy, but have you gone to View->Encoding in IE and changed the encoding that IE is using to display your info?
You see, the fact that IE is making your text look all buggered says to me that the data you are presenting to IE is in a format that IE isn't expecting and doesn't understand. This could be really good news, because it sounds like you are storing and processing your data appropriately right up to the point that IE displays it. IE could be the problem. Use the View->Encoding options to tell it how to interpret what you are giving it.
Simon
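The effect described here, correct bytes being interpreted with the wrong encoding, can be reproduced in a few lines of Java. This is only a sketch of the general idea, not of what IE does internally; WrongDecodingDemo and decode are made-up names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WrongDecodingDemo {

    // Interpret raw bytes as text using the charset the reader assumes.
    public static String decode(byte[] data, Charset assumed) {
        return new String(data, assumed);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587";
        // The sender encodes correctly as UTF-8 (3 bytes per character here).
        byte[] sent = original.getBytes(StandardCharsets.UTF_8);

        // A reader that assumes ISO-8859-1 turns each byte into one junk character.
        String garbled = decode(sent, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length());           // 6
        System.out.println(garbled.equals(original));   // false

        // A reader that assumes UTF-8 recovers the original text.
        System.out.println(decode(sent, StandardCharsets.UTF_8).equals(original)); // true
    }
}
```

In this case the data is intact and only the interpretation is wrong, which is why changing the browser's encoding setting can fix the display.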
Mary Cole
Ranch Hand

Joined: Dec 02, 2000
Posts: 362
Hey Balaji,
Can you elaborate on how using XSL will solve it?
Please give me a detailed example.
Thanks in advance
Balaji Loganathan
author and deputy
Bartender

Joined: Jul 13, 2001
Posts: 3150
Originally posted by Mary, Cole:
Hey Balaji,
Can you elaborate on how using XSL will solve it?
Please give me a detailed example.
Thanks in advance


By XSL:
If your input XML contains some Chinese/Japanese characters, then IE may show them as ? or as a rectangle.
To avoid this, you can specify the font name for these values.
<font face="chineseArial"><xsl:value-of select="/foo/fooElem"/></font>
chineseArial is a font that you have installed on your PC (say in C:\winnt\fonts); this is just like using Arial, Times New Roman, MS Sans Serif, etc.
You might have seen this on a normal website where you have to download and install a specific font file in order to see the non-English characters.
If you are still confused, please reply back with the problem.
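A rough sketch of how such a stylesheet could be applied from Java with the JDK's built-in javax.xml.transform API. Several things here are assumptions for the sake of a runnable example: the class FontXslDemo and the method render are invented, the stylesheet is inlined as a string, the HTML face attribute and a forward-slash XPath are used, and UTF-8 is declared as the output encoding. The font chineseArial is simply the name from the post and must exist on the viewer's machine to have any effect.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class FontXslDemo {

    // Stylesheet that wraps the value of /foo/fooElem in a font tag.
    static final String XSL =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"html\" encoding=\"UTF-8\"/>"
        + "<xsl:template match=\"/\">"
        + "<html><body><font face=\"chineseArial\">"
        + "<xsl:value-of select=\"/foo/fooElem\"/>"
        + "</font></body></html>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Apply the stylesheet to the given XML text and return the HTML.
    public static String render(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSL transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("<foo><fooElem>\u4e2d\u6587</fooElem></foo>"));
    }
}
```

The font tag only affects which glyphs the browser can draw; the characters themselves still have to arrive in an encoding the browser reads correctly.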
mary morris
Ranch Hand

Joined: Mar 16, 2002
Posts: 97
Mary,
I have been reading this thread with interest. Have you solved it yet? I am looking to display an XML file in English and the same file in French. The thing is, the English file has to be translated into French by a translator, and then we need a way to not mess up the French file. I assume all the languages you are dealing with have been translated?
MM
Mary Cole
Ranch Hand

Joined: Dec 02, 2000
Posts: 362
Hey guys,
Thanks a lot. The XSL idea works well for me.
 