writing UCS-2 encoded files with Java

William Stafford
Ranch Hand

Joined: Dec 13, 2004
Posts: 109
Is there any way to write a UCS-2 file from a Java application?

We have a utility, written in Java, that reads Oracle data and exports it as UTF-8 files, which can then be imported by PostgreSQL. We would like to extend the utility to support import by Microsoft DB products, which apparently do not recognize UTF-8 and instead require UCS-2.

When I write the files the usual way, I get an UnsupportedEncodingException. The usual way being:
BufferedWriter fileWriter =
    new BufferedWriter(new OutputStreamWriter(new FileOutputStream(outFile), "UCS-2"));

This UnsupportedEncoding sounds pretty final! Is there any way around this?

Thanks for any advice,
-=bill
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42286
I think that as long as there aren't any code points beyond the BMP (i.e., above U+FFFF, or 65535), UCS-2 is identical to UTF-16, so you might try that. Depending on how the files are processed, a byte order mark may be necessary.
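A minimal sketch of that suggestion, not a tested import recipe: it checks that the text stays inside the BMP and then writes it as UTF-16 with an explicit byte order mark. Little-endian order is an assumption here (it is common for Windows tools, but check what your import tool expects), and the class name, method names and sample row are made up for illustration.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class Ucs2Export {

    /** Returns true if the text contains no surrogates, i.e. only BMP code points. */
    static boolean isBmpOnly(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (Character.isSurrogate(text.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    /** Writes a little-endian BOM (FF FE) followed by the text encoded as UTF-16LE. */
    static void writeUtf16Le(OutputStream out, String text) throws IOException {
        // Java's UTF-16LE charset does not emit a BOM, so write it by hand.
        out.write(0xFF);
        out.write(0xFE);
        Writer writer = new BufferedWriter(
                new OutputStreamWriter(out, StandardCharsets.UTF_16LE));
        writer.write(text);
        writer.flush(); // flush only; the caller owns and closes the stream
    }

    public static void main(String[] args) throws IOException {
        String row = "42\tZürich\n"; // hypothetical sample row
        if (!isBmpOnly(row)) {
            throw new IllegalArgumentException("text has code points outside the BMP");
        }
        File outFile = File.createTempFile("export", ".txt");
        try (OutputStream out = new FileOutputStream(outFile)) {
            writeUtf16Le(out, row);
        }
        System.out.println("wrote " + outFile.length() + " bytes to " + outFile);
    }
}
```

If the importer wants big-endian instead, the BOM bytes become FE FF and the charset becomes StandardCharsets.UTF_16BE.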


William Stafford
Ranch Hand

Joined: Dec 13, 2004
Posts: 109
Ulf,
Thanks for the reply. Is there any way to know in advance what byte ordering would be used by MS SQL Server? Or is this something that is typically specified on a per-database basis?

Thanks again,
-=bill
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42286
Byte order marks are only used in files, not databases. What you do need to worry about is the encoding the database uses (which by default is often an ISO-8859 variant or even a Windows "ANSI" code page), so you may have to change that.
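To see what byte order and byte order mark a file would actually start with, it can help to dump the bytes that Java's built-in UTF-16 charsets produce for a single character. This is only an illustrative sketch; the class and helper names are made up.

```java
import java.nio.charset.StandardCharsets;

class ByteOrderDemo {

    /** Formats bytes as space-separated upper-case hex, e.g. "FE FF 00 41". */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "UTF-16" encodes big-endian and prepends a big-endian BOM.
        System.out.println("UTF-16:   " + hex("A".getBytes(StandardCharsets.UTF_16)));
        // The explicit-endianness charsets write no BOM at all.
        System.out.println("UTF-16BE: " + hex("A".getBytes(StandardCharsets.UTF_16BE)));
        System.out.println("UTF-16LE: " + hex("A".getBytes(StandardCharsets.UTF_16LE)));
    }
}
```

The first line shows FE FF 00 41, the second 00 41 and the third 41 00, so a file written with plain "UTF-16" is big-endian with a BOM, while the other two need a BOM added by hand if the reader relies on one.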
 