Maintaining ISO-8859-1 encoding without using Java String
posted 4 years ago
OK, here's the problem: we have an Oracle database storing data in ISO-8859-1 (aka Latin-1), and Java uses UTF-16 internally for the String class. We need to maintain Latin-1 across our entire process.
Our process goes like this:
Get a list of document ids.
For each doc id, execute a SQL query against Oracle for the document metadata.
For each record, retrieve a specific string column and concatenate the results as we move through the recordset.
Store the result in a Hashtable and execute the next query.
Combine the results from the Hashtable with a List containing a template.
Write the result to a file.
Get the next document.
I'm looking at treating all the strings as BufferedReaders and using ResultSet.getCharacterStream to retrieve the data, but I'm at a loss as to how to manage the Hashtable / List content, and these pieces are critical to the operation. If I store the content as a String, then Java will use UTF-16 and convert the data. I can't have that happen, as the data needs to stay in the Latin-1 format from Oracle.
I think you need to take a closer look at what it means to "maintain Latin-1 across the entire process". No doubt this is some edict from an architect somewhere. As a programmer, I wouldn't interpret that as meaning I'm not allowed to use the String type anywhere in my Java code. That (in my opinion) would be absurd. Every ISO-8859-1 character has a Unicode equivalent, so nothing is lost by holding the text in a String in between. All I would do in your situation is make sure that the Writer I used to write to the output file used the ISO-8859-1 encoding. As long as it's ISO-8859-1 into the black box and ISO-8859-1 out of the black box, I would say you're okay.
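To make the "Latin-1 in, Latin-1 out" point concrete, here is a minimal sketch, not the poster's actual code. The class name Latin1RoundTrip, its helper methods, and the hard-coded byte array (standing in for a column value) are all made up for illustration. In the real process the JDBC driver does the decoding when you call ResultSet.getString or getCharacterStream. The sketch decodes Latin-1 bytes into an ordinary String, writes that String through a Writer configured for ISO-8859-1, and checks that the bytes on disk match the originals.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical demo class; names are illustrative only.
class Latin1RoundTrip {

    // Decode Latin-1 bytes into a String. Each byte becomes exactly one char
    // in the range U+0000..U+00FF, so no information is lost.
    static String decode(byte[] latin1Bytes) {
        return new String(latin1Bytes, StandardCharsets.ISO_8859_1);
    }

    // Write text to a file through a Writer that encodes as ISO-8859-1.
    static void writeLatin1(Path file, String text) throws IOException {
        try (Writer out = Files.newBufferedWriter(file, StandardCharsets.ISO_8859_1)) {
            out.write(text);
        }
    }

    // Write text to a temporary file as Latin-1 and return the raw bytes
    // that ended up on disk. The temporary file is deleted afterwards.
    static byte[] writeAndReadBack(String text) {
        try {
            Path file = Files.createTempFile("latin1-demo", ".txt");
            try {
                writeLatin1(file, text);
                return Files.readAllBytes(file);
            } finally {
                Files.delete(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed stand-in for a database value: "café" encoded as Latin-1.
        byte[] fromDatabase = {0x63, 0x61, 0x66, (byte) 0xE9};

        // Held as a normal String; it can go into a Hashtable or List as usual.
        String inMemory = decode(fromDatabase);

        byte[] written = writeAndReadBack(inMemory);
        System.out.println(Arrays.equals(fromDatabase, written)); // prints "true"
    }
}
```

The only places the encoding matters are the two boundaries: where bytes become characters and where characters become bytes again. The one thing to watch is text picked up from elsewhere (the template, for example) that contains characters outside Latin-1, since those cannot be encoded on the way out.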