
conversion problem (charset ?)

 
Greenhorn
Posts: 9
I need to read a Word document from a hard drive and insert it into a BLOB column in an Oracle database.
Once the document is in the DB, users can open it with Winword. To achieve this goal within the framework I must use, we retrieve the BLOB, write it to disk, and then open it with Winword via Runtime.exec().

Upload and download of the document are OK (same size).
This works fine with simple text documents, but it doesn't work with complex Word documents.

After comparing the original and the produced file (by opening both documents in Notepad), we noticed that some characters are different ... for example, the euro symbol in the original file is replaced by a question mark (?) in the one produced from the BLOB.
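
Notepad interprets the bytes through its own encoding, so a byte-level comparison is a more reliable check than a visual one. A minimal sketch, assuming both files fit in memory and have the same length (as reported above); the class name `ByteDiff` and the method `changedValues` are made up for illustration:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

final class ByteDiff {
    /**
     * For every position where the two arrays differ, counts the ORIGINAL
     * byte value found there. Returns value -> number of damaged positions.
     */
    static Map<Integer, Integer> changedValues(byte[] original, byte[] produced) {
        if (original.length != produced.length) {
            throw new IllegalArgumentException("lengths differ");
        }
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int i = 0; i < original.length; i++) {
            if (original[i] != produced[i]) {
                counts.merge((int) original[i], 1, Integer::sum);
            }
        }
        return counts;
    }

    // Usage: java ByteDiff original.doc produced.doc
    public static void main(String[] args) throws IOException {
        byte[] original = Files.readAllBytes(Paths.get(args[0]));
        byte[] produced = Files.readAllBytes(Paths.get(args[1]));
        System.out.println(changedValues(original, produced));
    }
}
```

If only a handful of distinct byte values show up in the result, that points at a character-set conversion rather than random corruption.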

Based on this, I suspect a problem with the charset used.
Locale.setDefault(Locale.UK) is set at the beginning of the application.

Do you think using a charset decoder/encoder could help?
Any suggestions welcome.

Thanks in advance for your help.

Pascal
 
Ranch Hand
Posts: 1923
Scala Postgres Database Linux
I haven't worked with BLOBs so far, but from the name (binary large object) I would expect a database to save the bytes as they are and not translate anything.

How do you save and restore it?
 
author & internet detective
Posts: 41994
Eclipse IDE VI Editor Java
Pascal,
We had this problem with CLOBs. The wrong encoding was set on the db server.

I'm surprised you are getting it with BLOBs as that is just data.
 
pascal monfils
Greenhorn
Posts: 9
In fact, we work in a strongly typed environment.
All data is passed from the client to the server through "basetypes" (typed objects).
The only way to get at the data a basetype contains is through some kind of representation, which is a string.

What I do is:
get the string from the basetype
get the bytes from that string
insert a record in the DB using the empty_blob() function
"select ... for update" the created record
get the Blob object from the result set
create a ByteArrayInputStream over the byte[] from the string
get an output stream on the blob
copy the bytes into the blob (input stream to output stream)
close the streams
commit
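
The first two steps are the only place in this list where a character encoding is involved; everything after them moves raw bytes. If the framework really forces binary data through a string, one common way to keep it lossless is to use ISO-8859-1 on both sides, because that charset maps each of the 256 byte values to exactly one character and back. A minimal sketch, assuming the code that fills the basetype and the code that reads it can both name the charset explicitly (the thread does not say whether the framework allows this); `BinaryAsString`, `wrap` and `unwrap` are hypothetical names:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class BinaryAsString {
    /** Carries raw bytes in a String, one char (U+0000..U+00FF) per byte. */
    static String wrap(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    /** Recovers the original bytes from a String produced by wrap(). */
    static byte[] unwrap(String carried) {
        return carried.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Exercise every possible byte value once.
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        byte[] back = unwrap(wrap(all));
        System.out.println("lossless: " + Arrays.equals(all, back));
    }
}
```

By contrast, calling `getBytes()` with no argument uses the platform default charset, which may not be able to represent every character in the string.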

Those basetypes are available on the server and on the client.
On the client, if I load the data into a basetype (open file) and then write the content of the basetype to disk under a different name, both files are strictly identical.

Regarding the database, the DB is also accessed by an application written in PB, and importing documents into the BLOB column works fine there.

I suspect the JDBC layer uses some kind of locale or charset defined somewhere and performs a conversion on the string part, not on the bytes ...
 
pascal monfils
Greenhorn
Posts: 9
Some new info ...

I identified the bytes which are different ...
Only 75 are bad out of a file size of 52220 bytes!
Only 5 distinct byte values occur among these 75 bytes (-112, -115, -127, -113, -99).

In each case, those bytes are replaced by the value 63!

I really don't understand what is happening!!!
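
The value 63 is the ASCII code for '?', which is what `String.getBytes` substitutes for any character the target charset cannot encode. The five damaged values, read as unsigned bytes, are 0x90, 0x8D, 0x81, 0x8F and 0x9D, and those are exactly the five positions that windows-1252 leaves undefined. That would fit a bytes to String to bytes round trip through windows-1252, but that this was the charset in effect here is an assumption and is not stated in the thread. The sketch below lists which byte values fail such a round trip for a given charset; `RoundTripLoss` and `damagedBytes` are hypothetical names:

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

final class RoundTripLoss {
    /** Returns every byte value that does not survive bytes -> String -> bytes in cs. */
    static List<Integer> damagedBytes(Charset cs) {
        List<Integer> damaged = new ArrayList<>();
        for (int v = -128; v <= 127; v++) {
            byte[] in = {(byte) v};
            // Undecodable input becomes U+FFFD; unencodable chars become the
            // charset's replacement byte (usually '?', i.e. 63).
            byte[] out = new String(in, cs).getBytes(cs);
            if (out.length != 1 || out[0] != in[0]) {
                damaged.add(v);
            }
        }
        return damaged;
    }

    public static void main(String[] args) {
        // Assumption: windows-1252 was the charset used for the conversion.
        System.out.println(damagedBytes(Charset.forName("windows-1252")));
    }
}
```

If the printed list matches the five values found in the file, the string conversion step is the likely culprit, and the bytes should be kept out of any default-charset `String` conversion.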

Help still greatly needed
 