I am reading a file that contains EBCDIC data (English and Arabic) in binary format. My basic task is to convert this data to UTF-8.
I am able to read the data byte by byte. For each byte read, I can find the EBCDIC hex value, and from it I can look up the corresponding hex value in ASCII (for English characters and numerals) and in Arabic (using code page 1256).
Now my problem is: how should I write these hex values (ASCII and Arabic) to the file so that I get the corresponding text in my output file?
Another option is to use the java.nio.charset package. It contains classes to encode and decode characters from one character set to another.
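A minimal sketch of that approach, assuming the whole input fits in memory. The class and method names are made up for illustration, and "IBM420" is assumed to be the name under which your JVM exposes the EBCDIC Arabic code page (it is not one of the charsets every runtime must provide, so the code checks first):

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
final class CharsetTranscode {

    /** Decodes the bytes with the source charset, then re-encodes the characters as UTF-8. */
    static byte[] toUtf8(byte[] input, Charset source) {
        CharBuffer chars = source.decode(ByteBuffer.wrap(input));
        ByteBuffer encoded = StandardCharsets.UTF_8.encode(chars);
        byte[] result = new byte[encoded.remaining()];
        encoded.get(result);
        return result;
    }

    public static void main(String[] args) {
        String name = "IBM420"; // assumption: EBCDIC Arabic, only present if the runtime ships it
        if (!Charset.isSupported(name)) {
            System.err.println("Charset not available in this runtime: " + name);
            return;
        }
        byte[] sample = {(byte) 0xF1, (byte) 0xF2, (byte) 0xF3}; // arbitrary sample bytes
        byte[] utf8 = toUtf8(sample, Charset.forName(name));
        System.out.println(new String(utf8, StandardCharsets.UTF_8));
    }
}
```

Note that Charset.decode silently substitutes a replacement character for bytes it cannot map. If you would rather get an exception, use a CharsetDecoder from source.newDecoder() configured with CodingErrorAction.REPORT.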
Good luck, comrade!
Following is my code. Here I am reading bytes of data, specifying that they are in Cp420 (EBCDIC Arabic) format, and then writing them to the output file in UTF-8.
However, there seems to be a problem. Some junk characters are getting written to the file, especially for bytes whose hex value contains a letter, for example 8D or 8C. If the hex value consists only of digits, the output is correct.
What am I doing wrong in the code?
Also I need to insert a carriage return after every bytes of data read.
If you are dealing with characters, why do you read the file using byte streams? You want to convert a file from one encoding to another, so use the Reader and Writer classes instead. It's simpler, unless you want to write your own decoder/encoder.
Simply open the file using the Cp420 encoding and write it to a file using the UTF-8 encoding.
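Something along these lines should do it. This is only a sketch: the class name, the method name and the command-line arguments are made up, and I use the name "IBM420" on the assumption that your JVM exposes the Cp420 code page under it (Charset.forName throws UnsupportedCharsetException if it does not).

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
final class ReaderWriterConvert {

    /** Reads characters decoded with the source charset and writes them out as UTF-8. */
    static void convert(InputStream in, Charset source, OutputStream out) throws IOException {
        try (Reader reader = new BufferedReader(new InputStreamReader(in, source));
             Writer writer = new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
            char[] buffer = new char[4096];
            int n;
            // Copy decoded characters until end of input.
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] = input file, args[1] = output file (illustrative only)
        try (InputStream in = new FileInputStream(args[0]);
             OutputStream out = new FileOutputStream(args[1])) {
            convert(in, Charset.forName("IBM420"), out); // assumption: charset is installed
        }
    }
}
```

The Reader does the EBCDIC decoding and the Writer does the UTF-8 encoding, so your code never has to look up hex values itself.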