character encoding issue

Huan Niu
Greenhorn

Joined: Sep 21, 2007
Posts: 11
The assignment description contains the following statement:

... All text values, and all fields (which are text only), contain only 8 bit characters, null terminated if less than the maximum length for the field. The character encoding is 8 bit US ASCII.

Please notice it is "8 bit US ASCII".

So when I call string.getBytes( charsetName ), I want to pass the proper charset name to get the bytes for the string. I looked up the Charset class in the Java 1.5 API, and it says:


US-ASCII Seven-bit ASCII, a.k.a. ISO646-US, a.k.a. the Basic Latin block of the Unicode character set
...
UTF-8 Eight-bit UCS Transformation Format


Which one should I use? Or are there any other suggestions?

Thanks a lot.
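
For reference, the "null terminated if less than the maximum length" rule quoted above could be handled along these lines. This is only a sketch: the class name FieldDecoder, the helper decodeField, and the 8-byte sample field are made up for illustration and do not come from the assignment, and StandardCharsets requires Java 7 or later (on Java 1.5 you would pass the charset name as a String instead).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class FieldDecoder {
    // Hypothetical helper: decodes one fixed-length field, stopping at the
    // first null byte when the value is shorter than the field.
    static String decodeField(byte[] raw, Charset charset) {
        int length = 0;
        while (length < raw.length && raw[length] != 0) {
            length++;
        }
        return new String(raw, 0, length, charset);
    }

    public static void main(String[] args) {
        // An 8-byte field holding "Fred", null terminated and zero padded.
        byte[] raw = {'F', 'r', 'e', 'd', 0, 0, 0, 0};
        System.out.println(decodeField(raw, StandardCharsets.US_ASCII)); // prints "Fred"
    }
}
```

The charset is a parameter on purpose, so the same helper works whichever encoding you settle on.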
jesal dosa
Ranch Hand

Joined: Jun 25, 2007
Posts: 46
Use it like this:

String fieldValue = new String(field_name, "US-ASCII");
mohamed sulibi
Ranch Hand

Joined: Sep 04, 2005
Posts: 169
Hi all,

I also want to ask: is it correct to use the following?

byte[] fieldByte = ....;
String field = new String(fieldByte);

Best regards
m_darim
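
A note on the one-argument constructor asked about above: new String(byte[]) decodes with the platform's default charset, so the result can differ from machine to machine. The sketch below illustrates this by decoding the same two bytes with two explicitly named charsets. The class name DefaultCharsetDemo and the helper decode are hypothetical, and StandardCharsets assumes Java 7 or later.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetDemo {
    // Decodes the bytes with an explicitly named charset, so the result
    // does not depend on the machine the code runs on.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0xC3, (byte) 0xA9};

        // Platform dependent: uses whatever Charset.defaultCharset() returns.
        System.out.println("default (" + Charset.defaultCharset() + "): " + new String(bytes));

        // Explicit charsets: one character under UTF-8, two under ISO-8859-1.
        System.out.println("UTF-8 length: " + decode(bytes, StandardCharsets.UTF_8).length());
        System.out.println("ISO-8859-1 length: " + decode(bytes, StandardCharsets.ISO_8859_1).length());
    }
}
```

For data that really is plain ASCII, the default charset will usually give the same result, but naming the charset removes the dependency on the platform.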
Huan Niu
Greenhorn

Joined: Sep 21, 2007
Posts: 11
Thanks for the reply.

use it like this

String fieldValue = new String(field_name, "US-ASCII");


But "US-ASCII" is 7-bit, not 8-bit.

Does this fulfil the requirement?
[ October 16, 2007: Message edited by: Huan Niu ]
jesal dosa
Ranch Hand

Joined: Jun 25, 2007
Posts: 46
Yep, "US-ASCII" is 7-bit. If you search this forum you will find that "US-ASCII" is the correct one to use. I can't remember what I searched for, but a couple of weeks ago someone confirmed it in a reply after asking Sun.

I hope this helps
Huan Niu
Greenhorn

Joined: Sep 21, 2007
Posts: 11
Hi, jesal

That is exactly what I was hoping for.

Thank you very much.
Edwin Dalorzo
Ranch Hand

Joined: Dec 31, 2004
Posts: 961
My specification says I should use an 8-bit character encoding. If I used a 7-bit character encoding, some characters would not be representable.

Some 8-bit character encodings that you can use are those of the ISO-8859 family, such as:

ISO-8859-1
ISO-8859-2
ISO-8859-3
ISO-8859-4
ISO-8859-5
ISO-8859-7
ISO-8859-9
ISO-8859-13
ISO-8859-15

There is also windows-1252, also known as Cp1252.

I would never recommend using US-ASCII, since it is a 7-bit character encoding, and that is not what the specification requires.
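
To make the 7-bit versus 8-bit difference concrete, here is a small sketch comparing US-ASCII with ISO-8859-1 on a character above 127. The class name SevenVsEightBit and the sample text are made up for illustration, and StandardCharsets assumes Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

public class SevenVsEightBit {
    static byte[] encodeAscii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    static byte[] encodeLatin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "caf\u00E9"; // the last character is U+00E9, outside 7-bit ASCII

        // Encoding: US-ASCII cannot map U+00E9 and substitutes '?' (63).
        System.out.println(encodeAscii(text)[3] & 0xFF);  // prints 63
        System.out.println(encodeLatin1(text)[3] & 0xFF); // prints 233

        // Decoding: the byte 0xE9 is malformed in US-ASCII and becomes U+FFFD.
        byte[] raw = {(byte) 0xE9};
        System.out.println((int) new String(raw, StandardCharsets.US_ASCII).charAt(0));   // prints 65533
        System.out.println((int) new String(raw, StandardCharsets.ISO_8859_1).charAt(0)); // prints 233
    }
}
```

For characters in the range 0 to 127 both charsets produce identical bytes, so the choice only matters if the data file actually contains bytes with the high bit set.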

See Java Supported Encodings
 