writeUTF() and writeBytes()

Richard Jackson
Ranch Hand

Joined: Jun 25, 2003
Posts: 128
In the DataOutput interface, there are writeUTF(String str) and writeBytes(String s) methods.

There is a sentence in instructions.html,
All text values, and all fields (which are text only), contain only 8 bit characters, null terminated if less than the maximum length for the field. The character encoding is 8 bit US ASCII.

According to the instructions, which one is more appropriate for writing a String to the file?

It looks simple, but I have been confused for a long time. Please comment and clarify!


Regards, Richard
Anton Golovin
Ranch Hand

Joined: Jul 02, 2004
Posts: 476
It's writeBytes. Converting from an ASCII String is thus:

byte[] bytes = "".getBytes("ASCII");

To a String:

byte[] bytes = new byte[recordLength];
raf.readFully(bytes);

String string = new String(bytes, "ASCII");


Anton Golovin (anton.golovin@gmail.com) SCJP, SCJD, SCBCD, SCWCD, OCEJWSD, SCEA/OCMJEA [JEE certs from Sun/Oracle]
Marlene Miller
Ranch Hand

Joined: Mar 05, 2003
Posts: 1391
Here is a test to compare writeBytes and writeUTF.

0 14 82 105 99 104 97 114 100 74 97 99 107 115 111 110
82 105 99 104 97 114 100 74 97 99 107 115 111 110

0 6 -50 -79 -50 -78 -50 -77
-79 -78 -77

writeUTF starts with two bytes for the length of the following data.
writeBytes discards the high-order 8 bits of a 16-bit Unicode character.
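
Output like the figures above can be reproduced with a small harness along these lines. This is only a sketch, not the original test: the class name WriteCompare and its two helper methods are made up for illustration, and it writes to an in-memory stream instead of a file.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical harness; names are illustrative and not from the thread.
class WriteCompare {

    // Returns the bytes DataOutputStream.writeUTF emits for s.
    static byte[] viaWriteUTF(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream and short strings.
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Returns the bytes DataOutputStream.writeBytes emits for s.
    static byte[] viaWriteBytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeBytes(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        // A plain ASCII sample and the Greek letters alpha, beta, gamma.
        String[] samples = { "RichardJackson", "\u03b1\u03b2\u03b3" };
        for (String s : samples) {
            System.out.println(Arrays.toString(viaWriteUTF(s)));
            System.out.println(Arrays.toString(viaWriteBytes(s)));
        }
    }
}
```

The byte values should agree with the figures quoted above, although Arrays.toString prints them with brackets and commas.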

writeUTF also inserts extra bits.
Greek alpha is \u03b1
Display as binary: 0000 0011 1011 0001
Another view: 00000 01110 110001
UTF-8 drops some 0's and inserts 110 and 10: 110 01110 10 110001
Another view: 11001110 10110001
Display as decimal: -50 -79
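
The same bit shuffling can be written out in code. The sketch below handles only the two-byte case (U+0080 to U+07FF), which is all the Greek example needs; Utf8ByHand and encodeTwoByte are hypothetical names chosen for this illustration.

```java
import java.util.Arrays;

// Illustrative only: hand-encodes a character into two UTF-8 bytes.
class Utf8ByHand {

    // Encodes a character in the range U+0080..U+07FF as two UTF-8 bytes.
    static byte[] encodeTwoByte(char c) {
        if (c < 0x80 || c > 0x7FF) {
            throw new IllegalArgumentException("not a two-byte UTF-8 character");
        }
        // First byte: marker bits 110 followed by the upper 5 of the 11 payload bits.
        byte first = (byte) (0xC0 | (c >> 6));
        // Second byte: marker bits 10 followed by the lower 6 payload bits.
        byte second = (byte) (0x80 | (c & 0x3F));
        return new byte[] { first, second };
    }

    public static void main(String[] args) {
        // Greek alpha, as in the walkthrough above.
        System.out.println(Arrays.toString(encodeTwoByte('\u03b1')));
    }
}
```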

I guess we don't want to use writeUTF.
[ August 27, 2004: Message edited by: Marlene Miller ]
Marlene Miller
Ranch Hand

Joined: Mar 05, 2003
Posts: 1391
I've been using

byte[] b = ...
String s = new String(b);

String s = ...
byte[] b = s.getBytes();

The String class converts Unicode characters to bytes using the platform's default character set. One could use a different character set by adding another parameter.

I guess I like the idea of converting from one character set (Unicode) to another, rather than dropping the high-order 8 bits.
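
To make the difference concrete, here is a small sketch that puts the two behaviours side by side. It assumes Java 7 or later for StandardCharsets (newer than the JDKs in use when this thread was written), and the class and method names are invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative comparison; names are hypothetical.
class EncodeVsTruncate {

    // Mimics what writeBytes does: keep only the low 8 bits of each char.
    static byte[] truncate(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = (byte) s.charAt(i);
        }
        return out;
    }

    // Real charset conversion: characters US-ASCII cannot represent
    // are replaced with the charset's replacement byte, '?'.
    static byte[] encodeAscii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String alpha = "\u03b1";
        System.out.println(Arrays.toString(truncate(alpha)));
        System.out.println(Arrays.toString(encodeAscii(alpha)));
    }
}
```

For pure ASCII text the two approaches give identical bytes; they only differ once a character falls outside the target character set.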
[ August 27, 2004: Message edited by: Marlene Miller ]
Richard Jackson
Ranch Hand

Joined: Jun 25, 2003
Posts: 128
Thanks, all of you.

I only got to read the posts after the weekend.


The String constructor takes two arguments; the second is charsetName.
According to the Charset API, we can write it as "US-ASCII" or "UTF-8".

Which one is right in this code? Please keep the comments coming.
[ August 30, 2004: Message edited by: Richard Jackson ]
Anton Golovin
Ranch Hand

Joined: Jul 02, 2004
Posts: 476
ASCII works fine in my code; converts bytes into English admirably.
Richard Jackson
Ranch Hand

Joined: Jun 25, 2003
Posts: 128
I do the same thing as Anton.
According to the instructions file,
The character encoding is 8 bit US ASCII.


I modified the line of code as follows:


Am I right?
Michal Charemza
Ranch Hand

Joined: Jul 13, 2004
Posts: 86
Hi,

In the Java API documentation for Charset, US-ASCII is listed as seven-bit ASCII, not eight-bit. No 8-bit US-ASCII is listed among the charsets guaranteed to be available on every Java implementation. Does this mean the assignment spec calls for an encoding that might not be supported, so the program is not guaranteed to run identically on every system?

I suppose you could manually define your own Charset to be the 8-bit US ASCII, or some such thing, but I really doubt that is what the assessors want, as it seems far beyond the scope of the assignment to me.

What about ISO-8859-1, which is listed? Could that be the 8-bit US ASCII they mean?
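
One way to see the practical difference between the two charsets is to push every possible byte value through a decode and re-encode cycle. The sketch below does that; it assumes Java 7 or later for StandardCharsets, and RoundTrip and roundTrips are names made up for the illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative check; names are hypothetical.
class RoundTrip {

    // True if all 256 byte values survive bytes -> String -> bytes unchanged.
    static boolean roundTrips(Charset cs) {
        byte[] all = new byte[256];
        for (int i = 0; i < 256; i++) {
            all[i] = (byte) i;
        }
        String decoded = new String(all, cs);
        return Arrays.equals(all, decoded.getBytes(cs));
    }

    public static void main(String[] args) {
        System.out.println("ISO-8859-1: " + roundTrips(StandardCharsets.ISO_8859_1));
        System.out.println("US-ASCII:   " + roundTrips(StandardCharsets.US_ASCII));
    }
}
```

ISO-8859-1 maps each byte to the character with the same numeric value, so nothing is lost. US-ASCII has no mapping for bytes above 127, so those values are replaced during decoding and do not come back.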

If anyone has any thoughts I would be grateful.


Michal
Ed Green
Greenhorn

Joined: Jun 15, 2004
Posts: 11
Hi All,
I'm currently not specifying a charset on the way in or out, and I don't seem to be experiencing any problems.

tnx
peter wooster
Ranch Hand

Joined: Jun 13, 2004
Posts: 1033
Not specifying an encoding is incorrect; you will likely lose marks, because that requests your platform default encoding, which is determined by the locale. Using a specific encoding gives you another exception to handle. Here's what I do:
Michal Charemza
Ranch Hand

Joined: Jul 13, 2004
Posts: 86
Hi Peter,

Firstly, have you thought about converting to a String and then using trim() and indexOf() to remove spaces and find the zero delimiter? This may result in cleaner code, although I'm not sure about the efficiency. I think I will use the String methods in mine, as it is cleaner and is not "re-inventing the wheel".
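
A minimal sketch of that String-based approach might look like the following. It is only an illustration: FieldParser and parseField are hypothetical names, ISO-8859-1 is an assumed choice of encoding, and StandardCharsets requires Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only; not taken from anyone's assignment code.
class FieldParser {

    // Decodes a fixed-width field, cuts it at the first null character,
    // then strips leading and trailing blanks.
    static String parseField(byte[] raw) {
        String s = new String(raw, StandardCharsets.ISO_8859_1);
        int nul = s.indexOf('\0');
        if (nul >= 0) {
            s = s.substring(0, nul); // ignore anything after the terminator
        }
        return s.trim();
    }

    public static void main(String[] args) {
        // "Hi", a space, a null terminator, then leftover bytes.
        byte[] field = { 'H', 'i', ' ', 0, 'x', 'x' };
        System.out.println("[" + parseField(field) + "]");
    }
}
```

Note that trim() on its own would not be enough here, because leftover bytes after the null terminator would still end up in the result.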

Also, according to the Charset API, the ISO-8859-1 charset must be supported by every implementation of the Java platform. Does that mean that the UnsupportedEncodingException should never be thrown?

Perhaps an assert(false) is good here instead. I'm not really sure about assertions though, beyond what was required for the programmer exam.

Does anyone think that putting an "assert(false)" is a bad idea in a catch clause? I know I just suggested it, but the assertion and exception together do seem a bit pointless somehow: the exception is supposed to do things in case things go wrong, but the assert(false) in there means that things should never go wrong.

Michal
[ September 02, 2004: Message edited by: Michal Charemza ]
peter wooster
Ranch Hand

Joined: Jun 13, 2004
Posts: 1033
Michal
I agree, the exception should never be thrown, and you could convert to the String first and then use indexOf and trim. The simple while loop is likely to be faster unless indexOf is implemented using native code, which it might be if you use a character argument. The trim would also remove leading blanks, probably a good thing in this application.

Exceptions that should never occur should probably be chained into a runtime exception. Assertions should not be used for anything that isn't a program error meant to be caught during testing, since they are not enabled in production use.
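
As a rough illustration of both points (the byte-scanning loop and wrapping a checked exception that should never happen), here is a minimal sketch. It is not the code from the earlier post; FieldDecoder, decode and the choice of ISO-8859-1 are assumptions made for the example.

```java
import java.io.UnsupportedEncodingException;

// Illustrative only; names and encoding choice are assumptions.
class FieldDecoder {

    // Assumed encoding; ISO-8859-1 is one of the charsets every JVM must support.
    static final String ENCODING = "ISO-8859-1";

    // Decodes a fixed-width field that is null terminated or blank padded.
    static String decode(byte[] raw) {
        int end = 0;
        // Scan forward to the first null byte or the end of the field.
        while (end < raw.length && raw[end] != 0) {
            end++;
        }
        // Back up over trailing blanks.
        while (end > 0 && raw[end - 1] == ' ') {
            end--;
        }
        try {
            return new String(raw, 0, end, ENCODING);
        } catch (UnsupportedEncodingException e) {
            // Should never happen for a required charset, so chain the
            // checked exception into an unchecked one instead of asserting.
            throw new RuntimeException("Required charset missing: " + ENCODING, e);
        }
    }

    public static void main(String[] args) {
        byte[] field = { 'H', 'i', ' ', ' ', 0, 'x' };
        System.out.println("[" + decode(field) + "]");
    }
}
```

Unlike an assertion, the wrapped exception still surfaces in production, and the original cause stays attached for debugging.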
/peter
Michal Charemza
Ranch Hand

Joined: Jul 13, 2004
Posts: 86
Originally posted by peter wooster:

Assertions ... meant to be caught during testing, since they are not enabled in production use.
/peter


Ah yes, I forgot about this point. So soon after my programmers exam... oh well.

Michal
 