
URLyBird's [Version 1.2.3] : delete flag

Payal Shah
Ranch Hand

Joined: Jul 10, 2006
Posts: 67
1) Character encoding:
The spec says to use 8-bit US-ASCII, but the Charset API docs say that US-ASCII in Java is 7-bit. I have read a lot of threads about this issue. Someone even emailed Sun and got back an answer saying to use ISO-8859-1, which is 8-bit.

2) For my project, I am reading all the fields except the delete flag (1 byte) as "UTF-8". Is that right?

3) Delete flag:
When I print 0xFF, I get 255.
UTF-8 encodes each character (code point) in one to four octets (8-bit bytes), with the 1-byte encoding used for the 128 US-ASCII characters.
Since this value is 255, which is outside that 128-character range, I cannot use UTF-8 encoding for it.

When I store this field:
writer.write(new String(new byte[]{(byte) 0xFF}, "ISO-8859-1"), DBSchema.FLAG); // which stores:
// signature of my write method: public void write(String data, int length);
When I read this field:
private static final byte deleteFlag = (byte) 0xFF;
byte[] record...
byte byteFlag = record[0]; // -1
if (byteFlag == deleteFlag) {
    // deleted record
} else {
    // not deleted record
}

I am using two different encodings. I have spent a lot of time thinking about this, and I still do not feel confident that I am doing it right.
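To make the concern above concrete, here is a minimal sketch of what happens when the single byte 0xFF is decoded to a String and encoded back under each of the three charsets mentioned. The class and method names (FlagEncodingDemo, roundTrip) are made up for illustration and are not part of the assignment; the charsets and String methods are standard JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the URLyBird assignment.
final class FlagEncodingDemo {

    /** Decodes one byte to a String with the given charset, then encodes it back. */
    static byte[] roundTrip(byte value, Charset charset) {
        String decoded = new String(new byte[] {value}, charset);
        return decoded.getBytes(charset);
    }

    public static void main(String[] args) {
        byte flag = (byte) 0xFF;

        // A Java byte is signed, so 0xFF prints as -1; masking shows the unsigned value.
        System.out.println(flag);        // -1
        System.out.println(flag & 0xFF); // 255

        // ISO-8859-1 maps every byte value 0..255 to a character, so the byte survives.
        System.out.println(Arrays.toString(roundTrip(flag, StandardCharsets.ISO_8859_1))); // [-1]

        // 0xFF is not valid UTF-8; it decodes to U+FFFD, which encodes as three bytes.
        System.out.println(Arrays.toString(roundTrip(flag, StandardCharsets.UTF_8)));      // [-17, -65, -67]

        // 0xFF is outside US-ASCII; it decodes to U+FFFD, which encodes back as '?'.
        System.out.println(Arrays.toString(roundTrip(flag, StandardCharsets.US_ASCII)));   // [63]
    }
}
```

So only ISO-8859-1 hands the same byte back. Comparing the raw byte directly, as in the read code above, avoids the question of encoding for the flag altogether.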
K. Tsang

Joined: Sep 13, 2007
Posts: 3131

Hi Payal,
For UB, at least in my version, the delete flag is the 1 byte (8 bits) right before the start of the data chunk. I use RandomAccessFile's readByte() method to read bytes, so I don't think you need to read the flag as UTF-8. Do check your "Data File Format" section.
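A rough sketch of that approach, reading and writing the flag as a raw byte with no charset involved. The class name, the record offsets, the sample data and the use of 0x00 for a valid record are all assumptions for illustration; the real layout comes from your own "Data File Format" section.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; offsets and flag values must match your own data file format.
final class DeleteFlagReader {
    static final byte VALID = 0x00;           // assumed value for a live record
    static final byte DELETED = (byte) 0xFF;  // value discussed in this thread

    /** Returns true if the flag byte at recordOffset marks the record as deleted. */
    static boolean isDeleted(File file, long recordOffset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(recordOffset);
            return raf.readByte() == DELETED; // raw byte comparison, no decoding
        }
    }

    /** Overwrites the flag byte at recordOffset with the deleted marker. */
    static void markDeleted(File file, long recordOffset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.seek(recordOffset);
            raf.writeByte(DELETED);
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("flag-demo", ".db");
        file.deleteOnExit();

        // Made-up record: one flag byte followed by a text field.
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.writeByte(VALID);
            raf.write("Palace".getBytes(StandardCharsets.US_ASCII));
        }

        System.out.println(isDeleted(file, 0)); // false
        markDeleted(file, 0);
        System.out.println(isDeleted(file, 0)); // true
    }
}
```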

About the character encoding, think about it this way. If Java's default encoding is ASCII and ASCII is 7-bit, what is the 8th bit for? Correct me if I'm wrong, but my understanding is that ASCII uses the bottom 7 bits, leaving the most significant bit unused. Since data is normally stored in 8-bit bytes, putting a 7-bit code into a byte gives you 8 bits (1 byte) with the most significant bit set to 0. This 8th bit can also be used as a "parity bit" for error checking (according to Wikipedia's US-ASCII article).
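The "7 bits stored in an 8-bit byte" point can be checked with a small sketch like the one below. The class name and the sample string are made up; the charset calls are standard JDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the URLyBird assignment.
final class AsciiHighBitDemo {

    /** Returns true if every byte has its most significant bit set to 0. */
    static boolean highBitClear(byte[] bytes) {
        for (byte b : bytes) {
            if ((b & 0x80) != 0) {
                return false;
            }
        }
        return true;
    }

    /** Formats a byte as eight binary digits, e.g. 'A' becomes 01000001. */
    static String toBinary(byte b) {
        return String.format("%8s", Integer.toBinaryString(b & 0xFF)).replace(' ', '0');
    }

    public static void main(String[] args) {
        byte[] ascii = "URLyBird".getBytes(StandardCharsets.US_ASCII);
        byte[] latin1 = "URLyBird".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(toBinary((byte) 'A'));         // 01000001
        System.out.println(highBitClear(ascii));          // true
        // For characters below 128 both charsets produce the same bytes.
        System.out.println(Arrays.equals(ascii, latin1)); // true
    }
}
```

For plain ASCII text the two charsets are interchangeable on disk; they only differ for byte values 128 to 255, such as the 0xFF flag.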
[ December 26, 2008: Message edited by: K. Tsang ]

K. Tsang JavaRanch SCJP5 SCJD OCPJP7 OCPWCD5 OCPBCD5 OCPWSD5 OCMJEA5 part 1 part 2/3
Payal Shah
Ranch Hand

Joined: Jul 10, 2006
Posts: 67
Thank you, K. Tsang.