
URLyBird's [Version 1.2.3] : delete flag

Payal Shah
Ranch Hand

Joined: Jul 10, 2006
Posts: 67
1) Character encoding:
The spec says to use 8-bit US-ASCII, but the Charset API says that in Java US-ASCII is 7-bit. I have read a lot of threads about this issue. Someone even emailed Sun and got back an answer saying to use ISO-8859-1, which is 8-bit:
http://www.coderanch.com/t/186110/java-developer-SCJD/certification/Yes-yet-more-bit-bit

2) For my project, I am reading all the fields except the delete flag (1 byte) as "UTF-8". Is that right?

3) Delete flag:
When I print 0xFF, I get 255.
UTF-8 encodes each character (code point) in one to four octets (8-bit bytes), with the 1-byte encoding used for the 128 US-ASCII characters.
Since this number is 255, I cannot use UTF-8 encoding for it.

When I store this field:
writer.write(new String(new byte[] {(byte) 0xFF}, "ISO-8859-1"), DBSchema.FLAG); // which stores:
// signature of my write method: public void write(String data, int length);
When I read this field:
byte[] record...
byte byteFlag = record[0]; // -1
private static final byte deleteFlag = (byte) 0xFF;
if (byteFlag == deleteFlag) {
    // deleted record
} else {
    // not deleted record
}
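
To make the worry in point 3 concrete, here is a minimal sketch of what happens to the single byte 0xFF when it is decoded to a String and encoded back under three charsets. The class and method names (FlagEncodingDemo, roundTrip) are made up for illustration and are not part of the assignment; only ISO-8859-1 brings the byte back unchanged.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class FlagEncodingDemo {
    /** Decodes one byte with the given charset, then encodes the result again. */
    static byte[] roundTrip(byte b, Charset cs) {
        return new String(new byte[] {b}, cs).getBytes(cs);
    }

    public static void main(String[] args) {
        byte flag = (byte) 0xFF;
        System.out.println(flag);        // -1  (Java bytes are signed)
        System.out.println(flag & 0xFF); // 255 (same bits, read as unsigned)

        // ISO-8859-1 maps every byte value to exactly one char and back.
        System.out.println(Arrays.toString(roundTrip(flag, StandardCharsets.ISO_8859_1))); // [-1]
        // 0xFF is never valid in UTF-8, so it decodes to U+FFFD and re-encodes as 3 bytes.
        System.out.println(Arrays.toString(roundTrip(flag, StandardCharsets.UTF_8)));      // [-17, -65, -67]
        // US-ASCII has no mapping above 0x7F; the byte comes back as '?' (63).
        System.out.println(Arrays.toString(roundTrip(flag, StandardCharsets.US_ASCII)));   // [63]
    }
}
```

So passing the flag through a String only works if the charset is ISO-8859-1; comparing the raw byte, as in the read code above, avoids the question entirely.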

I am using two different encodings. I have spent a lot of time thinking about this, and I still do not feel confident that I am doing it right.
K. Tsang
Bartender

Joined: Sep 13, 2007
Posts: 2242
    

Hi Payal,
For UB, at least in my version, the delete flag is the 1 byte (8 bits) right before the start of the data chunk. I use RandomAccessFile's readByte() method to read bytes, so I don't think you need to read that data as UTF-8. Do check your "Data File Format" section.
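
A rough sketch of that approach, reading the flag as a raw byte with no charset involved. The record layout here (one flag byte followed by four bytes of data, starting at offset 0) and the names DeleteFlagReader and isDeleted are assumptions for the demo only; the real offsets and flag value come from your own "Data File Format" section.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class DeleteFlagReader {
    // Assumed flag value for a deleted record, as in the original post.
    static final byte DELETED = (byte) 0xFF;

    /** Seeks to the record's first byte and checks it against the deleted marker. */
    static boolean isDeleted(RandomAccessFile raf, long recordOffset) throws IOException {
        raf.seek(recordOffset);
        return raf.readByte() == DELETED;
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("flag-demo", ".db");
        f.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            // Two hypothetical 5-byte records: 1 flag byte + 4 data bytes each.
            raf.writeByte(0x00);
            raf.writeBytes("live");
            raf.writeByte(0xFF);
            raf.writeBytes("gone");

            System.out.println(isDeleted(raf, 0)); // false
            System.out.println(isDeleted(raf, 5)); // true
        }
    }
}
```

Because the flag never becomes a String, the 7-bit versus 8-bit question only matters for the text fields.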

About the character encoding, think about it this way. If Java's default encoding is ASCII and ASCII is 7-bit, what is the 8th bit for? Correct me if I'm wrong, but my understanding is that ASCII uses the bottom 7 bits, leaving the most significant bit unused. Since computers usually store values in 8-bit bytes, putting a 7-bit character into a byte gives you 8 bits (1 byte) with the most significant bit set to 0. This 8th bit can also be used as a "parity bit" for error checking (according to Wikipedia's US-ASCII article).
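
A small sketch of the point about the unused top bit: every byte produced by the US-ASCII charset has its most significant bit clear, while ISO-8859-1 can set it. The class and method names (AsciiHighBitDemo, fitsInSevenBits) and the sample strings are invented for illustration.

```java
import java.nio.charset.StandardCharsets;

class AsciiHighBitDemo {
    /** Returns true if every byte has its most significant bit clear (value 0..127). */
    static boolean fitsInSevenBits(byte[] bytes) {
        for (byte b : bytes) {
            if ((b & 0x80) != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] ascii = "URLyBird".getBytes(StandardCharsets.US_ASCII);
        byte[] latin1 = "caf\u00E9".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(fitsInSevenBits(ascii));      // true
        System.out.println(fitsInSevenBits(latin1));     // false: e-acute is 0xE9
        System.out.println(Integer.toBinaryString('A')); // 1000001 (7 bits)
    }
}
```

Text that passes this check is byte-for-byte identical in US-ASCII and ISO-8859-1, so for plain ASCII data either charset reads the same bytes.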
[ December 26, 2008: Message edited by: K. Tsang ]

K. Tsang JavaRanch SCJP5 SCJD/OCM-JD OCPJP7 OCPWCD5
Payal Shah
Ranch Hand

Joined: Jul 10, 2006
Posts: 67
Thank you, K. Tsang.
 