.db file format problem, help please!!!

xi brian
Ranch Hand

Joined: Mar 06, 2008
Posts: 30
Data file Format
The format of data in the database file is as follows:

Start of file
4 byte numeric, magic cookie value identifies this as a data file
(What is a magic cookie? I used BufferedReader to read the file and did not see
any magic cookie. Please tell me how to find it!)

4 byte numeric, offset to start of record zero
(What does this mean?)

2 byte numeric, number of fields in each record

Schema description section.
Repeated for each field in a record:
2 byte numeric, length in bytes of field name
n bytes (defined by previous entry), field name
2 byte numeric, field length in bytes
end of repeating block
(Does the above mean that there are 2 spaces between records, and that the length of each record
is the sum of the lengths of its fields?)


Data section. offset into file equal to "offset to start of record zero" value
Repeat to end of file:
2 byte flag. 00 implies valid record, 0x8000 implies deleted record
(How do I assign '00' and '0x8000' to the flag bytes? Is the 2 byte flag the space between records?)

Record containing fields in order specified in schema section, no separators between fields,
each field fixed length at maximum specified in schema information



End of file

All numeric values are stored in the header information use the formats of the DataInputStream
and DataOutputStream classes. All text values, and all fields (which are text only), contain
only 8 bit characters, null terminated if less than the maximum length for the field. The character
encoding is 8 bit US ASCII.
(What is the header section, and how do I get it? Should I use DataInputStream and DataOutputStream to read and write the
.db file? What does "null terminated if less than the maximum length for the field" mean?
What is "The character encoding is 8 bit US ASCII"?)

thank you in advance
Quintin Stephenson
Ranch Hand

Joined: Nov 16, 2006
Posts: 40
Hi there

Make a copy of your data file. Open it with Word. Each character you see in the file is 1 byte, including what the eye perceives as white space. You will find 2 different values in the perceived white space.

1. what is magic cookie?
From what I gather, a magic cookie in the text of this assignment is just a signature stating that this is the file to be used. If the signature is different, then stop the database loading process. I'm reading each byte individually for this one.

2. 4 byte numeric, offset to start of record zero(what does it mean?)
I have something different. My text states 4 bytes to determine the size of each record. I'm reading this as a 32-bit integer.

3. Schema description section...
These are not records; they are values that define the columns in your file. The first 2 bytes (a 16-bit number) state how long the column name is (e.g. 4 for "name"). The next bytes, as many as the previous value says, are the column name. Finally, a 16-bit (2 byte) number gives the number of bytes the actual data in the column will occupy, e.g. name, I think, is 32 bytes long. There is no white space between these fields.

4. what is header section?
This is the section of data that matches the details in your "Start of file" and "Schema description section" sections.

5. should I use DataInputStream and DataOutputStream?
Your choice. I'm using RandomAccessFile. In fact, I get the impression most people here are using RandomAccessFile as opposed to DataInputStream and DataOutputStream.

6. what does "null terminated if less than the maximum length for the field" mean?
You will see white space in the file. E.g. if the name column is 32 bytes, then from the start of the field count 32 times while moving the cursor to the right. I.e. if you have a name of James in this column, there will be 27 white space placeholders.

7. what is "The character encoding is 8 bit US ASCII"?
Best to google this one. There is easy to read and understand text at wikipedia (search for ASCII; UTF-8 is a superset of it).
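
To make point 3 concrete, here is a minimal sketch of reading the schema description section. It assumes the layout quoted in the first post (2 byte name length, name bytes, 2 byte field length, repeated per field). The class name `SchemaSketch`, the helper names, and the sample columns (`name` at 32 bytes, `location` at 64 bytes) are made up for illustration and are not taken from any real data file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class SchemaSketch {

    // Parses fieldCount schema entries and returns column name -> field length in bytes.
    static Map<String, Integer> parseSchema(byte[] schemaBytes, int fieldCount) {
        Map<String, Integer> columns = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(schemaBytes))) {
            for (int i = 0; i < fieldCount; i++) {
                short nameLength = in.readShort();        // 2 byte numeric: length of field name
                byte[] nameBytes = new byte[nameLength];
                in.readFully(nameBytes);                  // n bytes: the field name itself
                String name = new String(nameBytes, StandardCharsets.US_ASCII);
                short fieldLength = in.readShort();       // 2 byte numeric: field length in bytes
                columns.put(name, (int) fieldLength);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return columns;
    }

    // Builds an in-memory schema section with two hypothetical columns, for demonstration only.
    static byte[] sampleSchema() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            writeEntry(out, "name", 32);
            writeEntry(out, "location", 64);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    private static void writeEntry(DataOutputStream out, String name, int fieldLength)
            throws IOException {
        byte[] nameBytes = name.getBytes(StandardCharsets.US_ASCII);
        out.writeShort(nameBytes.length);
        out.write(nameBytes);
        out.writeShort(fieldLength);
    }

    public static void main(String[] args) {
        System.out.println(parseSchema(sampleSchema(), 2));
    }
}
```

In a real data file the field count would come from the 2 byte value that precedes the schema section, rather than being passed in by hand as it is here.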

hope this helps, and good luck.
Q


If at first you don't succeed, try, try again. If you don't try you have failed.
xi brian
Ranch Hand

Joined: Mar 06, 2008
Posts: 30
thank you for your help
Ronggen Liu
Ranch Hand

Joined: Jul 29, 2007
Posts: 40
All numeric values are stored in the header information use the formats of the DataInputStream
and DataOutputStream classes. All text values, and all fields (which are text only), contain
only 8 bit characters, null terminated if less than the maximum length for the field. The character
encoding is 8 bit US ASCII.


It seems the numeric values in the header information should be read using DataInputStream/DataOutputStream. I tried using RandomAccessFile, and it returned some error characters (squares).

Thank you,
Ronggen


SCJP 1.4, SCJD Java 2, ...
Quintin Stephenson
Ranch Hand

Joined: Nov 16, 2006
Posts: 40
Hi there

Why do you feel you are getting error chars? Read each section of the header file as stipulated.

I.e. the first line of the data file specification states
"4 byte numeric, magic cookie value. Identifies this as a data file"
So just read the first 4 bytes and return the value as an integer. Remember integers are 32-bit (4 byte) values. I used the readInt() method. I chose to read it as an integer because the integer value of those 4 bytes will always be the same. You don't actually care what the four bytes say, just what value they represent. You will have stored the expected value somewhere (either in a class or a properties file), which you can compare against.

The second value they want you to get is an integer, so use readInt again.

For 2 byte numeric values I use readShort.

For the actual strings I use read(), which takes a byte array as an out parameter. I convert this to a String directly from the byte array, and so forth.
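
The steps above can be sketched roughly as follows with RandomAccessFile. This assumes the header layout from the first post (4 byte magic cookie, 4 byte offset to record zero, 2 byte field count). The class name `HeaderSketch`, the `EXPECTED_COOKIE` value, and the temporary sample file are hypothetical; the real cookie value depends on your own assignment file.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

class HeaderSketch {

    // Hypothetical signature value; substitute the one stored for your own data file.
    static final int EXPECTED_COOKIE = 0x0101;

    // Returns {magicCookie, offsetToRecordZero, fieldCount} read from the start of the file.
    static int[] readHeader(File file) {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            int cookie = raf.readInt();       // 4 byte numeric
            int offset = raf.readInt();       // 4 byte numeric
            int fieldCount = raf.readShort(); // 2 byte numeric
            return new int[] {cookie, offset, fieldCount};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True when the file starts with the signature we expect.
    static boolean isDataFile(File file) {
        return readHeader(file)[0] == EXPECTED_COOKIE;
    }

    // Writes a tiny header-only file so the sketch can run without a real .db file.
    static File writeSample() {
        try {
            File file = File.createTempFile("sample", ".db");
            file.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
                raf.writeInt(EXPECTED_COOKIE);
                raf.writeInt(10);  // 4 + 4 + 2 header bytes and no schema entries
                raf.writeShort(0); // zero fields in this toy file
            }
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File sample = writeSample();
        int[] header = readHeader(sample);
        System.out.println("cookie=" + header[0] + " offset=" + header[1]
                + " fields=" + header[2] + " valid=" + isDataFile(sample));
    }
}
```

Because the values are read as numbers rather than as characters, nothing here depends on how the bytes would look if printed as text.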

Cheers
Q
xi brian
Ranch Hand

Joined: Mar 06, 2008
Posts: 30
I did it the same way, using readInt and readShort, which works perfectly

and with no errors
xi brian
Ranch Hand

Joined: Mar 06, 2008
Posts: 30
"All text values, and all fields (which are text only), contain only 8 bit characters, null terminated if less than the maximum length for the field. The character encoding is 8 bit US ASCII."

Since I used RandomAccessFile, do I have to know or use 8 bit US ASCII?
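
One way to look at this question: RandomAccessFile hands back raw bytes, so the encoding only matters at the point where bytes become a String or the other way round. The sketch below is a minimal illustration under that assumption. The class name `FieldCodecSketch` and both method names are hypothetical, and trimming trailing spaces is an extra assumption for files that pad with spaces instead of nulls.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class FieldCodecSketch {

    // Encodes a value as US-ASCII and pads it with null bytes up to the fixed field length.
    static byte[] encodeField(String value, int fieldLength) {
        byte[] ascii = value.getBytes(StandardCharsets.US_ASCII);
        if (ascii.length > fieldLength) {
            throw new IllegalArgumentException("value longer than field: " + value);
        }
        return Arrays.copyOf(ascii, fieldLength); // the extra bytes are 0x00
    }

    // Decodes a fixed-length field: stop at the first null byte, then drop surrounding spaces.
    static String decodeField(byte[] raw) {
        int end = 0;
        while (end < raw.length && raw[end] != 0) {
            end++;
        }
        return new String(raw, 0, end, StandardCharsets.US_ASCII).trim();
    }

    public static void main(String[] args) {
        byte[] raw = encodeField("James", 32); // 5 characters followed by 27 null bytes
        System.out.println(raw.length + " bytes -> \"" + decodeField(raw) + "\"");
    }
}
```

The byte array passed to `decodeField` would typically be whatever read() or readFully() filled in for one field of a record.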
Ronggen Liu
Ranch Hand

Joined: Jul 29, 2007
Posts: 40
Thank you, guys, i got it...

-Ronggen
 