
.db file format problem, help please!!!

 
Ranch Hand
Posts: 35
Data file Format
The format of data in the database file is as follows:

Start of file
4 byte numeric, magic cookie value identifies this as a data file
(What is a magic cookie? I used BufferedReader to read the file and did not see
any magic cookie. Please tell me how to find it!)

4 byte numeric, offset to start of record zero
(What does this mean?)

2 byte numeric, number of fields in each record

Schema description section.
Repeated for each field in a record:
2 byte numeric, length in bytes of field name
n bytes (defined by previous entry), field name
2 byte numeric, field length in bytes
end of repeating block
(Does the above mean that each record has 2 spaces between them, and that the length of each record
is the sum of the lengths of its fields?)


Data section. offset into file equal to "offset to start of record zero" value
Repeat to end of file:
2 byte flag. 00 implies valid record, 0x8000 implies deleted record
(How do I assign bytes with '00' and '0x8000'? Is the 2 byte flag the space between each record?)

Record containing fields in order specified in schema section, no separators between fields,
each field fixed length at maximum specified in schema information



End of file

All numeric values are stored in the header information use the formats of the DataInputStream
and DataOutputStream classes. All text values, and all fields (which are text only), contain
only 8 bit characters, null terminated if less than the maximum length for the field. The character
encoding is 8 bit US ASCII.
(What is the header section, and how do I get it? Should I use DataInputStream and DataOutputStream to read and write the
.db file? What does "null terminated if less than the maximum length for the field" mean?
What is "The character encoding is 8 bit US ASCII"?)

Thank you in advance.
 
Ranch Hand
Posts: 44
Hi there

Make a copy of your data file. Open it with Word. Each character you see in the file is 1 byte, including what the eye perceives as white space. You will find 2 different values in the perceived white space.

1. What is a magic cookie?
From what I gather, a magic cookie in the text of this assignment is just a signature that states this is the file to be using. If the signature is different, then stop the database loading process. I'm reading each byte individually for this one.

2. 4 byte numeric, offset to start of record zero (what does it mean?)
I have something different. My text states 4 bytes to determine the size of each record. I'm reading this as a 32-bit integer.

3. Schema description section...
These are not records; these are values that define the columns in your file. The first 2 bytes (a 16-bit number) state how long the name of the column is (e.g. 4 for name). The next bytes, as many as the previous value says, are the column name. Finally, a 16-bit (2-byte) number gives the number of bytes the actual data in the column will occupy, e.g. name, I think, is 32 bytes long. There is no white space between these fields.

4. What is the header section?
This is the section of data you will see that matches the details in your
"Start of file" and "Schema description section" sections.

5. Should I use DataInputStream and DataOutputStream?
Your choice. I'm using RandomAccessFile. In fact, I get the impression most people here are using RandomAccessFile as opposed to DataInputStream and DataOutputStream.

6. What does "null terminated if less than the maximum length for the field" mean?
You will see white space in the file. E.g. if the name column is 32 bytes, then from the start of the field count 32 times while moving the cursor to the right. I.e. if you have a name of James in this column, there will be 27 white space placeholders.

7. What is "The character encoding is 8 bit US ASCII"?
Best to google this one. There is easy to read and understand text at Wikipedia (search for UTF-8).
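
Points 6 and 7 can be sketched together in Java. This is only a minimal sketch, not the assignment's required approach: the class and method names (FieldTextSketch, decodeField, encodeField) are made up for illustration. It assumes a field may be padded either with null bytes, as the specification says, or with spaces, as described above, and it names the US-ASCII charset explicitly when converting bytes to a String.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
class FieldTextSketch {

    /** Decodes a fixed-length field: stops at the first null byte, then drops trailing spaces. */
    static String decodeField(byte[] raw) {
        int end = 0;
        while (end < raw.length && raw[end] != 0) {
            end++;                       // text ends at the first null byte, if any
        }
        while (end > 0 && raw[end - 1] == ' ') {
            end--;                       // some files pad with spaces instead of nulls
        }
        return new String(raw, 0, end, StandardCharsets.US_ASCII);
    }

    /** Encodes text into exactly fieldLength bytes, filling the unused tail with the pad byte. */
    static byte[] encodeField(String text, int fieldLength, byte pad) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        if (ascii.length > fieldLength) {
            throw new IllegalArgumentException("Text is longer than the field");
        }
        byte[] raw = new byte[fieldLength];
        Arrays.fill(raw, pad);
        System.arraycopy(ascii, 0, raw, 0, ascii.length);
        return raw;
    }

    public static void main(String[] args) {
        byte[] nullPadded = encodeField("James", 32, (byte) 0);
        byte[] spacePadded = encodeField("James", 32, (byte) ' ');
        System.out.println("[" + decodeField(nullPadded) + "]");
        System.out.println("[" + decodeField(spacePadded) + "]");
    }
}
```

With a 32-byte name field, "James" uses 5 bytes and the remaining 27 bytes are padding, which matches the count in point 6.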

Hope this helps, and good luck.
Q
 
xi brian
Ranch Hand
Posts: 35
Thank you for your help.
 
Ranch Hand
Posts: 40

All numeric values are stored in the header information use the formats of the DataInputStream
and DataOutputStream classes. All text values, and all fields (which are text only), contain
only 8 bit characters, null terminated if less than the maximum length for the field. The character
encoding is 8 bit US ASCII.



It seems the numeric values in the header information should be read using DataInputStream/DataOutputStream. I tried using RandomAccessFile, and it returned some error characters (squares).
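
For comparison, here is a minimal sketch showing that RandomAccessFile reads the same numeric format as DataInputStream, since both provide readInt() and readShort() from the DataInput interface. The class name, the temp file and the cookie value 0x0101 are assumptions made for illustration, not values from the assignment. One possible cause of the squares (an assumption, not something confirmed in this thread) is decoding the numeric header bytes as text, which yields unprintable control characters.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
class SameFormatSketch {

    /** Writes a 4-byte value (a made-up cookie, 0x0101) to a temporary file. */
    static File writeSample() {
        try {
            File file = File.createTempFile("sketch", ".db");
            file.deleteOnExit();
            try (DataOutputStream out = new DataOutputStream(new FileOutputStream(file))) {
                out.writeInt(0x0101);
            }
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static int readWithStream(File file) {
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static int readWithRandomAccess(File file) {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            return raf.readInt();        // same big-endian 4-byte format as DataInputStream
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Decodes the same 4 bytes as text, which gives control characters rather than digits. */
    static String readAsText(File file) {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            byte[] raw = new byte[4];
            raf.readFully(raw);
            return new String(raw, StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File file = writeSample();
        System.out.println(readWithStream(file));
        System.out.println(readWithRandomAccess(file));
        System.out.println(readAsText(file).length() + " unprintable characters");
    }
}
```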

Thank you,
Ronggen
 
Quintin Stephenson
Ranch Hand
Posts: 44
Hi there

Why do you feel you are getting error chars? Read each section of the header file as stipulated.

I.e. the first line of the data file specification states
"4 byte numeric, magic cookie value. Identifies this as a data file"
So just read the first 4 bytes and return the value as an integer. Remember integers are 32-bit (4-byte) values. I used the readInt() method. I chose to read it as an integer because the integer value of those 4 bytes will always be the same. You don't actually care what these four bytes say, just what value they represent. You will have stored the expected value somewhere (either in a class or a properties file), which you can compare against.

The second value they want you to get is an integer, so use readInt again.

For 2 byte numeric values I use readShort.

For the actual strings I use read(), which takes a byte array as an out parameter. I convert this to a String value directly from the byte array, and so forth.
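
The reading steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a reference solution: the class name, the Header holder, the cookie value 0x0101 and the two sample columns (name, location) are invented for illustration, and the header is built in memory so the example runs without a real data file. It uses readFully() rather than read() so that a short read cannot leave the byte array partly filled.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, for illustration only.
class DbHeaderSketch {
    // Assumed value; the real magic cookie is whatever your own data file contains.
    static final int EXPECTED_MAGIC = 0x0101;

    static final class Header {
        int magic;
        int recordZeroOffset;
        final List<String> fieldNames = new ArrayList<>();
        final List<Integer> fieldLengths = new ArrayList<>();
    }

    static Header readHeader(byte[] fileBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(fileBytes))) {
            Header header = new Header();
            header.magic = in.readInt();                 // 4 byte magic cookie
            if (header.magic != EXPECTED_MAGIC) {
                throw new IllegalArgumentException("Magic cookie mismatch, not a data file");
            }
            header.recordZeroOffset = in.readInt();      // 4 byte offset to record zero
            int fieldCount = in.readShort();             // 2 byte number of fields
            for (int i = 0; i < fieldCount; i++) {
                int nameLength = in.readShort();         // 2 byte length of field name
                byte[] name = new byte[nameLength];
                in.readFully(name);                      // n bytes of field name
                header.fieldNames.add(new String(name, StandardCharsets.US_ASCII));
                header.fieldLengths.add((int) in.readShort()); // 2 byte field length
            }
            return header;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a made-up 30-byte header with two columns so the sketch is self-contained. */
    static byte[] sampleFile() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeInt(EXPECTED_MAGIC);
            out.writeInt(30);                            // 4 + 4 + 2 + (2+4+2) + (2+8+2)
            out.writeShort(2);
            out.writeShort(4); out.writeBytes("name");     out.writeShort(32);
            out.writeShort(8); out.writeBytes("location"); out.writeShort(64);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Header header = readHeader(sampleFile());
        System.out.println(header.fieldNames + " " + header.fieldLengths);
    }
}
```

In this made-up header the "offset to start of record zero" equals the header size, so the first record would begin at the byte directly after the schema description.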

Cheers
Q
 
xi brian
Ranch Hand
Posts: 35
I did it the same way, using readInt and readShort, which works perfectly with no errors.
 
xi brian
Ranch Hand
Posts: 35
"All text values, and all fields (which are text only), contain only 8 bit characters, null terminated if less than the maximum length for the field. The character encoding is 8 bit US ASCII."

Since I used RandomAccessFile, do I have to know or use 8 bit US ASCII?
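
A minimal sketch of why the encoding still matters with RandomAccessFile: the file only hands back bytes, and turning those bytes into a String needs a charset, so it is safer to name US-ASCII explicitly than to rely on the platform default. The sketch also shows the 2 byte record flag from the specification. The class name, the single 32-byte field, the offset of 30 and the sample names are assumptions for illustration only.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical class, for illustration only.
class RecordReadSketch {
    static final int VALID_FLAG = 0x0000;
    static final int DELETED_FLAG = 0x8000;

    /** Writes a dummy header area followed by one valid and one deleted record. */
    static File writeSample(int recordZeroOffset, int fieldLength) {
        try {
            File file = File.createTempFile("records", ".db");
            file.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
                raf.write(new byte[recordZeroOffset]);   // placeholder for the real header
                raf.writeShort(VALID_FLAG);              // written as bytes 0x00 0x00
                raf.write(nullPadded("James", fieldLength));
                raf.writeShort(DELETED_FLAG);            // written as bytes 0x80 0x00
                raf.write(nullPadded("Anna", fieldLength));
            }
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Pads (or truncates) ASCII text to fieldLength bytes; unused bytes stay 0 (null). */
    static byte[] nullPadded(String text, int fieldLength) {
        return Arrays.copyOf(text.getBytes(StandardCharsets.US_ASCII), fieldLength);
    }

    /** Returns the field text of the given record, or null if the record is flagged deleted. */
    static String readRecord(File file, int recordZeroOffset, int fieldLength, int recordNumber) {
        int recordSize = 2 + fieldLength;                // 2 byte flag plus one field
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(recordZeroOffset + (long) recordNumber * recordSize);
            int flag = raf.readUnsignedShort();          // readShort() would give -32768 for 0x8000
            if (flag == DELETED_FLAG) {
                return null;
            }
            byte[] raw = new byte[fieldLength];
            raf.readFully(raw);
            // trim() drops trailing nulls and spaces, since both are <= U+0020
            return new String(raw, StandardCharsets.US_ASCII).trim();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File file = writeSample(30, 32);
        System.out.println(readRecord(file, 30, 32, 0));
        System.out.println(readRecord(file, 30, 32, 1));
    }
}
```

The flag is part of every record rather than a gap between records: each record here occupies 2 + 32 bytes, so record n starts at the record zero offset plus n times 34.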
 
Ronggen Liu
Ranch Hand
Posts: 40
Thank you, guys, I got it...

-Ronggen
 