
SCJD Project - URLyBird

 
Victor Itulua
Greenhorn
Posts: 1
Hello guys,
Below is a description of my data file format:

Data file Format
The format of data in the database file is as follows:
Start of file
4 byte numeric, magic cookie value. Identifies this as a data file
2 byte numeric, number of fields in each record
Schema description section.
Repeated for each field in a record:
1 byte numeric, length in bytes of field name
n bytes (defined by previous entry), field name
1 byte numeric, field length in bytes
end of repeating block
Data section.
Repeat to end of file:
1 byte flag. 00 implies valid record, 0xFF implies deleted record
Record containing fields in order specified in schema section, no separators between fields, each field fixed length at maximum specified in schema information
End of file
All numeric values are stored in the header information use the formats of the DataInputStream and DataOutputStream classes. All text values, and all fields (which are text only), contain only 8 bit characters, null terminated if less than the maximum length for the field. The character encoding is 8 bit US ASCII.
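
As a rough illustration of the layout described above, here is a minimal Java sketch that walks the header, the schema section, and the records with a DataInputStream. It is only one possible reading of the format, not a reference solution. The class name DbFileSketch, the helper names, the sample field names ("name", "size"), and the sample magic cookie value 0x0101 are all hypothetical. The sketch does not validate the cookie, because the expected value is not given in the description.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DbFileSketch {

    /** Reads a fixed-length text value and cuts it at the first null byte, if any. */
    static String readText(DataInputStream in, int length) throws IOException {
        byte[] raw = new byte[length];
        in.readFully(raw);
        int end = 0;
        while (end < length && raw[end] != 0) {
            end++;
        }
        return new String(raw, 0, end, StandardCharsets.US_ASCII);
    }

    /** Returns the field values of every record whose flag byte marks it as valid. */
    static List<String[]> readAll(InputStream source) throws IOException {
        List<String[]> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(source)) {
            in.readInt();                             // 4 byte magic cookie (not validated here)
            int fieldCount = in.readUnsignedShort();  // 2 byte number of fields per record

            // Schema section: name length, name, field length, repeated per field.
            int[] fieldLengths = new int[fieldCount];
            for (int i = 0; i < fieldCount; i++) {
                int nameLength = in.readUnsignedByte();
                readText(in, nameLength);             // field name, unused in this sketch
                fieldLengths[i] = in.readUnsignedByte();
            }

            // Data section: 1 byte flag, then the fixed-length fields, until end of file.
            int flag;
            while ((flag = in.read()) != -1) {
                String[] values = new String[fieldCount];
                for (int i = 0; i < fieldCount; i++) {
                    values[i] = readText(in, fieldLengths[i]);
                }
                if (flag == 0x00) {                   // 0xFF would mean a deleted record
                    records.add(values);
                }
            }
        }
        return records;
    }

    /** Pads ASCII text with null bytes up to the fixed field length. */
    static byte[] pad(String text, int length) {
        return Arrays.copyOf(text.getBytes(StandardCharsets.US_ASCII), length);
    }

    /** Builds a tiny made-up file in memory (one valid, one deleted record) and parses it. */
    static List<String[]> demo() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeInt(0x0101);                     // hypothetical magic cookie
            out.writeShort(2);                        // two fields per record
            out.writeByte(4); out.writeBytes("name"); out.writeByte(8);
            out.writeByte(4); out.writeBytes("size"); out.writeByte(2);
            out.writeByte(0x00); out.write(pad("Palace", 8)); out.write(pad("4", 2));
            out.writeByte(0xFF); out.write(pad("Gone", 8));   out.write(pad("2", 2));
            return readAll(new ByteArrayInputStream(buffer.toByteArray()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String[] record : demo()) {
            System.out.println(String.join(" | ", record));
        }
    }
}
```

For a real data file, readAll could be given a FileInputStream wrapped in a BufferedInputStream. A project that has to update or delete records in place would probably be better served by RandomAccessFile, which offers the same readInt, readUnsignedShort, and readFully methods.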

1. I am having a hard time understanding this file format.
2. Is 8 bit US ASCII equivalent to UTF-8?
3. "All numeric values are stored in the header information use the formats of the DataInputStream and DataOutputStream classes" - How is this different from UTF-8 and 8 bit US ASCII?
4. What is the best and most efficient way to process this file?

I would greatly appreciate any contributions or suggestions to help me pursue the SCJD certification.

Victor
 
John Stone
Ranch Hand
Posts: 332