
Reading your Data File

 
Before I started this assignment I never had a need to deal with files on a byte level. So here's some information that may be useful for others to get them started on understanding the contents of the data file that is supplied as part of the assignment.

Understanding The Instructions...

Your instructions will describe the contents of the data file with something like the following:

Data file Format

The format of data in the database file is as follows:
Start of file 4 byte numeric, magic cookie value. Identifies this as a data file 4 byte numeric, total overall length in bytes of each record 2 byte numeric, number of fields in each record


I'm guessing the poor structure of this description is intentional. But when you know what is in the file you can easily see that the extract above is simply saying the following:

  • 4 byte numeric, magic cookie value. Identifies this as a data file
  • 4 byte numeric, total overall length in bytes of each record
  • 2 byte numeric, number of fields in each record


Understanding The File Contents...

    So these three lines above give a description of the first ten bytes in the file. Now, to get more familiar with the contents of the file you'll get there quicker by viewing the file in a hex editor rather than writing some code. A hex editor basically lets you view the contents of the file as a list of bytes - see Wikipedia for a good description of what a hex editor is.

    Google will show up many hex editors if you search, but here is one I found with good documentation. So download this and install it:

  • Hex Editor http://www.flexhex.com

    Then, when you have it installed, read through their documentation, which describes how to use a hex editor:

  • Using Hex Editor http://www.flexhex.com/docs/howtos/hex-editing.phtml

    You would also benefit from reading up on how to convert hex values to decimal. But to save you the bother of working this out manually, here are two useful sites:

  • Hex to Decimal http://easycalculation.com/hex-converter.php
  • Hex to String http://www.stringfunction.com/hex-string.html

    The first one will convert a hex value to decimal, whereas the second will convert a hex value into a string. Most of the values in your data file will be numbers and strings, so these two converters will serve you well.

    Now open up your data file in the hex editor. You will see lots of lines like this:

    00 00 01 01 00 00 00 9F 00 07 00 04 6E 61 6D 65


    Each pair of characters represents one byte in hexadecimal. So if we take a look at what the description of the data file format told us...

    The first four bytes represent the magic cookie.

  • The first four bytes are 00 00 01 01
  • So plug 00000101 (don't leave any spaces) into http://easycalculation.com/hex-converter.php
  • The converter will give you back the value 257

    So the value of your magic cookie is 257


    Repeat this process for the rest of the data file format description: take the next group of bytes as outlined in the description, plug the values from the hex editor for these bytes into one of the converters listed above, and get the decimal or string representation.
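If you'd rather check the conversions in code than paste values into a web page, the same arithmetic can be sketched in Java. This is only an illustration: the names HexLineDecoder, parseHex, toDecimal and toText are made up for this example, and treating the last four bytes as US-ASCII text is an assumption (your instructions state the actual character encoding). The two bytes 00 04 in between are not covered by the extract quoted above, so the sketch leaves them alone.

```java
import java.nio.charset.StandardCharsets;

class HexLineDecoder {

    // Parses a space-separated hex dump such as "00 00 01 01" into raw bytes.
    static byte[] parseHex(String dump) {
        String[] tokens = dump.trim().split("\\s+");
        byte[] bytes = new byte[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            bytes[i] = (byte) Integer.parseInt(tokens[i], 16);
        }
        return bytes;
    }

    // Treats `length` bytes starting at `offset` as one unsigned number,
    // most significant byte first (what the hex-to-decimal site does).
    static long toDecimal(byte[] bytes, int offset, int length) {
        long value = 0;
        for (int i = offset; i < offset + length; i++) {
            value = (value << 8) | (bytes[i] & 0xFF);
        }
        return value;
    }

    // Treats `length` bytes starting at `offset` as text.
    // US-ASCII is an assumption made for this example.
    static String toText(byte[] bytes, int offset, int length) {
        return new String(bytes, offset, length, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] line = parseHex("00 00 01 01 00 00 00 9F 00 07 00 04 6E 61 6D 65");
        System.out.println("magic cookie     = " + toDecimal(line, 0, 4));
        System.out.println("record length    = " + toDecimal(line, 4, 4));
        System.out.println("fields per record = " + toDecimal(line, 8, 2));
        System.out.println("last four bytes  = " + toText(line, 12, 4));
    }
}
```

For the sample line above this gives 257, 159 and 7 for the three header values, and the last four bytes come out as the string name.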

    This will then give you a good idea of how the file is structured. So when it comes to manipulating the file in Java code and using the file pointer to seek() to a certain position in the file you should have a good understanding of what is actually happening down at the level of the bytes in the file.

    Using Java to Read the File...

    Once you have a good understanding of the contents and structure of the file, you will want to use this knowledge in your Java code to read the file. Take our magic cookie example from above: its value can be read with the DataInputStream class and its readInt() method.
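A minimal sketch of reading the magic cookie could look like this. The class name MagicCookieReader and the file name db.db are placeholders I've picked for illustration, so substitute the name of the data file you were given.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

class MagicCookieReader {

    // Reads the first four bytes of the stream as a signed int,
    // most significant byte first (the DataInputStream format).
    static int readMagicCookie(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        return data.readInt();
    }

    public static void main(String[] args) throws IOException {
        // "db.db" is a hypothetical file name; pass your own path as the first argument.
        String path = args.length > 0 ? args[0] : "db.db";
        try (InputStream in = new FileInputStream(path)) {
            System.out.println("Magic cookie: " + readMagicCookie(in));
        }
    }
}
```

The helper takes a plain InputStream so you can also try it on a hand-made byte array before pointing it at the real file.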


    So how did I know to use the DataInputStream class? Well, firstly the instructions mention this class, see extract below. But also, I only want to read the first few bytes in the file. I don't want to navigate about the file - but when I do I will take a look at using the class RandomAccessFile.

    All numeric values are stored in the header information use the formats of the DataInputStream and DataOutputStream classes
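For the case where you do want to navigate about the file, here is a rough sketch of what a seek() with RandomAccessFile might look like. It jumps over the two 4 byte values and reads the 2 byte field count at offset 8. The class name HeaderSeekDemo and the file name db.db are invented for this example, and readShort() assumes the 2 byte value is signed, which the format description does not actually say.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class HeaderSeekDemo {

    // Moves the file pointer past the magic cookie (4 bytes) and the
    // record length (4 bytes), then reads the 2-byte number of fields.
    static int readFieldCount(File dataFile) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(dataFile, "r")) {
            raf.seek(8);            // byte offset from the start of the file
            return raf.readShort(); // assumes a signed 2-byte value
        }
    }

    public static void main(String[] args) throws IOException {
        // "db.db" is a hypothetical file name; pass your own path as the first argument.
        File dataFile = new File(args.length > 0 ? args[0] : "db.db");
        System.out.println("Fields per record: " + readFieldCount(dataFile));
    }
}
```

RandomAccessFile implements the same DataInput interface as DataInputStream, so readInt() and readShort() interpret the bytes in the same way.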


    So how did we know to use the readInt() method to read the magic cookie value? Well, we should know from our Java Programmer certification how many bytes each data type in Java takes up. The Java Tutorials page on Primitive Data Types gives the answer:

    int: The int data type is a 32-bit signed two's complement integer.


    Now, a byte contains 8 bits, so 32 bits is equal to 4 bytes. This tells us that we need a method that will read the numeric value of 4 bytes from the file - we know it is numeric from the data file format description:

  • 4 byte numeric, magic cookie value. Identifies this as a data file

    So if we now take a look at the DataInputStream class, we will see that it provides a readInt() method, which is described as follows in the JavaDoc:

    public final int readInt() throws IOException

    See the general contract of the readInt method of DataInput.

    Bytes for this operation are read from the contained input stream.

    Specified by:
    readInt in interface DataInput

    Returns:
    the next four bytes of this input stream, interpreted as an int.


    The important bits are the return type and the Returns description. Firstly, we see that it returns an int, so this satisfies the description of the magic cookie as a numeric value. Secondly, it reads the next four bytes, which matches the 4 byte length given for the magic cookie in the data file format description.
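The same reasoning carries over to the other two header values: another 4 byte numeric fits readInt(), and a 2 byte numeric fits readShort(), since a short is 16 bits. Here is a sketch that reads all three from the first ten sample bytes shown earlier. HeaderReader and readHeader are names made up for this example, and using readShort() rather than readUnsignedShort() is my assumption, as the description only says "2 byte numeric".

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class HeaderReader {

    // Reads the three header values in the order the format description lists them.
    static int[] readHeader(DataInputStream in) throws IOException {
        int magicCookie = in.readInt();   // 4 byte numeric
        int recordLength = in.readInt();  // 4 byte numeric
        int fieldCount = in.readShort();  // 2 byte numeric (assumed signed)
        return new int[] {magicCookie, recordLength, fieldCount};
    }

    public static void main(String[] args) throws IOException {
        // The first ten bytes of the sample hex line: 00 00 01 01 00 00 00 9F 00 07
        byte[] firstTenBytes = {0x00, 0x00, 0x01, 0x01, 0x00, 0x00, 0x00, (byte) 0x9F, 0x00, 0x07};
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(firstTenBytes))) {
            int[] header = readHeader(in);
            System.out.println("Magic cookie:      " + header[0]);
            System.out.println("Record length:     " + header[1]);
            System.out.println("Fields per record: " + header[2]);
        }
    }
}
```

With the sample bytes this prints 257, 159 and 7, the same values the hex converter gives. To run it against your real file, wrap a FileInputStream in the DataInputStream instead of the byte array.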

    Hopefully people will find this useful when deciphering the data file format description and figuring out the content of the file!
     