"...I see no "must" for verifying the format of the file to be accessed."
Although there is no "must", there is a cookie value that identifies the file as a valid db file. If you think about it, if you don't verify that the file is a valid db file, then you can pull in bytes from a non-valid file and interpret them as valid for a db file.
Joined: Oct 17, 2004
Although there is no "must", there is a cookie value that identifies the file as a valid db file.
Ok, so verify the cookie value. Does anyone go beyond that to verify the file contains valid records within a valid format?
Yes. I verify the whole db against the schema at init. After that, only modifications to the db are verified. This is a way to enhance database integrity, and it is not hard to implement.
At first, I hardcoded the database file cookie value in my source code but then I decided that it was a bad idea, because then that limits my application to only being used by my specific database. Therefore, I rethought my decision and I made a custom FileFilter so that it only allows users to select a file with a .db extension. If for some reason there is a file with a .db extension and it isn't the expected database file, somewhere along reading in the schema an IOException will be caught and rethrown. That is how I decided to implement it.
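A filter of the kind described above might look roughly like the following. This is only a sketch: it assumes the Swing `javax.swing.filechooser.FileFilter` is the one meant, and the class name `DbFileFilter` and the description text are made up for illustration.

```java
import java.io.File;
import java.util.Locale;
import javax.swing.filechooser.FileFilter;

/** Hypothetical filter that only offers files ending in .db (name is illustrative). */
final class DbFileFilter extends FileFilter {

    @Override
    public boolean accept(File f) {
        // Keep directories visible so the user can still navigate in the chooser.
        if (f.isDirectory()) {
            return true;
        }
        // Case-insensitive match on the extension.
        return f.getName().toLowerCase(Locale.ROOT).endsWith(".db");
    }

    @Override
    public String getDescription() {
        return "Database files (*.db)";
    }
}
```

It would be installed with `chooser.setFileFilter(new DbFileFilter())` on a `JFileChooser`. Note that the filter only looks at the name, so a wrongly named or corrupt file is still caught later, when reading the schema fails.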
Originally posted by Daniel Simpson: At first, I hardcoded the database file cookie value in my source code but then I decided that it was a bad idea, because then that limits my application to only being used by my specific database. Therefore, I rethought my decision and I made a custom FileFilter so that it only allows users to select a file with a .db extension. If for some reason there is a file with a .db extension and it isn't the expected database file, somewhere along reading in the schema an IOException will be caught and rethrown. That is how I decided to implement it.
I chose the opposite approach. I allow any file name but insist that the first 4 bytes must match the cookie found in the file provided by Sun.
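For illustration, a check on the first 4 bytes could be sketched like this. The class and method names are hypothetical, and `EXPECTED_COOKIE` is a placeholder, not the real value; the actual number has to be taken from the supplied data file. It also assumes the cookie is stored as a big-endian int, which is what `DataInputStream.readInt()` reads.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper that compares the leading 4-byte magic cookie of a data file. */
final class MagicCookieCheck {

    // Placeholder only: replace with the value found in the supplied data file.
    static final int EXPECTED_COOKIE = 0x00000101;

    /** Returns true if the first 4 bytes of the stream equal the expected cookie. */
    static boolean hasExpectedCookie(InputStream in, int expectedCookie) throws IOException {
        DataInputStream data = new DataInputStream(in);
        try {
            // readInt() consumes exactly 4 bytes, high byte first.
            return data.readInt() == expectedCookie;
        } catch (EOFException tooShort) {
            // Fewer than 4 bytes available, so this cannot be a valid data file.
            return false;
        }
    }
}
```

A caller would open the chosen file, pass the stream to `hasExpectedCookie`, and refuse to continue (or only warn, if a lenient policy is preferred) when it returns false.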
I asked a few of my colleagues at work about this matter, and came to the following train of thought.
Based on the assignment specification, it comes down to the assumption that the customers will only use the application I created. In addition, I am settling on a general policy that I will put in my documentation:
A. Once this application system is adopted, it is assumed that the customer will use this application system, or any previous applications that support the database format as defined in the schema section of the specification.
B. All valid database files must be given names with the .db file extension.
C. All valid database files must have the file header cookie, defining them as database files supported by this application.
D. Files meeting the requirements in items B and C will be assumed to be in a valid database format, containing valid records.
E. This application will adhere to the database schema in the customer's application requirement specification. In doing so, it will enforce this policy by inserting only valid records and by correctly manipulating existing and newly inserted records as defined by the schema.
Basically, I'm saying "give me good stuff with the right header and extension, and my application will continue to give you good stuff."
I tried to look at it from the user's point of view.
The user has a database file and wants to use it. The sample db file we received has a cookie and a certain filename and extension, but we have no guarantee that the real database file has the same name, extension and cookie value. So the user loads his database file and gets an error message, saying it is not a valid database file, and contacts the company. The company is convinced the database file was valid: it obeyed the database schema. The person in charge of making/sending the database file gets told to send it again: maybe something went wrong during zipping it or mailing it. And again the user gets an error message. Eventually they read the user guide and find out that the programmer decided to add a few restrictions. User upset, because he has to rename or even edit his database file before he can use it. Or the IT department has to do this and send it to the user again.
Another annoyance I can think of is when the user accidentally selects a file that is not a database file, and gets a RuntimeException.
So I chose to be flexible regarding the cookie and filename, and strict regarding the schema. I do allow database files with added or swapped columns and different widths, but not with any of the columns I need removed.
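As a rough sketch of that "flexible on layout, strict on required columns" idea: the check below only asks whether every field the application needs is present in the schema read from the file, regardless of order or extra columns. The class name, method name and the field names in the usage note are hypothetical, and width handling is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical schema check: required fields must exist, order and extras do not matter. */
final class SchemaCheck {

    /** Returns the required field names that are absent from the file's schema. */
    static List<String> missingFields(List<String> fieldsInFile, List<String> requiredFields) {
        // Start from everything we need, then strike out whatever the file provides.
        List<String> missing = new ArrayList<>(requiredFields);
        missing.removeAll(fieldsInFile);
        return missing;
    }
}
```

For example, a file whose schema lists `location`, `name` and an extra `notes` column would pass when the application requires `name` and `location`, because the returned list is empty. A non-empty list can be reported to the user in a readable error message instead of surfacing as a RuntimeException.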
The assignment says "4 byte numeric, magic cookie value. Identifies this as a data file." If it didn't matter what the 4 byte numeric was, it wouldn't really serve the purpose of "identifying" data files, would it? I think the programmer would be within his/her rights to reject any file that doesn't start with the same 4-byte numeric as is in the given file. OTOH, the assignment doesn't specifically say you have to verify the cookie value, so I'll probably hard-code the magic cookie value and issue a warning if it doesn't match.
Do the instructions specify the value of the cookie? What do you compare it with? They do not even say that it is a constant. It could be some generated number that satisfies a secret algorithm. I would just read the cookie and do nothing about it, and leave a comment in the code for maintenance purposes. On reading the data file, any unexpected data should cause the server program to stop and back out.