Hi everybody, In my instructions, under "Data file Format", I read :
Start of file 4 byte numeric, magic cookie value. Identifies this as a data file
From those words, I only see three possible interpretations :
1. It's just for your information, simply ignore it.
2. If the value read is not the one you expect, throw some NotADataFileException extending RuntimeException.
3. That magic number is the beginning of a versioning scheme: if you can handle the value read, interpret the header accordingly, else throw some NotADataFileException extending RuntimeException.
I think that solution 1 is bad. If we could simply ignore those bytes, the instructions would have said: "Skip the first four bytes of the file". Right? The second one is acceptable. The third one is so tempting that I am considering having my public open() method call a private method, "private interpretHeader(int fileSignature)", which would use a switch/case construct whose default branch is "throw new NotADataFileException();". How did you handle that information? Regards, Phil. [ July 10, 2003: Message edited by: Philippe Maquet ]
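For illustration, the third interpretation might be sketched roughly as follows. This is only a sketch under assumptions: the cookie value 0x1234ABCD is a made-up placeholder (the real value has to come from your own data file), the class name HeaderInterpreter and the helper readSignature are hypothetical, and the 4 bytes are assumed to be stored big-endian. Only interpretHeader(int fileSignature) and NotADataFileException come from the post above.

```java
import java.nio.ByteBuffer;

// Unchecked exception, as proposed in the post (extends RuntimeException).
class NotADataFileException extends RuntimeException {
    NotADataFileException(String message) {
        super(message);
    }
}

// Hypothetical helper class; in the post this logic would live behind open().
class HeaderInterpreter {
    // Placeholder value, NOT the real cookie of any assignment.
    static final int COOKIE_FORMAT_1 = 0x1234ABCD;

    // Reads the 4-byte signature from the start of the file (assumed big-endian).
    static int readSignature(byte[] fileStart) {
        if (fileStart.length < 4) {
            throw new NotADataFileException("File is shorter than the 4-byte magic cookie");
        }
        return ByteBuffer.wrap(fileStart).getInt();
    }

    // One case per known format; anything else is rejected in the default branch.
    static String interpretHeader(int fileSignature) {
        switch (fileSignature) {
            case COOKIE_FORMAT_1:
                return "format 1";
            default:
                throw new NotADataFileException(
                        String.format("Unknown magic cookie 0x%08X", fileSignature));
        }
    }
}
```

With a single case this behaves exactly like interpretation 2; the switch only pays off if a second cookie value ever turns up.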
Hi Philippe, I have done the following: in the constructor of Data I open the file and read the file header and schema. I have a simple constant in the Data class (magiccookie). I check whether the magic cookie read from the file equals my constant. If not, I throw, as you said, my own exception, let's say "NotADatabaseException". That's it. Vlad
I check the value and issue a warning if it's not what I expect. The program can still continue. Since the documentation never says that there's only one possible value, I'm not comfortable making that assumption. (Though it does seem likely.) If the user has an invalid file and they choose to ignore the warning, well, they'll just have to deal with getting a bunch of meaningless gibberish on the screen.
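A rough sketch of this "warn, don't fail" variant, for comparison with the exception-throwing one. The class name LenientCookieCheck, the method checkCookie, and the cookie value 0x1234ABCD are assumptions made up for the example, and logging through java.util.logging is just one way to surface the warning (a GUI client would more likely show a dialog).

```java
import java.util.logging.Logger;

// Hypothetical helper: warns about an unexpected cookie but never throws.
class LenientCookieCheck {
    // Placeholder value, NOT the real cookie of any assignment.
    static final int EXPECTED_COOKIE = 0x1234ABCD;

    private static final Logger LOG = Logger.getLogger("LenientCookieCheck");

    // Returns true if the cookie matches; otherwise logs a warning and
    // returns false so the caller can decide whether to carry on.
    static boolean checkCookie(int cookie) {
        if (cookie == EXPECTED_COOKIE) {
            return true;
        }
        LOG.warning(String.format(
                "Unexpected magic cookie 0x%08X (expected 0x%08X); "
                        + "this may not be a valid data file",
                cookie, EXPECTED_COOKIE));
        return false;
    }
}
```

Switching to "fail" later would then be a one-line change in the caller: throw when checkCookie returns false.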
"I'm not back." - Bill Harding, Twister
Hi Jim, Interesting approach! I just disagree with you when you write:
"Since the documentation never says that there's only one possible value"
When they write in the specs that that value
"Identifies this as a data file"
its uniqueness is implicit, at least for a well-known file format. Best, Phil.
Probably, yes, but I'm just not certain. They could have one cookie value for Contractor files, and another for Hotel files - the number still identifies a data file, just a specific type of data file. This scenario is pretty unlikely, I agree. But there's been no actual request to reject invalid files. So to me, there's less damage done by running with an invalid file than there is by refusing to run for a valid file. In the former case, you can detect the error and correct it without recompiling; in the latter case, you must recompile. Unless you make "expected cookie value list" a configurable property - but that's way too much work for something they didn't document properly. Issuing a warning or exiting the program are the only two options that seem worth taking the time to implement, IMO.
Hi, Well, first, we cannot and should not develop the most universal program that can handle any data format. Let's assume I let the user work with the wrong format. Your program can destroy the file, which was probably your file with all your passwords, which you forgot to make write-protected.... My point is: the program should accept only the format that it can interpret. So, again, I think that solution 2 from Philippe is perfect. Vlad
"Well, first, we cannot and should not develop the most universal program that can handle any data format."
Nope. I'm just handling what they actually bothered to document.
"Let's assume I let the user work with the wrong format. Your program can destroy the file, which was probably your file with all your passwords, which you forgot to make write-protected...."
Well if the user is stupid enough, there's not much we can do to save him. He can destroy a file with notepad or vi after all. Using my program though, he'd first have to ignore the warning about a probable invalid file format, then he'd have to search and display at least one record, which will appear as gibberish. Then he'd have to ignore the fact it's gibberish and try to book the record anyway (which will only work if the customerID field happens to be blank). At that point, if the user discovers he's destroyed his password file, he doesn't really deserve any sympathy, IMO.
"My point is: the program should accept only the format that it can interpret."
I would agree if I knew for sure whether the program could interpret a given file. I don't think we really know this though, so I try to be flexible. As a similar example - have you ever used JAD, the Java decompiler? Very cool program. Unfortunately it hasn't been updated in some time, and it doesn't understand some details of the JDK 1.4 class structure. When you run it, it checks a file format version number at the beginning of each class file to see if it falls within the range of versions it's designed to handle. Which basically extends up through JDK 1.3. Now, I've had files that were generated by JDK 1.4 which I wanted to decompile. I suspected that even though they used 1.4, they probably didn't actually use the new features of 1.4 which had required format changes; in fact the formats are very similar in all but a few cases. So I wanted to get JAD to try to decompile the file anyway, to see if it would work.
Unfortunately JAD absolutely refused to have anything to do with the file because it didn't like the version number. In order to work around this, I used a binary editor to manually alter the version number in the class file to basically lie and say it was generated by 1.3. Then I ran JAD and got perfect results - I was right, the version difference did not actually affect the file I was working on. So the whole business about refusing to work on the file because of the version number was just a big inconvenience for me, and I would have greatly preferred that JAD had just issued a warning rather than refusing to work. OK, that's for a "version number" rather than a "magic number" - the latter is more likely to be intended as a unique value for anything remotely resembling a data file using the documented file format. However "more likely" != "certainly". Really, either solution is acceptable to me here - I just have a preference for "warn" rather than "fail". If Bodgett & Scarper want different behavior, they need to learn to write better specs. The issue is documented in the design docs, and (if this were a real-world project) it would be trivial to change the behavior to "fail" later once someone verifies that there really is one and only one possible magic cookie value.
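To make the "magic number" versus "version number" distinction concrete, here is a small sketch of how a tool could treat the two differently for class files: a wrong magic number is fatal, while a newer-than-known version only earns a warning. This is not how JAD works internally (that isn't known here); the class name ClassFileHeaderCheck and the Verdict enum are invented for the example. The header layout (4-byte magic 0xCAFEBABE, then 2-byte minor and 2-byte major version) and the major versions 47 for JDK 1.3 and 48 for JDK 1.4 are from the class file format.

```java
import java.nio.ByteBuffer;

// Hypothetical checker: fail on a bad magic number, only warn on a newer version.
class ClassFileHeaderCheck {
    static final int CLASS_MAGIC = 0xCAFEBABE;
    // Highest major version this imaginary tool was written for (47 = JDK 1.3).
    static final int MAX_KNOWN_MAJOR = 47;

    enum Verdict { SUPPORTED, NEWER_VERSION_WARN, NOT_A_CLASS_FILE }

    // Inspects the first 8 bytes of a class file: magic, minor, major.
    static Verdict inspect(byte[] header) {
        if (header.length < 8) {
            return Verdict.NOT_A_CLASS_FILE;
        }
        ByteBuffer buf = ByteBuffer.wrap(header); // big-endian by default
        if (buf.getInt() != CLASS_MAGIC) {
            return Verdict.NOT_A_CLASS_FILE;
        }
        buf.getShort(); // minor version, not needed for this check
        int major = buf.getShort() & 0xFFFF; // treat as unsigned 16-bit
        return major <= MAX_KNOWN_MAJOR
                ? Verdict.SUPPORTED
                : Verdict.NEWER_VERSION_WARN;
    }
}
```

A caller could print a warning on NEWER_VERSION_WARN and try anyway, which is the behavior wished for above, and stop only on NOT_A_CLASS_FILE.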
Hi Jim, If you are going to issue a warning rather than failing, then there are a few extra tests you could do. For example, it should be possible to calculate how large the file should be. So if the user did decide to use their password file, then the chances of the file size you calculate from the nonsensical header information actually matching the real file size are extremely small, and you can exit then. Regards, Andrew [ July 11, 2003: Message edited by: Andrew Monkhouse ]
Hmmm, good point. Well, in Contractor we can't actually predict the file size from the header - it depends on the number of records. In fact we infer the max record number from the file size. But it's at least true that the file size minus the header length should be an integer multiple of the record length. Also, the record length given in the header should be equal to the sum of the field lengths (plus 1 perhaps for the delete marker? I forget.) And all the delete markers should be either 0 or 1... Come to think of it, the way I've got it written, each client checks the metadata for the column names it expects, and if it can't find "name" and "location" at least, the client will error out. I'm inclined to say that's probably a sufficient check for my purposes - how likely is it that some other file would have those two strings in the right place to look like column names? Maybe I'll keep the length check too - it's possible that an otherwise valid file has stuff accidentally appended or deleted from the end, and the length check has a good chance of detecting that. Cool. The nice thing about these checks is that they're things I need to look at anyway - it's just an extra couple of lines to throw an error if the result makes no sense. Thanks for pointing that out, Andrew.
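The checks listed above could be collected in one place, roughly like this. It is a sketch only: the class name DataFileSanityCheck, the method findProblems, and all parameter names are hypothetical, and the caller is assumed to have already parsed the header. Because it is unclear whether the delete marker counts towards the record length, its size is passed in as a parameter (0 or 1). The per-record "delete marker is 0 or 1" check is left out to keep it short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical collection of the structural checks discussed in this thread.
class DataFileSanityCheck {

    // Returns a list of human-readable problems; an empty list means
    // the file passed every check.
    static List<String> findProblems(long fileSize, int headerLength,
                                     int recordLength, int[] fieldLengths,
                                     int deleteFlagLength,
                                     List<String> columnNames) {
        List<String> problems = new ArrayList<>();

        // Record length from the header should equal the sum of the
        // field lengths (plus the delete marker, if it is counted).
        int fieldTotal = deleteFlagLength;
        for (int length : fieldLengths) {
            fieldTotal += length;
        }
        if (recordLength <= 0 || fieldTotal != recordLength) {
            problems.add("record length " + recordLength
                    + " does not match field lengths (" + fieldTotal + ")");
        }

        // File size minus header should be a whole number of records.
        long dataLength = fileSize - headerLength;
        if (dataLength < 0
                || (recordLength > 0 && dataLength % recordLength != 0)) {
            problems.add("data section of " + dataLength
                    + " bytes is not a multiple of the record length");
        }

        // The client needs at least these two columns.
        for (String required : new String[] {"name", "location"}) {
            if (!columnNames.contains(required)) {
                problems.add("missing column \"" + required + "\"");
            }
        }
        return problems;
    }
}
```

Returning a list rather than throwing on the first failure leaves the warn-or-fail decision to the caller, so either policy from this thread can sit on top of it.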