NX: UrlyBird data file format question

Chee-Chan Keng
Greenhorn

Joined: Mar 01, 2004
Posts: 24
Hi,
When I read the hotel records in the data file, I see that I can use the offset value in the header to skip to the first record, thus bypassing all the schema information (e.g. 2-byte field name length, n bytes of field name, 2-byte field length in bytes), since we have to follow the database schema anyway.
So is it right to skip the schema description and start reading at the first record, or do we have to "act dumb" and read through the schema information, then dynamically "new" each field according to it?

thanks!
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11404
    
  81

Hi Chee-Chan,
Welcome to JavaRanch.
You will have to read the schema at least once so that you know how long the fields are within each record. But in subsequent reads you can skip the header (and indeed any unwanted records).
Regards, Andrew


George Marinkovich
Ranch Hand

Joined: Apr 15, 2003
Posts: 619
I think Chee-Chan might be asking why he shouldn't hard code the field lengths because they are given in the assignment instructions, rather than read them out of the database file. If that really is his question, then I guess there are two things I can say:
1) If you take a very narrow interpretation of the requirements, then I guess you could rely on the database schema description that's given in the assignment instructions and ignore the database schema data in the database file.
2) If you did so, I think you'd be making your application unnecessarily brittle. Remember that you are coding to this format because it supports some legacy application over which you don't really have any control. So what happens if the developers of the legacy application decide to start storing a new field, say "rating", in the database? They change the contents of the database schema data to reflect the addition of the new field. When the user attempts to use this new database format with your application, he will find that your application no longer reads the database file correctly. In fact, assuming you don't get some sort of fatal exception first, you'll see all sorts of garbled data when you display the contents of the database in your application.
This didn't have to happen. If you had been processing the database schema data section of the database file, your application would continue to work with the new database file format. Presumably the new "rating" field isn't of any interest to you so you may or may not display it in the GUI, but at least your application will be able to go on working despite the change to the database file. In other words, your application is robust.
How can you know that this could possibly happen? The mere fact that there is a database schema data section in the database file itself is an indication that something like this might happen in the future. If the database schema data section weren't present in the database file then this type of thing wouldn't have to be a concern. The fact that it is there leads me to believe that something like this might happen. It's a bit of defensive programming and it may not be required by the assignment, but on the other hand it doesn't take a lot of code to process the database schema data section. Like everything in the SCJD the choice is yours, but be aware of the possibilities and consequences before you make your decision.
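To show how little code that is, here is a minimal sketch of reading the schema section. The exact layout is an assumption (a 4-byte magic cookie, a 4-byte offset to the first record, a 2-byte field count, then for each field a 2-byte name length, the name bytes, and a 2-byte field length), so check it against your own instructions. `FieldInfo`, `SchemaReader`, and `sampleHeader` are hypothetical names used only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** One field description taken from the schema section (hypothetical type). */
final class FieldInfo {
    final String name;
    final int length;

    FieldInfo(String name, int length) {
        this.name = name;
        this.length = length;
    }
}

/** Reads field names and lengths from the header instead of hard-coding them. */
final class SchemaReader {

    /** Assumed layout: cookie, first-record offset, field count, then the field entries. */
    static List<FieldInfo> readSchema(DataInputStream in) throws IOException {
        in.readInt();                              // magic cookie (not validated in this sketch)
        in.readInt();                              // offset of the first record (not used here)
        int fieldCount = in.readUnsignedShort();
        List<FieldInfo> fields = new ArrayList<>();
        for (int i = 0; i < fieldCount; i++) {
            byte[] nameBytes = new byte[in.readUnsignedShort()];
            in.readFully(nameBytes);
            int fieldLength = in.readUnsignedShort();
            String name = new String(nameBytes, StandardCharsets.US_ASCII);
            fields.add(new FieldInfo(name, fieldLength));
        }
        return fields;
    }

    /** Builds a small in-memory header so the sketch can be tried without a real file. */
    static byte[] sampleHeader() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(0x0101);                      // made-up cookie value
        out.writeInt(0);                           // placeholder offset
        String[] names = {"name", "location"};
        int[] lengths = {64, 64};
        out.writeShort(names.length);
        for (int i = 0; i < names.length; i++) {
            byte[] n = names[i].getBytes(StandardCharsets.US_ASCII);
            out.writeShort(n.length);
            out.write(n);
            out.writeShort(lengths[i]);
        }
        return bytes.toByteArray();
    }
}
```

With the field list in hand, a record buffer can be sized by summing the lengths, so a new "rating" field would simply show up as one more entry rather than corrupting every read.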


Regards, George
SCJP, SCJD, SCWCD, SCBCD
Chee-Chan Keng
Greenhorn

Joined: Mar 01, 2004
Posts: 24
Hi George,
Yes, that's my question. So instead of using the following code:
byte[] name = new byte[64];
byte[] location = new byte[64];
...,
Are you suggesting I should use something like:
List[0].name = fieldName[0]; // which is String "name"
List[1].name = fieldName[1]; // which is String "location"
List[0].data = new byte[fieldLength[0]];
List[1].data = new byte[fieldLength[1]];
...
So in this case, we basically don't need the following section in the requirements document, since the schema in the data file tells the whole story?
Hotel Name name 64 description
City location 64 description
...
Am I right with my assumptions?

thanks,
newbie moose :roll:
George Marinkovich
Ranch Hand

Joined: Apr 15, 2003
Posts: 619
Chee-Chan,
Yes, you're right.
Some people have a (very) difficult time understanding how the database file is structured (maybe because they've never had to do much file processing). I think the database schema section in the assignment instructions is meant to make the database file easier to understand. Maybe it doesn't help, but rather causes confusion. I don't know, but I'd like to think it was meant to help. It does give some additional information, like descriptive names for the fields and detailed descriptions of the fields, but as you say it's largely redundant since the same information is present in the database file itself. Anyway, given how documentation always seems to lag reality, I would trust the contents of the database file's schema section before I would the assignment instructions (if they were in conflict).
Seid Myadiyev
Ranch Hand

Joined: Jul 02, 2002
Posts: 196
Hello George and everyone,
What is a better way to deal with the record status constants: valid and deleted? I know it would be acceptable to make them class constants because they are explicitly defined in the specification, but what would really be the best way? For example, read their values from the properties file, or make them final variables, giving them the flexibility to vary on each run (without the need to recompile)? Or some other way?
Thank you in advance!
Seid
[ March 09, 2004: Message edited by: Seid Myadiyev ]

Seid Myadiyev, SCJP, SCWCD, SCBCD, SCEA-Part 1
alzamabar
Ranch Hand

Joined: Jul 24, 2002
Posts: 379
Originally posted by George Marinkovich:

2) If you did so, I think you'd be making your application unnecessarily brittle. Remember that you are coding to this format because it supports some legacy application over which you don't really have any control. So what happens if the developers of the legacy application decide to start storing a new field, say "rating", in the database? They change the contents of the database schema data to reflect the addition of the new field. When the user attempts to use this new database format with your application, he will find that your application no longer reads the database file correctly. In fact, assuming you don't get some sort of fatal exception first, you'll see all sorts of garbled data when you display the contents of the database in your application.
This didn't have to happen. If you had been processing the database schema data section of the database file, your application would continue to work with the new database file format. Presumably the new "rating" field isn't of any interest to you so you may or may not display it in the GUI, but at least your application will be able to go on working despite the change to the database file. In other words, your application is robust.

On the other hand, one rule of thumb in distributed applications is:
Interfaces should change seldom, and when they do, they should only add methods. Data Objects shouldn't change at all.
The reason is stability. If you change the db, chances are you'll also need to change the Data Objects. You've got two ways of doing it:
1) By having two different data objects (the new one will reside on the server)
2) By recompiling, retesting and redeploying all clients.
In practice you'll end up using the first solution (being careful about serialization compatibility between the two versions). The second would be difficult, since clients can be spread across many different places and are sometimes hard to locate. Some applications take the Data Object off the wire and use it to build a separate object that the client works with. This is very useful, especially with legacy systems.
Marco
[ March 09, 2004: Message edited by: Marco Tedone ]

Marco Tedone, SCJP1.4, SCJP5, SCBCD, SCWCD
George Marinkovich
Ranch Hand

Joined: Apr 15, 2003
Posts: 619
Hi Seid,
Originally posted by Seid Myadiyev:
What is a better way to deal with the record status constants: valid and deleted? I know it would be acceptable to make them class constants because they are explicitly defined in the specification, but what would really be the best way? For example, read their values from the properties file, or make them final variables, giving them the flexibility to vary on each run (without the need to recompile)? Or some other way?

Well, I made them constants (private final static int) in the Data class.
It would be more flexible to make them configurable from the suncertify.properties file as they then could be changed without the need to recompile.
I believe you'll be OK either way you go. It's a judgment on your part on how likely these values are to change in the future. It's also a judgment about how much extra work is justified to support this additional flexibility.
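A rough sketch of the configurable variant, assuming the flags are small integers: compile-time defaults that a loaded `Properties` object may override. The class name `RecordFlags`, the keys `flag.valid` and `flag.deleted`, and the default values are all hypothetical; use whatever your own instructions specify.

```java
import java.util.Properties;

/** Record status flags: compile-time defaults that a properties file may override. */
final class RecordFlags {
    // Placeholder defaults; take the real values from your assignment instructions.
    static final int DEFAULT_VALID = 0x00;
    static final int DEFAULT_DELETED = 0xFF;

    final int valid;
    final int deleted;

    /** The property keys used here are hypothetical names. */
    RecordFlags(Properties props) {
        valid = parse(props.getProperty("flag.valid"), DEFAULT_VALID);
        deleted = parse(props.getProperty("flag.deleted"), DEFAULT_DELETED);
    }

    /** Accepts decimal or 0x-prefixed hex; falls back to the default on bad or missing input. */
    private static int parse(String text, int fallback) {
        if (text == null) {
            return fallback;
        }
        try {
            return Integer.decode(text.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }
}
```

Falling back to the constants keeps the application working when the properties file has no such entries, which is the common case if the values never change.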
Akhshay Ray
Greenhorn

Joined: Nov 21, 2003
Posts: 11
Hello George,
Thanks for your explanation on why to read the data schema details from the data file instead of hardcoding it as specified in the specs. I have coded and tested both ways but for the past 2 months I was unable to convince myself as to why I would read the schema section from the data file.
Now if my DataSchema class reads its values from the file, then how can I make this class a Singleton?
Right now, my DataSchema class is a member variable of my Data class.
Thanks! Akshay
George Marinkovich
Ranch Hand

Joined: Apr 15, 2003
Posts: 619
Hi Akhshay,
Originally posted by Akhshay Ray:

Now if my DataSchema class reads its values from the file, then how can I make this class a Singleton?
Right now, my DataSchema class is a member variable of my Data class.

My understanding of your Data class is as follows:

If you want to make the DataSchema class implement the singleton design pattern then you could do the following:
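A minimal sketch of one way to do this, not necessarily the code originally posted: a private constructor plus a static, lazily initialised `getInstance`. It assumes the schema is built from the database file, so the accessor takes a `File`; the actual parsing is left out, and `getSource` is a hypothetical helper.

```java
import java.io.File;

/** Singleton holder for the schema; only the first call decides which file is used. */
final class DataSchema {
    private static DataSchema instance;
    private final File source;

    private DataSchema(File source) {
        this.source = source;
        // Parsing of the schema section would happen here (omitted in this sketch).
    }

    /** Creates the single instance on first use and returns that same object afterwards. */
    static synchronized DataSchema getInstance(File dbFile) {
        if (instance == null) {
            instance = new DataSchema(dbFile);
        }
        return instance;
    }

    File getSource() {
        return source;
    }
}
```

Note that every caller has to supply a file with this shape, even though the argument is ignored after the first call.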

If I've misunderstood your question and you want the Data class to implement the singleton design pattern rather than the DataSchema class, then you could apply the same changes I made to the DataSchema class to the Data class.
Akhshay Ray
Greenhorn

Joined: Nov 21, 2003
Posts: 11
George,
Thanks for replying. You understood my doubt correctly and I was doing exactly what you have explained above. But in this approach, I cannot get an instance of DataSchema in other classes.
For example, in some class someClass.java:

But then, it doesn't make any sense making the DataSchema Singleton.
So I was thinking of having two methods in DataSchema class:
public DataSchema createInstance(File file);
public DataSchema getInstance();
Now, is this too complex or violates the Singleton pattern in any way?
Thanks.
Satish Avadhanam
Ranch Hand

Joined: Aug 12, 2003
Posts: 697
Hi Akhshay
Originally posted by Akhshay Ray:
George,
For example, in some class someClass.java:

But then, it doesn't make any sense making the DataSchema Singleton.
So I was thinking of having two methods in DataSchema class:
public DataSchema createInstance(File file);
public DataSchema getInstance();
Now, is this too complex or violates the Singleton pattern in any way?
Thanks.

As long as you create a single static DataSchema object in createInstance and return that same object from getInstance, you will have only one instance of DataSchema. That means you are using a single instance of the class, and I think that shouldn't violate the Singleton pattern.
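A sketch of that two-method arrangement, under the assumption that both methods are made static (as posted they are instance methods, which could not be called before an instance exists). Schema parsing is omitted and `getSource` is a hypothetical helper.

```java
import java.io.File;

/** Singleton created once with a file, then fetched without arguments elsewhere. */
final class DataSchema {
    private static DataSchema instance;
    private final File source;

    private DataSchema(File source) {
        this.source = source;
    }

    /** Called at start-up by the code that knows where the file is; later calls reuse the instance. */
    static synchronized DataSchema createInstance(File file) {
        if (instance == null) {
            instance = new DataSchema(file);
        }
        return instance;
    }

    /** Called anywhere else; fails loudly if createInstance has not run yet. */
    static synchronized DataSchema getInstance() {
        if (instance == null) {
            throw new IllegalStateException("createInstance(File) has not been called");
        }
        return instance;
    }

    File getSource() {
        return source;
    }
}
```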
Good Luck.
George Marinkovich
Ranch Hand

Joined: Apr 15, 2003
Posts: 619
Hi Akhshay,
Originally posted by Akhshay Ray:



Well, think about how you'll actually want to use the schema data. It might be useful to you to have the following static accessor methods in DataSchema:

This way you don't really have to get the instance of DataSchema, but only the data within DataSchema that interests you.
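One possible shape for such accessors, as a sketch only: the method names (`getFieldCount`, `getFieldName`, `getFieldLength`, `getRecordLength`) are hypothetical, and to keep the example self-contained `createInstance` takes the field names and lengths directly rather than reading them from a file.

```java
/** Singleton whose data is reached through static accessors rather than through the instance. */
final class DataSchema {
    private static DataSchema instance;
    private final String[] names;
    private final int[] lengths;

    private DataSchema(String[] names, int[] lengths) {
        this.names = names.clone();
        this.lengths = lengths.clone();
    }

    /** In a real application this would take the database file and parse its schema section. */
    static synchronized void createInstance(String[] names, int[] lengths) {
        if (instance == null) {
            instance = new DataSchema(names, lengths);
        }
    }

    /** Returns the single instance or fails if it has not been created yet. */
    private static synchronized DataSchema current() {
        if (instance == null) {
            throw new IllegalStateException("createInstance has not been called");
        }
        return instance;
    }

    static int getFieldCount() {
        return current().names.length;
    }

    static String getFieldName(int index) {
        return current().names[index];
    }

    static int getFieldLength(int index) {
        return current().lengths[index];
    }

    /** Sum of the field lengths only; any per-record flag bytes are not included. */
    static int getRecordLength() {
        int total = 0;
        for (int length : current().lengths) {
            total += length;
        }
        return total;
    }
}
```

Callers then write `DataSchema.getFieldLength(0)` and never hold a reference to the instance itself.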

But then, it doesn't make any sense making the DataSchema Singleton.

I don't think that's necessarily true. If it made sense to make the DataSchema class a singleton before, I'm not sure this invalidates that reasoning.

So I was thinking of having two methods in DataSchema class:
public DataSchema createInstance(File file);
public DataSchema getInstance();
Now, is this too complex or violates the Singleton pattern in any way?

Well I don't think it's too complex. As for violating the Singleton design pattern I don't think it does. It's still not possible to create more than a single instance of DataSchema, is it? That's really the only guarantee that the Singleton design pattern makes. The rest is merely convention. So if you use the static accessors described above then you would only need:

 
 