
NX: URLyBird Database Record Number Question

Kerry Friesen
Greenhorn

Joined: Oct 10, 2003
Posts: 23
Greetings,
I'm looking for ideas on how to implement record numbers since the database file does not have a field for them. I can see using the key of a key-value pair in the case of an in-memory (hashtable) representation of the database. If no caching is used, what happens then?
Cheers,
Kerry


Kerry Friesen, SCJP 1.2, SCJD
Philippe Maquet
Bartender

Joined: Jun 02, 2003
Posts: 1872
Hi Kerry,
As our records have a fixed length, you can use the record number to compute each record's position (offset) in the file before accessing it.
Does that answer your question?
Best,
Phil.
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11479
    

Hi Kerry,
I think your question was the opposite of Phil's answer: you are trying to find out what the record number is.
In which case the record number could be the offset into the file. If you have read 4 records (whether they are deleted or not), then the record number of the next record must be record number 5!
Now that you know what the record number is, Phil's explanation comes into play. If a client asks you to read record number 5, you know that you can go to the start of the file, skip the header and schema, then skip 4 records, and you must be at record number 5.
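A minimal sketch of the arithmetic described above, assuming zero-based record numbers (with numbering from 1, as in the "record number 5" example, subtract one first). The class name and both constant values are hypothetical; a real database file supplies its own header size and record length.

```java
// Hypothetical layout values, for illustration only.
final class RecordOffsets {
    static final long DATA_START = 70;   // assumed size of header + schema, in bytes
    static final int RECORD_SIZE = 160;  // assumed fixed size of one record, deleted flag included

    /** File offset of the first byte of record recNo (zero-based). */
    static long offsetOf(int recNo) {
        if (recNo < 0) {
            throw new IllegalArgumentException("negative record number: " + recNo);
        }
        return DATA_START + (long) recNo * RECORD_SIZE;
    }

    /** Record number of the record that starts at offset; the inverse of offsetOf. */
    static int recNoAt(long offset) {
        long relative = offset - DATA_START;
        if (relative < 0 || relative % RECORD_SIZE != 0) {
            throw new IllegalArgumentException("not a record boundary: " + offset);
        }
        return (int) (relative / RECORD_SIZE);
    }
}
```

Deleted records still occupy their slot, which is why they are counted when working out the number of the next record.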
Regards, Andrew


The Sun Certified Java Developer Exam with J2SE 5: paper version from Amazon, PDF from Apress, Online reference: Books 24x7 Personal blog
Philippe Maquet
Bartender

Joined: Jun 02, 2003
Posts: 1872
Hi Andrew,
I think your question was the opposite of Phil's answer: you are trying to find out what the record number is.

Maybe you're right, but IMO, at the db level, we explained exactly the same stuff.
In pseudocode, our explanations give this :

I didn't think of the opposite perspective, because practically clients will get recNos as a result of the find() method.
Best,
Phil.
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11479
    

Hi Phil,
Originally posted by Philippe Maquet:
Maybe you're right, but IMO, at the db level, we explained exactly the same stuff.

Definitely.
But I do need something to argue with you about, since I seem to be losing in that other thread.
Best regards, Andrew
Kerry Friesen
Greenhorn

Joined: Oct 10, 2003
Posts: 23
Andrew and Phil,
Thank-you both for your input. The combination of the two posts answered my question!
Have a great day!
Regards,
Kerry
Philippe Maquet
Bartender

Joined: Jun 02, 2003
Posts: 1872
Hi Andrew,
But I do need something to argue with you about, since I seem to be losing in that other thread.

I can understand that.
Now if you are reluctant to transgress the big "Avoid Public Access to Instance Variables" taboo, be reassured: you'll soon have to argue against me!
Best,
Phil.
[ October 29, 2003: Message edited by: Philippe Maquet ]
Richard Jackson
Ranch Hand

Joined: Jun 25, 2003
Posts: 128
Hi Andrew,
Would you mind if I join your discussion?

If a client asks you to read record number 5, you know that you can go to the start of the file, skip the header and schema, then skip 4 records, and you must be at record number 5.

Yes, I agree with your point.
When we want to read a record, we should skip the required bytes, and I imagine the code looks like this:

But I still have a question about the following:

In the method, does the argument 'recNo' mean 'record number' or 'record count'?
Regards,
Richard


Regards, Richard
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11479
    

Hi Richard,
Originally posted by Richard Jackson:
Would you mind if I join your discussion?

You are very welcome to join any discussion.
Originally posted by Richard Jackson:
When we want to read a record, we should skip the required bytes, and I imagine the code looks like [that shown above]

Yep - that looks right.
Originally posted by Richard Jackson:
But I still have a question about the following:

In the method, does the argument 'recNo' mean 'record number' or 'record count'?

'recNo' is the 'record number' - the number of the record that the client wants to read.
There is no requirement to allow the client to read a count or range of records; clients only read one record at a time.
If you use the Java / C convention that numbering starts at zero, then your find() method will return '0' as the record number for the first record in the database, and '9' for the 10th record in the database. If the client calls read() with recNo = 9, then you will skip 9 records (using something similar to your pseudocode above) and then return the next record read.
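As a hedged illustration of a zero-based read(), the sketch below skips recNo whole records with a single seek and then reads one record. The class name, constructor parameters and raw byte[] return type are assumptions made for this example; the assignment's own interface defines the real read() signature.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

final class RecordReader {
    private final RandomAccessFile file;
    private final long dataStart;  // offset of record 0, i.e. the first byte after the schema
    private final int recordSize;  // bytes per record, deleted flag included

    RecordReader(RandomAccessFile file, long dataStart, int recordSize) {
        this.file = file;
        this.dataStart = dataStart;
        this.recordSize = recordSize;
    }

    /** Returns the raw bytes of record recNo, where the first record is number 0. */
    byte[] read(int recNo) throws IOException {
        if (recNo < 0) {
            throw new IllegalArgumentException("negative record number: " + recNo);
        }
        byte[] record = new byte[recordSize];
        file.seek(dataStart + (long) recNo * recordSize); // skips recNo records in one step
        file.readFully(record);                            // EOFException if recNo is past the end
        return record;
    }
}
```

With recNo = 9 the seek lands on the tenth record, which matches the numbering described above.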
Regards, Andrew
Richard Jackson
Ranch Hand

Joined: Jun 25, 2003
Posts: 128
Hi Andrew,
Thanks for your reply.
I am trying to read each record according to the schema section, and I have followed your advice. The code is shown below:

Is it right? I am not sure.
When I want to write records, what would be different from this?
Would everyone please comment on this piece of code?
Regards,
Richard
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11479
    

Hi Richard,

The basic concept is correct.
Just some general comments though:
  • Rather than using "magic" numbers (numbers that work, but mean nothing to a casual reader of your code), you would be better to have constants in your class with the same values. So you could have "COOKIE_SIZE" instead of the magic number 4 for example.
  • Alternatively, instead of calculating the header and schema size, you could just read the header and schema (you have to do that anyway), then use RandomAccessFile's getFilePointer() method to tell you where you currently are in the file.
  • Instead of doing a seek() to the start of the file, then skipBytes to the start of the data, why not seek() directly to the start of the data?
  • Strictly speaking, the seek is not necessary if you are planning on reading the entire file - it is more valuable later when you want to read a specific record. In which case you probably want to only perform one skip directly to the required record, rather than multiple little skips.
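The first three suggestions could look roughly like the sketch below. The file layout is an assumption for illustration only (4-byte magic cookie, 2-byte field count, then for each field a 2-byte name length, the name, and a 2-byte field length, plus a 1-byte deleted flag per record); check it against your own assignment's format. SchemaReader and its members are hypothetical names, while seek(), readUnsignedShort(), readFully() and getFilePointer() are real RandomAccessFile methods.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

final class SchemaReader {
    // Named constants instead of magic numbers; the values are assumptions.
    static final int COOKIE_SIZE = 4;        // magic cookie, 4 bytes
    static final int DELETED_FLAG_SIZE = 1;  // one flag byte at the start of each record

    final int[] fieldLengths;  // length in bytes of each field, in file order
    final long dataStart;      // offset of the first record

    SchemaReader(RandomAccessFile file) throws IOException {
        file.seek(COOKIE_SIZE);                        // seek directly past the cookie
        int fieldCount = file.readUnsignedShort();     // assumed 2-byte field count
        fieldLengths = new int[fieldCount];
        for (int i = 0; i < fieldCount; i++) {
            byte[] name = new byte[file.readUnsignedShort()]; // assumed 2-byte name length
            file.readFully(name);                      // field name, unused in this sketch
            fieldLengths[i] = file.readUnsignedShort();       // assumed 2-byte field length
        }
        // No size arithmetic needed: the file pointer now sits on the first record.
        dataStart = file.getFilePointer();
    }

    /** Size of one record in bytes: the deleted flag plus every field. */
    int recordSize() {
        int size = DELETED_FLAG_SIZE;
        for (int length : fieldLengths) {
            size += length;
        }
        return size;
    }
}
```

Once dataStart and recordSize() are known, a later read of a specific record needs only one seek.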


  • For the next section:

    You need to read the deleted flag before you start reading fields. There is only one deleted flag, not one per field.
    This means that your handling of the deleted flag status must move as well. You want to skip reading the individual fields if the entire record has been deleted.
    You are missing code in the case where the record has been deleted - you will have to skip over the current record.
    Actually I just realised - the code within your loop will read entire records, however the loop itself will only work for the number of fields in the record. So your counters inside the for() statement are incorrect.
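One way to arrange the loops so that these points hold is sketched below: the outer loop runs once per record, the deleted flag is read once at the start of each record, and the inner loop runs once per field. It works on an in-memory byte array to stay self-contained. The class name, the 1-byte flag and the flag value 1 for "deleted" are assumptions, not taken from the assignment.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class RecordScanner {
    static final byte DELETED = 1; // assumed flag value marking a deleted record

    /** Returns the field values of every live record in data (records only, no header). */
    static List<String[]> liveRecords(byte[] data, int[] fieldLengths) {
        int recordSize = 1;                        // assumed 1-byte deleted flag
        for (int length : fieldLengths) {
            recordSize += length;
        }
        List<String[]> result = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(data);
        while (buf.remaining() >= recordSize) {    // outer loop: one pass per record
            int next = buf.position() + recordSize;
            byte flag = buf.get();                 // one flag per record, read before any field
            if (flag != DELETED) {
                String[] fields = new String[fieldLengths.length];
                for (int i = 0; i < fields.length; i++) {  // inner loop: one pass per field
                    byte[] raw = new byte[fieldLengths[i]];
                    buf.get(raw);
                    fields[i] = new String(raw, StandardCharsets.US_ASCII).trim();
                }
                result.add(fields);
            }
            buf.position(next);                    // for a deleted record this skips its fields
        }
        return result;
    }
}
```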
    Regards, Andrew
    Richard Jackson
    Ranch Hand

    Joined: Jun 25, 2003
    Posts: 128
    Hi Andrew,
    Thanks for your sincere advice.
    Following your points above:
    you would be better to have constants in your class with the same values.

    I agree.
    So I tried again after changing the magic numbers to constants.

    Is it complete and correct now?
    you could just read the header and schema (you have to do that anyway), then use RandomAccessFile's getFilePointer() method to tell you where you currently are in the file.

    I agree.
    Instead of doing a seek() to the start of the file, then skipBytes to the start of the data, why not seek() directly to the start of the data?

    You mean that I should seek directly to the start of the current record?
    But I am confused by this. Should I give up using the seek() method?
    In which case you probably want to only perform one skip directly to the required record, rather than multiple little skips.

    Yes, you are right.
    Actually, you mean that we need to skip a variable number of bytes so as to read any record. OK?
    You need to read the deleted flag before you start reading fields. There is only one deleted flag, not one per field.
    This means that your handling of the deleted flag status must move as well. You want to skip reading the individual fields if the entire record has been deleted.
    You are missing code in the case where the record has been deleted - you will have to skip over the current record.
    Actually I just realised - the code within your loop will read entire records, however the loop itself will only work for the number of fields in the record. So your counters inside the for() statement are incorrect.

    Indeed, my counters are NOT correct.
    How should I decide the loop counter?
    Each record contains some bytes, so many people use a byte array or char array. Are they required?
    Regards,
    Richard
    Andrew Monkhouse
    author and jackaroo
    Marshal Commander

    Joined: Mar 28, 2003
    Posts: 11479
        

    Hi Richard,
    Andrew:
    you could just read the header and schema (you have to do that anyway), then use RandomAccessFile's getFilePointer() method to tell you where you currently are in the file.
    Richard:
    I agree.

    If you are agreeing with that, then why are you changing your code above (the one with the "MAGIC_COOKIE+NUM_OF_FIELDS..." in it)?
    Andrew:
    Instead of doing a seek() to the start of the file, then skipBytes to the start of the data, why not seek() directly to the start of the data?
    Richard:
    You mean that I should seek directly to the start of the current record?
    But I am confused by this. Should I give up using the seek() method?

    What I am suggesting is that
  • instead of:
    seek(start of file)
    skip(header)
    skip(schema)
    skip(record 1)
    ...
    skip(record n)
  • You could have:
    calculate header + schema + (record size * number of records)
    seek(calculated bytes)

    Richard:
    Actually, you mean that we need to skip a variable number of bytes so as to read any record. OK?

    Correct.
    Richard:
    Indeed, my counters are NOT correct.
    How should I decide the loop counter?

    You have two options here.
  • Calculate how many records there should be in the file, and loop over them. You should be able to work this out since you know how big each part of the file should be and you can work out how big the file is.
  • Don't worry about when to stop the loop - just keep going until you get an exception because of the end of file
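Both options can be sketched briefly. The class and method names are hypothetical; the first method is plain arithmetic on sizes you already know, and the second relies on the documented behaviour of readFully(), which throws EOFException when the input ends before the buffer is filled.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

final class RecordCount {
    /** Option 1: work out the number of whole records from the file length. */
    static long fromFileLength(long fileLength, long dataStart, int recordSize) {
        if (recordSize <= 0 || fileLength < dataStart) {
            throw new IllegalArgumentException("inconsistent sizes");
        }
        return (fileLength - dataStart) / recordSize;
    }

    /** Option 2: keep reading records until the end of the input is reached. */
    static int untilEndOfFile(DataInputStream in, int recordSize) throws IOException {
        if (recordSize <= 0) {
            throw new IllegalArgumentException("record size must be positive");
        }
        byte[] record = new byte[recordSize];
        int count = 0;
        try {
            while (true) {
                in.readFully(record); // throws EOFException on a missing or partial record
                count++;
            }
        } catch (EOFException endOfData) {
            return count;             // the normal way out of the loop in this variant
        }
    }
}
```

The first option makes the loop bound explicit; the second uses an exception for ordinary control flow, which some readers may find harder to follow.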


    Richard:
    Each record contains some bytes, so many people use a byte array or char array. Are they required?

    I don't think so.
    Regards, Andrew
    Richard Jackson
    Ranch Hand

    Joined: Jun 25, 2003
    Posts: 128
    Thank you, Andrew.
    You could have:
    calculate header + schema + (record size * number of records)
    seek(calculated bytes)

    As you said, I can work out the calculated bytes. I believe I can skip to the required location.
    But what should I do to read each record as a String[]?
    If I write code like this,

    How do I get a String array for the record?
    Regards,
    Richard
    Philippe Maquet
    Bartender

    Joined: Jun 02, 2003
    Posts: 1872
    Hi Richard,
    I still think that you should review the Java basics before going on with this assignment. But as I regret my last post to you one month ago (I was a bit rude to you, and I am so sorry about it), and as Andrew should be asleep at the time I am reading this (Andrew is in Sydney, at +16 hours compared with JavaRanch time, while I am "just" +8), I'll reply to this question myself:
    How do I get a String array for the record?

    First of all, think of this:
  • You read a whole record as an array of bytes (that's OK, and it is the input of the next point).
  • You need to get an array of String field values.

    After having read the header part of the file, you know the number of fields and how many bytes each field uses (the field lengths).
    Taking this information into account, you can:
  • allocate a String[] array: String[] fieldValues = new String[numberOfFields];
  • assign to each element of this array a String that you get by converting that field's bytes to a String, as you do above for the whole record.

    Mmh... While writing that, you made me think of the optimization that could come from doing the bytes-to-String conversion only once per record ... It would bring design pros and cons anyway.
    So I will let you think of it by yourself ...
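For readers who want to compare notes afterwards, here is a hedged sketch of both ideas: converting each field's bytes separately, and converting the whole record once and then cutting substrings. FieldSplitter and its method names are hypothetical, the record is assumed to contain only field bytes (deleted flag already removed), and a single-byte charset (US-ASCII here) is assumed so that byte offsets and character offsets coincide.

```java
import java.nio.charset.StandardCharsets;

final class FieldSplitter {
    /** Converts each field's bytes to a String separately. */
    static String[] perField(byte[] record, int[] fieldLengths) {
        String[] fieldValues = new String[fieldLengths.length];
        int offset = 0;
        for (int i = 0; i < fieldLengths.length; i++) {
            fieldValues[i] =
                new String(record, offset, fieldLengths[i], StandardCharsets.US_ASCII).trim();
            offset += fieldLengths[i];
        }
        return fieldValues;
    }

    /** Converts the whole record once, then takes one substring per field. */
    static String[] convertOnce(byte[] record, int[] fieldLengths) {
        String whole = new String(record, StandardCharsets.US_ASCII);
        String[] fieldValues = new String[fieldLengths.length];
        int offset = 0;
        for (int i = 0; i < fieldLengths.length; i++) {
            // Valid only because one byte maps to exactly one char in this charset.
            fieldValues[i] = whole.substring(offset, offset + fieldLengths[i]).trim();
            offset += fieldLengths[i];
        }
        return fieldValues;
    }
}
```

The trim() call strips trailing spaces and null bytes used as padding; whether padding should be stripped at this level is itself a design choice.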
    Best,
    Phil.
    Paul F. Williams
    Greenhorn

    Joined: Nov 04, 2003
    Posts: 6
    Given the use of file offset as a record number, consider the following scenario:
    1) Client A retrieves record 1
    2) Client B retrieves record 1
    3) Client A deletes record 1
    4) Client A inserts new record, which goes in the record 1 position
    5) Client B updates record 1
    On step 5, if someone else deleted a record, I shouldn't be allowed to update that record; the database should notify me that the record was deleted (RecordNotFoundException).
    I know the URLyBird specification does not require a feature to delete or insert records from a client. Nevertheless, the above scenario is something to consider.


    Paul F. Williams, SCJP July 2003
    Jim Yingst
    Wanderer
    Sheriff

    Joined: Jan 30, 2000
    Posts: 18671
    Paul - does your assignment require some sort of lock() method? At what point in the above scenario would this be called? Is there a way you can write the client code to use locking to ensure that the record it updates is the same one it has read?


    "I'm not back." - Bill Harding, Twister
    Paul F. Williams
    Greenhorn

    Joined: Nov 04, 2003
    Posts: 6
    Originally posted by Jim Yingst:
    Paul - does your assignment require some sort of lock() method? At what point in the above scenario would this be called? Is there a way you can write the client code to use locking to ensure that the record it updates is the same one it has read?

    I'm doing the URLyBird assignment; it does require locking. In this assignment, all database operations revolve around record numbers, but the assignment does not define what a record number is. If you define a record number as being the offset in the file, you have to beware of multiple clients accessing the same record at the same time.
    One important design decision is when to lock the records. If you use pessimistic locking, you lock the record early so nobody can write to it. If you vote for optimistic locking, you lock the record at the last possible minute.
    I decided that the records should be locked only when absolutely necessary. In other words, my clients can read any record without a lock, but to update, they do a tight lock/update/unlock transaction.
    Including locking, the scenario I described above becomes:
    1) Client A reads record 1
    2) Client B reads record 1
    3) Client A locks record 1
    4) Client A deletes record 1
    5) Client A inserts new record which goes into slot 1
    6) Client B locks record 1
    7) Client B updates record 1
    8) Client B unlocks record 1
    At step 7), Client B thinks it is updating the old record, but in reality, that record was deleted. If you use the file offset as record number, the database can't tell the difference between the old record and the new record.
    I decided that clients should not be able to lock records that were deleted. Therefore, my record numbers had to be unique and unrelated to the file position.
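The thread does not show how this was implemented. One possible in-memory sketch, under the assumption that record numbers are handed out from a counter and mapped to reusable file slots, is given below. RecordIds and its methods are hypothetical, and NoSuchElementException merely stands in for the assignment's RecordNotFoundException.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

final class RecordIds {
    private final Map<Long, Integer> slotById = new HashMap<>();
    private final Deque<Integer> freeSlots = new ArrayDeque<>();
    private long nextId = 1;   // record numbers are never reused
    private int nextSlot = 0;  // next unused position at the end of the file

    /** Creates a record, reusing a deleted slot if one exists, and returns its new number. */
    synchronized long create() {
        int slot = freeSlots.isEmpty() ? nextSlot++ : freeSlots.pop();
        long id = nextId++;
        slotById.put(id, slot);
        return id;
    }

    /** Deletes a record; its slot becomes reusable but its number stays retired. */
    synchronized void delete(long id) {
        freeSlots.push(slotOf(id));
        slotById.remove(id);
    }

    /** File slot of a live record; fails for numbers that were deleted or never issued. */
    synchronized int slotOf(long id) {
        Integer slot = slotById.get(id);
        if (slot == null) {
            throw new NoSuchElementException("record " + id + " does not exist");
        }
        return slot;
    }
}
```

In the scenario above, Client B's attempt to lock the old number would fail at slotOf(), even though the slot itself now holds a new record. The mapping lives only in memory in this sketch, so it would have to be rebuilt or persisted across restarts.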
    I'm not saying my decisions are the best. By all means, choose your own design and defend it. I want to get people thinking.
    Andrew Monkhouse
    author and jackaroo
    Marshal Commander

    Joined: Mar 28, 2003
    Posts: 11479
        

    Hi Paul,
    Some things to consider.
  • many of the specifications do not require you to reuse the space of the deleted records - the wording in the documentation for the create method is often "possibly reusing a deleted entry"
  • you could require all updates to provide all fields for the record being updated (bit of a waste of bandwidth though) - you could then restrict the number of fields which may be updated at any one time (e.g. no more than one or two fields may be updated in any single update), which then allows you to validate whether the current record on file matches the record the customer just sent for updates.
    You would have to check whether your documentation for the update() method can allow you to make that requirement (must send all fields) and allows you to throw an exception if the record has changed
  • just document the mess.
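The second suggestion can be sketched as a check made while the record lock is held: the client submits every field, and the update is accepted only if at most a small number of fields differ from what is currently on file. CheckedUpdate, its method and the limit of two changed fields are assumptions for illustration; whether update() may reject a request this way depends on your own assignment's documentation.

```java
final class CheckedUpdate {
    static final int MAX_CHANGED_FIELDS = 2; // assumed limit, "no more than one or two fields"

    /**
     * Applies submitted to onFile only if the two differ in at most MAX_CHANGED_FIELDS
     * fields; otherwise the record on file is taken to be a different record.
     * Returns true if the update was applied.
     */
    static boolean update(String[] onFile, String[] submitted) {
        if (onFile.length != submitted.length) {
            throw new IllegalArgumentException("all fields must be supplied");
        }
        int changed = 0;
        for (int i = 0; i < onFile.length; i++) {
            if (!onFile[i].equals(submitted[i])) {
                changed++;
            }
        }
        if (changed > MAX_CHANGED_FIELDS) {
            return false; // too many differences: the slot probably holds another record
        }
        System.arraycopy(submitted, 0, onFile, 0, onFile.length);
        return true;
    }
}
```

A rejected update would then be reported to the client, for example with whatever exception the interface allows.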


    Regards, Andrew
    [ November 04, 2003: Message edited by: Andrew Monkhouse ]
     