JavaRanch » Java Forums » Certification » Developer Certification (SCJD/OCMJD)

NX: Primary Key and Immutable Key

Javini Javono
Ranch Hand

Joined: Dec 03, 2003
Posts: 286
Hi,
My design follows these conventions: I expose Data (as the interface
DBMain) to the client, and the client's business methods call lock() and
unlock() as appropriate.
I was reviewing and augmenting my JUnit tests of Data when I noticed
a bad smell in my preliminary understanding of what the
DuplicateKeyException implied. Before coming to my final conclusions,
I also wanted to review a post which I felt, based upon intuition, might
be related:
http://www.coderanch.com/t/184840/java-developer-SCJD/certification/NX-Questions-deleted-records
Here are my current ideas. (They are not meant as a response to the
above link; I investigated that link only to be sure that I didn't leave
anything out. Of course, if there are flaws in the following ideas, please
let me know.)
Because I expose Data to the client, this means that I design and test the
database independent of my particular project (URLyBird). The database I
design and test, and the Data object given to the client would work for
any project (URLyBird, contractors, and projects that are not yet defined).
Although my specific implementation may vary, the general concepts concerning
keys are now outlined [of course, I present these ideas for two reasons: my
ideas may be useful to others, my ideas may be flawed and thus may be corrected.]
There are two concepts: PrimaryKey and ImmutableKey.
Each key is defined by zero or more fields. PrimaryKey and
ImmutableKey are equivalent in that whatever fields one defines are identically defined
in the other. [Note: I forgot to mention, however, that a PrimaryKey has a state: on
or off; if the PrimaryKey is off, then it is not enforced, and a DuplicateKeyException
is never raised within the create() method.]
Example: if PrimaryKey.toString() = fields 0 and 2, then ImmutableKey.toString() is also
fields 0 and 2. If PrimaryKey.toString() = no fields, then ImmutableKey.toString()
= no fields as well.
Functions:
The PrimaryKey is used during the create() method.
The ImmutableKey is used during the update() method.
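The two concepts might be sketched as follows. This is purely my own illustration: the class names, the field-index representation, and the on/off flag are hypothetical, not part of Sun's supplied interface.

```java
// Hypothetical sketch: a key is just a set of field indices, and the
// PrimaryKey adds an on/off state; when off, create() never raises
// DuplicateKeyException.
class KeyDefinition {
    private final int[] fieldIndices;   // e.g. {0, 2}, or {} for a trivial key

    KeyDefinition(int... fieldIndices) {
        this.fieldIndices = fieldIndices.clone();
    }

    // Concatenate the key fields of a record into one comparable string.
    String keyOf(String[] record) {
        StringBuilder sb = new StringBuilder();
        for (int i : fieldIndices) {
            sb.append(record[i]).append('|');
        }
        return sb.toString();
    }
}

class PrimaryKey extends KeyDefinition {
    private final boolean enforced;     // the "on or off" state

    PrimaryKey(boolean enforced, int... fieldIndices) {
        super(fieldIndices);
        this.enforced = enforced;
    }

    boolean isEnforced() {
        return enforced;
    }
}
```

The equivalence rule above then just says that the PrimaryKey and the ImmutableKey are built from the same field indices.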
PrimaryKey
---------
If the PrimaryKey is not trivially defined (i.e., if it
defines at least one field), then the job of the PrimaryKey
is as you would expect in standard relational databases:
when a record is created, before that record is created,
every extant record must be read to determine that the
PrimaryKey fields are not duplicated. A DuplicateKeyException
is thrown if the primary key (as defined by the concatenation
of the fields which comprise that primary key) already exists
within the database. This search might be optimized by refining
the reading methods to read only primary key fields.
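The duplicate check during create() might look like the following sketch. It is hypothetical: the record store is modeled as an in-memory list of String[] rows with null marking a deleted row, standing in for Sun's record file, which is not shown.

```java
import java.util.ArrayList;
import java.util.List;

class DuplicateKeyException extends Exception {
    DuplicateKeyException(String msg) { super(msg); }
}

// Hypothetical in-memory stand-in for the record file.
class MiniData {
    private final List<String[]> records = new ArrayList<>();
    private final int[] primaryKeyFields;   // empty array = trivial key, never enforced

    MiniData(int... primaryKeyFields) {
        this.primaryKeyFields = primaryKeyFields.clone();
    }

    private String keyOf(String[] record) {
        StringBuilder sb = new StringBuilder();
        for (int i : primaryKeyFields) sb.append(record[i]).append('|');
        return sb.toString();
    }

    // Before creating, read every extant (non-deleted) record and verify
    // that the concatenated primary-key fields are not duplicated.
    public int create(String[] data) throws DuplicateKeyException {
        if (primaryKeyFields.length > 0) {
            String newKey = keyOf(data);
            for (String[] rec : records) {
                if (rec != null && keyOf(rec).equals(newKey)) {
                    throw new DuplicateKeyException("primary key already exists: " + newKey);
                }
            }
        }
        records.add(data.clone());
        return records.size() - 1;
    }
}
```

The full scan over extant records is what the optimization mentioned above (reading only the primary-key fields) would speed up.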
While a PrimaryKey has no particular function in the URLyBird
project, it is easy to imagine a database containing names
and addresses, and you might define the name field as a primary
key which you would not want duplicated (this lessens the
confusion of the user if the user's Aunt is not listed five times
within the database).
ImmutableKey
------------
The ImmutableKey defines fields which cannot be modified during
an update operation. So, if a person books a hotel room that
contains 20 beds for a convention, the number of beds must be
immutable, otherwise the contract with the person who made the
booking could potentially be broken.
Interestingly enough, even though the URLyBird project has no
need for a PrimaryKey, it definitely needs an ImmutableKey which
is the hotel, location, and all the attributes of the room being rented.
That is, all the fields but the last (which contains the ID of the person
who booked the room).
Notice that when I say that an ImmutableKey is required, I don't mean
that you must implement an ImmutableKey object; I only mean that
your implementation must have the same effect as if you had an
ImmutableKey object as defined next.
Assuming that an ImmutableKey object is implemented, here are the
two possible implementation strategies: A and B.
A
-----
Rule: Your database is not allowed to make an inactive record become
active in place. That is, if record 100 is deleted (it is inactive), you
can never make it active again (until the server shuts down, at which time
you can compact the database). The reason for this is that this algorithm
ignores immutable fields, so if the immutable fields changed, the algorithm
would be updating a completely different record, and it would have no way
of knowing this.
It is my understanding that one is not required to make an inactive record
active in place; the instructions say, "Creates a new record in the database
(possibly reusing a deleted entry)."
Though perhaps I am misreading this Sun directive?
Perhaps it is saying, "if you can re-use a deleted entry, then do so"?
This algorithm ignores all incoming fields of the incoming record which
are immutable. Any incoming field which is immutable could potentially be
passed in as null or as an empty string. The algorithm only focuses
on the incoming fields which are mutable, and it writes out only, and all, the
mutable fields to the record it is updating (assuming that the mutable
fields are not null, empty, and the like, which would otherwise cause
an IllegalArgumentException).
In short: this algorithm does not read the record it is updating; it only
writes the mutable fields to this record it is updating.
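Plan A, as described, might be sketched like this. The sketch is hypothetical: immutable fields are marked by a boolean[] mask, and the stored record is a plain in-memory array standing in for a row of the record file.

```java
// Plan A sketch: update() never reads the record; it writes only the
// mutable fields and ignores the (possibly null) immutable positions.
class PlanAUpdater {
    static void update(String[] stored, String[] incoming, boolean[] immutable) {
        for (int i = 0; i < stored.length; i++) {
            if (!immutable[i]) {
                String v = incoming[i];
                if (v == null || v.isEmpty()) {
                    throw new IllegalArgumentException("mutable field " + i + " missing");
                }
                stored[i] = v;          // write only the mutable fields
            }
            // Immutable fields are never touched or verified, which is why,
            // under this plan, a deleted record must never be reactivated
            // in place: the algorithm would have no way of noticing.
        }
    }
}
```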
B
-----
This algorithm allows the create() method to make a deleted
(inactive) record active in place (i.e., write a new record in the same location
where a previous, deleted record once existed).
This algorithm must read the record before updating it. This is because the
fields of the record are no longer strictly immutable if you allow a previously
deleted record to be made active in place.
This algorithm only reads the immutable fields of the record and compares them
to the corresponding fields of the incoming record. If they match in a manner
that you consider to be equivalent, then and only then is it appropriate to
continue, otherwise,
1. If the database is designed so that in place recreation of records is
allowed, throw a RecordNotFoundException (since the record you
were looking for, characterized by its immutable fields, no longer exists).
2. If the database is designed so that in place recreation of records is
not allowed, ...? (not sure where point 2 might lead).
Assuming that all the immutable fields comprising the ImmutableKey match with
the corresponding fields of the incoming record, then and only then is the
record updated. First attempts may simply write the complete record out anew;
more subtle, subsequent implementations might only write out the mutable
fields (to save time).
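Plan B's verified update might be sketched as follows. The names are hypothetical (my RecordNotFoundException merely stands in for Sun's exception of the same name), and for simplicity the "first attempt" strategy of writing the complete record out anew is shown.

```java
class RecordNotFoundException extends Exception {
    RecordNotFoundException(String msg) { super(msg); }
}

class PlanBUpdater {
    // Read the immutable fields first; only if they match the incoming
    // record's immutable fields is the update carried out. Otherwise the
    // record we were looking for, characterized by its immutable fields,
    // no longer exists at this location.
    static void verifiedUpdate(String[] stored, String[] incoming, boolean[] immutable)
            throws RecordNotFoundException {
        for (int i = 0; i < stored.length; i++) {
            if (immutable[i] && !stored[i].equals(incoming[i])) {
                throw new RecordNotFoundException(
                    "immutable field " + i + " differs: " + stored[i] + " vs " + incoming[i]);
            }
        }
        // First attempt: simply write the complete record out anew; a more
        // subtle implementation might write only the mutable fields.
        System.arraycopy(incoming, 0, stored, 0, stored.length);
    }
}
```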
Thanks,
Javini Javono
P.S.
After writing the above, it came to me that, yes, I shut down my server all the time.
But, a production server, perhaps handling clients around the globe is running all
the time. Thus, I should design my server to attempt to stay running as long as
possible, and attempt to in place create new records where previously
deleted records existed. Thus, given that I will carry out this strategy, I will be
compelled to follow plan B as outlined above.
[ March 15, 2004: Message edited by: Javini Javono ]
Don Wood
Ranch Hand

Joined: Dec 05, 2003
Posts: 65
Hi Javini,
If I understand what you are describing in plan B, then I do not seem to agree with all of it.

This algorithm only reads the immutable fields of the record and compares them to the corresponding fields of the incoming record. If they match in a manner that you consider to be equivalent, then and only then is it appropriate to continue, otherwise, 1. If the database is designed so that in place recreation of records is allowed, throw a RecordNotFoundException (since the record you were looking for, characterized by its immutable fields, no longer exists).

If the record is deleted, you do not care whether the rest of the fields are different from what you expected. The record can be overwritten. Throwing a RecordNotFoundException for an unexpectedly modified deleted record does not make sense. If the record has been overwritten by another thread so it is no longer deleted, you still do not want to throw an exception. Instead, you must look for another place to write the new record (either find another deleted record or add to the end of the file).
Javini Javono
Ranch Hand

Joined: Dec 03, 2003
Posts: 286
Hi,
My remarks may be right, may be wrong, or may be ambiguously stated.
(of course, I do not make wrong and ambiguous statements on purpose,
but there is no doubt that I make wrong and ambiguous statements).
I appreciate you attempting to find errors in my logic. The reason I have
not gone over my statements at this time is that I'm busy doing extensive
JUnit testing; and, these tests, written in Java, and not in the ambiguous
world of English, are what I rely on.
I will bookmark this link, and return to it when I have more time. Then I
will attempt to transform your observation of a potential fault in my
design logic into a JUnit test.
The above presented two concepts which I have found quite useful (though
things have changed once I got to the implementation details).
The two major concepts I outlined above I have subsequently found important
in deactivating subtle, insidious land mines.
Thanks,
Javini Javono
[ March 17, 2004: Message edited by: Javini Javono ]
Javini Javono
Ranch Hand

Joined: Dec 03, 2003
Posts: 286
Originally posted by Don Wood:

Hi Javini,
If I understand what you are describing in plan B, then I do not seem to agree with all of it.

If the record is deleted, you do not care whether the rest of the fields are different from what
you expected. The record can be overwritten. Throwing a RecordNotFoundException for an
unexpectedly modified deleted record does not make sense. If the record has been overwritten
by another thread so it is no longer deleted, you still do not want to throw an exception. Instead,
you must look for another place to write the new record (either find another deleted record or
add to the end of the file).


Hi,
As I said before, thanks for taking the time to read my post and respond to it.
I have bookmarked this thread, and I have further written in bold to review all
my policies.
In general, I think that we are responsible for making up the rules, and the rules
we define must make sense.
There are two universes where we make rules:
Universe1: the raw database as a concept standing by itself.
Universe2: the business logic, which uses the rules of the raw database. The business
logic has its own rules, so that when these rules are applied to manipulating the
raw database, everything is logically consistent.
So, let's first investigate the raw database rules.
1. In order for a record to be deleted, it must be locked first.
Now, the database can have two different rules for creating new records:
1. It may over-write a previously deleted record, or
2. It never over-writes a previously deleted record, but always writes
new records at the end of the file.
I am not sure about this, but we may have a choice as to which policies
we will allow in our database.
At this current time, I allow a new record to be created such that it
can exist in the same physical location as a previously deleted record.
It is possible, that allowing a new record to be written where a previously
deleted record existed is a logical flaw in and of itself. As I continue to
work, I will consider this as a possibility.
Let it be assumed that the raw database universe has no rules of
usage per se. That is, the only rules are enforced by the methods
throwing exceptions or by the methods carrying out the command.
1. Before a record is deleted, it must be locked.
This rule is straightforward, in that when a record is mutated, only one client
can mutate it at a time.
2. Before a record is deleted, you must prove that the immutable key of that
record is known to you.
This rule, which may not cover all logical holes, does cover some. For instance,
1. Record 100 contains the record with the immutable key "ABCDG".
2. There exist no macro usage rules for the database, the database either carries
out the command or it throws an exception. Thus, there are no rules which state
that you must first lock record 100, read it, verify it, only then delete it, and then
unlock it.
3. Therefore, ClientA may delete record 100, then create a new record, which the
system, by chance, places back into record location 100, but this new record has
a different immutable key of "ABCDH". ClientB, having previously read record 100
and believing it to have an immutable key of "ABCDG", now does a blind delete
of record 100: delete(100), and ClientB has just deleted the wrong record.
4. In my low-level database universe of rules, doing a blind delete on a record,
within a database which allows newly created records to overwrite previously deleted
records, throws an exception. Instead, under these circumstances, ClientB must
use a verified delete method thus: delete(100, fields), and only if the immutable
fields of the argument match the immutable fields of record 100 is the record deleted.
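The verified delete of point 4 could be sketched like so. It is hypothetical: the record store is modeled as an array with null marking a deleted slot, and IllegalStateException stands in for whatever exception the real design would throw.

```java
class VerifiedDeleteExample {
    // delete(recNo, fields): the delete proceeds only if the immutable
    // fields of the argument match the immutable fields of the stored
    // record, so a client holding a stale view (e.g. ClientB still
    // believing record 100 is "ABCDG") cannot blindly delete the wrong
    // record after the slot was re-used for "ABCDH".
    static void delete(String[][] records, int recNo, String[] fields, boolean[] immutable) {
        String[] stored = records[recNo];
        if (stored == null) {
            throw new IllegalStateException("record " + recNo + " already deleted");
        }
        for (int i = 0; i < stored.length; i++) {
            if (immutable[i] && !stored[i].equals(fields[i])) {
                throw new IllegalStateException(
                    "record " + recNo + " no longer matches its expected immutable key");
            }
        }
        records[recNo] = null;   // mark the slot deleted (re-usable)
    }
}
```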
In short, and having discovered this new terminology in thinking about your question,
my low-level pure database rules are not multi-step in nature, they are single-step rules.
You either can or you cannot do some particular action. And, when you do this particular
action, the record either must be locked or is allowed not to be locked.
The danger is that these single-step rules are not foolproof; and, if they are not, or
if I have any doubt about them being foolproof, then I will have to define a new rule
like this: newly created records are always written to new locations at the end of the
file.
As another example of a low-level database rule: when a record is updated, the fields
passed in as an argument must have the identical immutable key as the record that is being
updated. As you can see, this is similar to the verified delete method.
I'm not saying that what I have written is the standard treatment done by others
(I'm not even sure this has been written about in this way before). And
I'm not even sure that allowing new records to over-write previously deleted
records is 100% foolproof.
But, I hope I have addressed your question to some extent. And, again, I appreciate
you taking the time to read my post and respond with your ideas.
Finally, concerning the creation of new records. If the policy that allows a new
record to over-write a previously deleted record is active, then the database
does not care at all about the contents of a deleted record; if the record is
deleted, then it is locked, and then it is written over (and its contents are
not important). If no re-usable records can be found (that is, the database
contains no deleted records), then the new record is created at the end of
the file.
Thanks,
Javini Javono
Javini Javono
Ranch Hand

Joined: Dec 03, 2003
Posts: 286
Hi,
Here are further notes expanding on the notion of whether or not allowing
a newly created record to over-write a previously deleted record, given how
my database universe is defined above, makes sense.
It's conceivable that it does not make sense. It's conceivable that it would
only make 100% sense if there was another field which contained a uniquely
generated integer value. Whenever a new record is created, one field is
dedicated to containing this uniquely generated integer value, and every
record in the database, past and present, contains this field containing
this uniquely generated number.
Thus, for a record containing an immutable field, represented conceptually
as "ASDFG", part of its immutable field would be a "hidden" unique integer,
thus, the immutable field would really be:
"00001-ASDFG".
When someone deleted this record, and a new record with the same immutable
field, conceptually given as "ASDFG", was then written over it, the true
immutable field in this new record would now be:
"00002-ASDFG"
In this way, every record is always uniquely defined by its immutable field.
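The hidden sequence number could be sketched like this. It is purely hypothetical, since Sun's file format has no spare field to hold it; the class name and the five-digit formatting are my own illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical: each create() stamps the record with a unique,
// ever-increasing sequence number, so two records with identical
// visible fields still have distinct immutable keys
// ("00001-ASDFG" vs "00002-ASDFG").
class SequencedKeyFactory {
    private final AtomicLong nextId = new AtomicLong(1);

    String stamp(String visibleImmutableKey) {
        return String.format("%05d-%s", nextId.getAndIncrement(), visibleImmutableKey);
    }
}
```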
However, our database does not have an extra field to place this unique
integer value. Thus, to allow a new record to over-write a previously
deleted record could, perhaps, be illogical given that my low-level
database universe does not define conceptual rules, only operations.
So, here is what I will probably do. I will state that allowing new records
to over-write previously existing records may not be logically sound, and
that I don't know the philosophical or logical answer to this question.
It is the customer's responsibility to configure the database in a safe
manner, and the customer does this by setting up the server and stating
what the policy will be by choosing one of the following choices from the
preferences panel:
1. The database will allow the creation of new records to over-write
previously deleted records.
2. The database will only write new records at the end of the file.
The customer will be reminded that the database performs micro-operations
and does not involve contextual rules (besides insisting that for
some operations the record be locked), and that it is probably safer
to set up the database so that newly created records are always created
at the end of the file.
Thanks,
Javini Javono
Don Wood
Ranch Hand

Joined: Dec 05, 2003
Posts: 65
Hi Javini,
You said:

Finally, concerning the creation of new records. If the policy that allows a new record to over-write a previously deleted record is active, then the database does not care at all about the contents of a deleted record; if the record is deleted, then it is locked, and then it is written over (and its contents are not important). If no re-usable records can be found (that is, the database contains no deleted records), then the new record is created at the end of the file.

This is the point I was making so it seems we agree.

So, here is what I will probably do. I will state that allowing new records
to over-write previously existing records may not be logically sound, and
that I don't know the philosophical or logical answer to this question. It is the customer's responsibility to configure the database in a safe
manner, and the customer does this by setting up the server and stating
what the policy will be by choosing one of the following choices from the
preferences panel:
1. The database will allow the creation of new records to over-write
previously deleted records.
2. The database will only write new records at the end of the file.

I'm not sure how a reviewer will take this. It sounds like you have implemented a database that you are not sure is safe. You are then asking a customer to make a decision about safety when the customer does not know anything about the implementation of the database. He certainly does not know as much about the implementation as the developer knows.
We are asked to make decisions as part of this process. As presented, this seems to be a decision not to decide but instead put the responsibility on the customer.
But there is another interesting point here. I'm guessing that the unstated philosophical issue is whether or not data that is deleted should actually be destroyed. If you don't write over it, then it could be mined/restored in some useful form in the future. I think you can present the configuration option as a way to recover historical information. The reviewer will take a much kinder view of that than if it is presented in terms of safety.
Javini Javono
Ranch Hand

Joined: Dec 03, 2003
Posts: 286
Hi,
I am actually saying that I don't know the following:
I don't know with certainty that it is logically sound to allow the database
to over-write previously deleted records given the way I have defined
my database world: i.e., there exist no contextual rules, only operations,
and some operations require that the record be locked for the
operation to be carried out.
When the database policy allows newly created records to be written
over previously deleted records, I have closed one possible logical
error by insisting that any delete must be a verified delete.
Only if the database forbids creates from overwriting, so that newly
created records are always appended at the end of the file, do I
allow a blind delete.
Furthermore, I'm saying that if creates are allowed to over-write
previously deleted records, I am not, at this time, 100% certain
that this policy is logically sound. That is, there may be some
loophole in the logic which I have not seen yet.
Thanks,
Javini Javono
Javini Javono
Ranch Hand

Joined: Dec 03, 2003
Posts: 286
Hi,
Here is the potential logical loophole or failure that might be lurking
in my operationally-driven database design:
Under ideal conditions, every new record would have a field containing an
integer value which is unique. Thus, if the immutable field of one record
were identical to the immutable field of another record as far as logical
content goes, the actual immutable key would be:
Record 100 has this immutable field:
"0001-ASDFG" or in plain English, "0001-ZapHotel-ZippyTown-...et cetera."
Record 200 has this immutable field:
"0002-ASDFG" or in plain English, "0002-ZapHotel-ZippyTown-...et cetera."
The immutable fields are always unique, even though the hotel, town,
and other characteristics of the room for rent are all identical.
Under this circumstance, there would be no logical holes in my design.
But, given the database as Sun has created it, there is no unique integer
field, and thus there could exist logical holes in my design which I need
to investigate further.
Again, all this comes about based upon my premise: the database operations
are singleton operations, there are no contextual usage rules (except that
for some operations a record may need to be locked).
So, here is an example, and the question is whether it creates a logical flaw
in the database design or whether it is even important:
Client2 reads record 100 containing the immutable field "ZapHotel-ZippyTown".
Client1 reads the same record 100 containing the immutable field "ZapHotel-ZippyTown".
Client2 does a verified delete of record 100.
Client2 creates a new record with the immutable field "ZapHotel-ZippyTown" which
is, by chance, created at record location 100.
Client1 does a verified delete of record 100.
Technically, Client1 deleted a different record, which would have been known about
if I had that extra integer field for a unique value. The question is, does it matter?
Does it create any logical inconsistencies in the way the database is used?
I'm inclined to think at this time, that there is no difference, and thus there is
no logical inconsistency I have to worry about.
Regardless, I will document this as the only known logical danger of allowing
newly created records to over-write previously deleted records. The user will
have to decide whether this is important or not.
Of course, when I get to my business methods, I may decide that I cannot create
a logically consistent solution unless I set the policy on newly created records
one way or the other. But, it is also quite possible that my business methods,
which will impose contextual rules, will be happy working under either scenario.
In conclusion, I will make sure that the combination of the contextual rules of
my business methods along with the single-step operations of the database
make sense. But, when the examiner tests the database in stand-alone mode,
doing whatever pleases the examiner, my documentation of the single-step
database operations needs to explain the implications of the record creation
policy (as given in the above example).
In short, the database is a simple tool, containing single-step database operations--
and this is its prime strength as there exist no contextual rules (such as those found
in business methods, wherein you lock, carry out a series of operations, and then
unlock).
However, anyone using the database, including my business methods and the
test driver written by the examiner, needs to be aware of what can happen
when a policy is chosen which allows newly created records to over-write previously
deleted records.
Thanks,
Javini Javono
[ March 27, 2004: Message edited by: Javini Javono ]
Javini Javono
Ranch Hand

Joined: Dec 03, 2003
Posts: 286
Hi,
The update() method is also affected by the creation policy:
1. newly created records over-write previously deleted records, or
2. newly created records are always appended at the end of the file.
For policy (1), the update() method, as I will have to document it,
requires that every immutable field be sent in, so that a
verified update operation can take place (just as with the verified delete).
For policy (2), the update() method need not pass in any fields
but those being updated. Of course, an immutable field can
never be updated. This is called a blind update and is similar in
idea to the blind delete.
Thanks,
Javini Javono
 