aspose file tools*
The moose likes Developer Certification (SCJD/OCMJD) and the fly likes DuplicateKeyException revisit Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of Spring in Action this week in the Spring forum!
JavaRanch » Java Forums » Certification » Developer Certification (SCJD/OCMJD)
Bookmark "DuplicateKeyException revisit" Watch "DuplicateKeyException revisit" New topic
Author

DuplicateKeyException revisit

Andy Zhu
Ranch Hand

Joined: May 26, 2004
Posts: 145
I have B&S assignment. In data access interface, create throws dupkey ex; update does not. More important, in the given meta data specified in the data format, no combination of fields forms a convincing primary key so that dupkey can be checked against.

My current strategy is not to check at all. In addition, some other threads suggested:

1. PK: name/location
2. PK: some other combination with explanation.
3. checks on cached deleted rec.

To use other methods, I have these considerations:
1. How do you enforce your rule in update?
2. How do you handle when you init your db? (image sun's assignment db is just a sample db from a real one which may be the one for testing and may have "dup key" based on your definition.)

In my solution, I feel it right:
1. it solves above conflicts.
2. Furthermore, in real db, dup key, taking as the phrase it means, is defined by the user of db; that is, the user defines PK to the db, then the db can handle it correspondently. in the internal of db, where we are in scjd, there should be some structure/field, besides of the regular fields, for this purpose. However, we don't have such a structure in the metadata. (Actually, what we are handling is both the dbms and database.) From this point, in order to be able to handle a real-world dupkey, the current metadata is insufficient and only implementing the interface is insufficient, either. Therefore, IMHO, the interface is the minimal we need to implement so that we can "ignore" the dupkey ex.

Well, I just don't know in the reality of scjd: is there any one who adopted this strategy and still made sun happy about it? If you have passed your exam, may you dig some on your mem?

Thanks


--------<br />Andy Zhu<br />scjp 1.4<br />scjd 1.4<br />SAS Certified Programmer 9.0
Mark Spritzler
ranger
Sheriff

Joined: Feb 05, 2001
Posts: 17257
    
    6

I still assert that that interface is correct. That Create should throw a dup key exception, and update shouldn't. This is how real databases work, sort of. Usually you do not change the PK during an update, but an insert must have the PK set. If you set it to one that already exists in the database, then an exception should occur.

Mark


Perfect World Programming, LLC - Two Laptop Bag - Tube Organizer
How to Ask Questions the Smart Way FAQ
Andy Zhu
Ranch Hand

Joined: May 26, 2004
Posts: 145
Hey, Mark: thanks for your comments.

As I understand: in real world, it is dbms to handle the check of PK & uniqueness in terms of the meaning of dup key, while the metadata of a database to supply the information of which attribute(s) is PK/UQ. That is, the metadata should have information for PK so that dbms can examine. (Please correct me if I am wrong).

As in scjd, we are facing the mixture of the two. We have a datafile which does contain metadata but doesn't contain PK information. We are given an interface which mimics the minimal functionalities of a dbms. That means to me that we are facing a problem of tight coupling of dbms and db.

Yes, I do think a dbms should handle dupkey. What I meant is the current information supplied in the assignment may not be sufficient to do this. What I think of some alternative right now is to use a metadata of metadatas approach. Let's call it super metadata. It's only known to dbms and neutral to database.

The super acts like this:
1. provide use an option to specify which fields are PK.
2. store this information in the super.
3. dbms use super to check the db.
3a. db already has data: throws ex if these data violates PK so that client will know that this rule can't apply to the db
3b. create: self-evident
3c. update: like create; since interface doesn't throw dup ex, can wrap into sec ex.

assumption: we are handling for one specific database; that is, the dbms is specific.
advantage: separate the issue of dbms and db.
problem: how to tie the individual super with its targeted db.
Solution: super will be cached in the same dir with its targeted db. so it requires that dbms can read/write to that dir (reasonable assumption) with a naming convention like xxx.pk, or so.

but, I doubt this approach is over comlicated for this assignment. Other methods seem more trouble to me. The basic problem is that, if we impose some dup rules by our program that is not specified in the assignment, how we can ensure that sun's testing will follow that rule. My naive guess is that sun actually may not check the dup ex. So why do we implement it?

Thanks
Hanna Habashy
Ranch Hand

Joined: Aug 20, 2003
Posts: 532
I agree with Mark. I used DuplicateKeyException whenever it is appropriate. Basically I throw it when someone try to add a record with the same PK as an existing record, and when someone try to update the PK of a record that matchs an existing record.
One must check for the PK during updating and adding records.


SCJD 1.4<br />SCJP 1.4<br />-----------------------------------<br />"With regard to excellence, it is not enough to know, but we must try to have and use it.<br />" Aristotle
Andy Zhu
Ranch Hand

Joined: May 26, 2004
Posts: 145
Hey, Hanna:

Thanx. Now to your reply, you mentioned PK. Then the question is how you defined your PK. Think of you are using a database, do you dictate the dbms for PK, or the dbms dictates you? The problem is that we don't have a pk at our hand; ie. the metadata of the datafile doesn't have one.

It seems that you impose a PK definition. In reality, it will be fine if customer representative agrees with your definition. But, if they don't agree and they want to set their rule, what will your program do? I don't know for this ex, what sun tests about. database concept?
Hanna Habashy
Ranch Hand

Joined: Aug 20, 2003
Posts: 532
Andy,
You make good points, however to develop any application the developer and the customer must agree on some assumptions. To me, it makes sense to choose business name and location as a primary key, becasue there should not be any two businesses in the same location with the same name. I believe PK choice is left for us to choose and to decide on. It is part of completing the assignment successfully.
In my project, I create evreything dynmically and configurable at runtime. My application can operate on any database file that adhere to the given spesifications. It doesn't matter if the record has one field or ten fields. Even at the GUI side it creates fields according to the DB schema. Becasue of the dynamic designe, I couldn't choose a statice PK, so that I chosed the PK at runtime. I stated in my choice document that if the number of fileds in a record is equal to one field, then the first field is chosen to be the PK, and if the number of fields per record is greater than one, then the PK will be the combination of the first and the second field.
Becasue this is the specification of the application, any client can use it and operate on any database as long as the client understand how it works.

Finally, you must have a PK. You can choose any field or combination of fields to be your PK as long as you can justify it in your choice document.
Paul Bourdeaux
Ranch Hand

Joined: May 24, 2004
Posts: 783
Finally, you must have a PK. You can choose any field or combination of fields to be your PK as long as you can justify it in your choice document.
Yes!

There has been debate in the past regarding the use of a PK or not. Hanna and I are of the same opinion... you must have a PK. Each record in your database is unique in some way. Normally there would be some type of dialog between the developer and client to determine what keys exist in a table. We don't have that luxury with the SCJD, but Sun gives us an out by allowing us to make design decisions and document them.

I have B&S, and I plan to use the name + location as my PK. I actually really like Hanna's solution of dynamically determining the PK, but I also think this is outside the scope of the assignment.


“Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning.” - Rich Cook
Andy Zhu
Ranch Hand

Joined: May 26, 2004
Posts: 145
Hey, Hanna, Paul, and Mark:

Thanks for your valuable comments. I will implement a pk and document my argument. one thing left is I also need to check the integrity during the db initialization, except create/update, right? That means, I need to wrap the dup ex of init into some runtime ex.

Hanna, having a dynamic feature is nice when doing this project. But I think it may be over the requirement, as Paul indicated.
Anton Golovin
Ranch Hand

Joined: Jul 02, 2004
Posts: 476
Hi, Andy. Perhaps contrary to the direction of the thread I am of the opinion that no primary key can be found from among the data. I have URLyBird, so I speak from that assignment. The only viable PK is a numeric one, anyway, and that is the record number, which is not a part of the data. In my project, the DuplicateKeyException is thrown in the create method if and only if - and after determining a valid record number to write the new record to - it is determined just before writing that that number already has a valid record written into it. If the create method does its job well, this exception is never thrown, as it should not be. However, if the create method fails, then this exception is thrown.
[ September 23, 2004: Message edited by: Anton Golovin ]

Anton Golovin (anton.golovin@gmail.com) SCJP, SCJD, SCBCD, SCWCD, OCEJWSD, SCEA/OCMJEA [JEE certs from Sun/Oracle]
Mark Spritzler
ranger
Sheriff

Joined: Feb 05, 2001
Posts: 17257
    
    6

Anton, you are correct in that area, in that URLybird doesn't really have a PK, whereas B&S does.

Mark
Andy Zhu
Ranch Hand

Joined: May 26, 2004
Posts: 145
Related question to update:

say, if taking name/city as PK, what I should do to check in the update?

1. some said PK is immutable in this db-specific project, then I should verify the old pk with the new pk of that recNo only.

2. However, in general, pk is updatable as long as the new values of pk doesn't have duplicates in the db. so I should check it against all the valid (means not deleted) records.

In addition, we have this db-specific project, will we only allow update owner or other attributes too?

Thanks for any comments.
Anton Golovin
Ranch Hand

Joined: Jul 02, 2004
Posts: 476
Originally posted by Hanna Habashy:
Mark,

I don't know the specification requirments of URLybird. I have a question: Does the database allow for duplicate records? If yes, then you are right, and if no, then you must find a PK. If one cannot find a PK, then he can use a combination of all fields to be a PK.
IMHO, If the databse allow for duplicate records, then there is no needs for PK, and if the database doesn't allow for duplicate records then how can you catch it when someone tries to add the record more than once, or update a record to be exact as already stored record...?


Yep, the assignment is about hotel rooms, which can be as many of a kind as the rooms in a hotel... So, no PK
Hanna Habashy
Ranch Hand

Joined: Aug 20, 2003
Posts: 532
Mark,
Anton, you are correct in that area, in that URLybird doesn't really have a PK, whereas B&S does.

I don't know the specification requirments of URLybird. I have a question: Does the database allow for duplicate records? If yes, then you are right, and if no, then you must find a PK. If one cannot find a PK, then he can use a combination of all fields to be a PK.
IMHO, If the databse allow for duplicate records, then there is no needs for PK, and if the database doesn't allow for duplicate records then how can you catch it when someone tries to add the record more than once, or update a record to be exact as already stored record...?
Hanna Habashy
Ranch Hand

Joined: Aug 20, 2003
Posts: 532
Anton,
Yep, the assignment is about hotel rooms, which can be as many of a kind as the rooms in a hotel... So, no PK

IMO, each room in a hotel is unique. For example, there are no two rooms with the same number on the same floor with the same number of bids. The hotel workers must somehow identify each room...right. Then, there is must be a PK.
What do you think about walking to a hotel, reserve a room, and then the hotel clerk gives you a key and says "you can use any room, they are all the same".
mike acre
Ranch Hand

Joined: Sep 23, 2003
Posts: 197
Hanna,

For the final time. THERE IS NO PK IN UyB!

At the very least it can most definitely be argued that there isn't.

The fields are:

hotel name
location
room size
smoking or not
price
date available
customer ID

when a room is unbooked the customer id field is blank there may then be many records with identical fields.

There is no room number, floor etc.

So no PK.
PK is not a requirement.


SCJP 1.4, SCJD
Hanna Habashy
Ranch Hand

Joined: Aug 20, 2003
Posts: 532
Mike,
As I stated before, regardless of URLbird or any other software project, If the databse allow for duplicate records, then PK is not nesseccarly, and if the database doesn't allow for duplicate records then PK is a must.
One must to read the project specifications carefully and determine for himself.
Paul Bourdeaux
Ranch Hand

Joined: May 24, 2004
Posts: 783
mike,

Thank you for posting the db fields for URLy as it cleared up many of the questions I had about that assignment. I too was assuming that duplicate records were not allowed in the database. You are correct - with this database design there are no candidate keys.

However, the original post was asking about the PK in B&S, in which there are arguably several candidate keys that could qualify as a PK.

Andy,

In regards to your questions,
1. some said PK is immutable in this db-specific project, then I should verify the old pk with the new pk of that recNo only.

2. However, in general, pk is updatable as long as the new values of pk doesn't have duplicates in the db. so I should check it against all the valid (means not deleted) records.

In addition, we have this db-specific project, will we only allow update owner or other attributes too?
Normally I would argue that it is better design to be able to update any attribute. In this project, however, it could be argued either way. Whatever you decide, just be sure to document it in your design decisions document.
Mogens Nidding
Ranch Hand

Joined: Mar 08, 2004
Posts: 77
I don't hope checking for duplicate keys is necessary, because I don't plan on doing it.

Reading this topic, I can see that name/location will make an adequate primary key for the Bodgitt and Scarper assignment, so in hindsight, I probably should have designed for it.

However, the requirements do not state under what circumstances a DuplicateKeyException should be thrown, and they DO assume that any field, including primary key fields, of any record can be updated without causing a DuplicateKeyException to be thrown.

Finally you could argue that name/location is not guaranteed to be a candidate key because a contractor might have several divisions (that are seperately bookable) in the same city.

Anyone got an idea how Sun grades this? Are there people in this forum who passed Bodgitt and Scarper with a good score without checking for duplicates?
[ September 29, 2004: Message edited by: Nicky Bodentien ]
 
It is sorta covered in the JavaRanch Style Guide.
 
subject: DuplicateKeyException revisit