Bodgit and Scraper DuplicateKeyException

Hanna Habashy
Ranch Hand

Joined: Aug 20, 2003
Posts: 532
hi all:
I understand that we should choose a key for each record, so that no duplicate entries are added to the database. In my design, I don't use any static information at all. Everything is read dynamically; because of that, I don't store field names, and hence I cannot choose a combination of one or more named fields to be the key. My only choice is to use field index numbers: for example, the first and second fields form the key, without knowing what kind of information they hold.
Can anyone tell me about a better approach?
Thanks


SCJD 1.4
SCJP 1.4
-----------------------------------
"With regard to excellence, it is not enough to know, but we must try to have and use it." Aristotle
Martin Rea
Greenhorn

Joined: May 04, 2004
Posts: 12
Hi Hanna,
I am not sure I understand the issue but here is how I have tackled it:
The primary key is simply the record number - that is, the record's position in the DB file.
When a record is deleted, there is no longer an entry with that particular record number (say record number 3 is deleted; only 1, 2, 4... are left).
When creating a record, I just reclaim a deleted record's slot, or create one at the end of the DB if none has been deleted.
How can I tell whether 2 records are the same? I state in my documentation and code that I do not have the knowledge to judge, but it makes no sense to have 2 records that hold exactly the same values.
But it might make sense to have entries that are otherwise identical and differ only in price (for instance).
Therefore I compare all fields to all fields when creating and throw a RecordNotFoundException if a match is found - otherwise I create it.
I don't know if this was what you intended but this is my approach to tackle this issue.
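A minimal sketch of the all-fields comparison described above, assuming an in-memory list stands in for the DB file. The names `AllFieldsStore` and `DuplicateKeyException` are hypothetical (the post itself mentions RecordNotFoundException), and slot reuse is left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical exception type; the assignment supplies its own.
class DuplicateKeyException extends Exception {
    DuplicateKeyException(String message) { super(message); }
}

// In-memory stand-in for the DB file; the list index plays the record number.
class AllFieldsStore {
    private final List<String[]> records = new ArrayList<>();

    // Appends the record unless an existing record has exactly the same values.
    int create(String[] data) throws DuplicateKeyException {
        for (int recNo = 0; recNo < records.size(); recNo++) {
            // Arrays.equals compares element by element, i.e. all fields to all fields.
            if (Arrays.equals(records.get(recNo), data)) {
                throw new DuplicateKeyException("same values as record " + recNo);
            }
        }
        records.add(data.clone());
        return records.size() - 1;
    }
}
```

With this rule, two records that differ only in price are both accepted, and only an exact copy is rejected.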
Hanna Habashy
Ranch Hand

Joined: Aug 20, 2003
Posts: 532
hi Martin:
The issue is that, IMO, a record number is an invalid key. The record number will change if a record is added or deleted, hence it is invalid. Also, if someone tries to add a record where a subset of its field values matches a subset of an existing record, then there will be duplicate records. For example, if my database contains the record "Martin" "Developer" "NYC" "$40.00" and
someone tries to add a new record with values "Martin" "Developer" "NYC" "$30",
then it is a duplicate entry. A key is n fields that uniquely identify a record.
In my instructions, the create method, which adds a record to the database, throws DuplicateKeyException, hence I must uniquely identify each record.
I hope I made my point clear.
Greg Till
Greenhorn

Joined: May 10, 2004
Posts: 5
I've been thinking about record Ids and primary keys as well -
Martin,
You say you reuse deleted record slots when inserting a new record. What do you do in case of the following example:
User A gets all records
User B deletes record 1
User C inserts a new record (which overwrites the deleted record in position 1)
User A attempts to update record 1 (which to him still looks like the old record as he hasn't updated his local view)
You wouldn't get a deleted record exception, even though this is really what has happened. Do you have a nice solution to get round it?
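The four steps above can be replayed in a small sketch, assuming a hypothetical in-memory store (`SlotReuseDemo`) where a null entry marks a deleted slot. It only demonstrates the problem; it is not a fix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory store that reuses deleted slots (null = deleted).
class SlotReuseDemo {
    private final List<String[]> slots = new ArrayList<>();

    int create(String[] data) {
        int free = slots.indexOf(null);          // first deleted slot, or -1
        if (free >= 0) {
            slots.set(free, data);
            return free;
        }
        slots.add(data);
        return slots.size() - 1;
    }

    String[] read(int recNo) { return slots.get(recNo); }

    void delete(int recNo) { slots.set(recNo, null); }

    // Throws only if the slot is deleted at the moment of the call.
    void update(int recNo, String[] data) {
        if (slots.get(recNo) == null) {
            throw new IllegalStateException("record " + recNo + " is deleted");
        }
        slots.set(recNo, data);
    }

    // Replays the scenario against the first slot and returns what ends up there.
    static String[] replay() {
        SlotReuseDemo db = new SlotReuseDemo();
        int recNo = db.create(new String[] {"old name", "old city"});
        String[] viewOfA = db.read(recNo);                  // User A gets all records
        db.delete(recNo);                                   // User B deletes the record
        db.create(new String[] {"new name", "new city"});   // User C reuses the slot
        // User A updates from a stale view; no exception is raised.
        db.update(recNo, new String[] {viewOfA[0], "edited by A"});
        return db.read(recNo);
    }
}
```

User C's record is silently overwritten, because the record number alone cannot tell the old record from the new one.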
Cheers,
Greg
Ben Zung
Ranch Hand

Joined: Mar 25, 2004
Posts: 109
Using the record number as the key is not convincing enough to me. The assignment does not specify more detailed requirements, so I think it is pretty safe to choose your own keys. To me, the more natural keys are "name" and "location". It is also fine to add "specialties" as long as you state this clearly in the choice.txt file (I am doing it).
Hanna, you may want to create another object to deal with this. I created a DataControl object within the data access system. It holds not only the key columns but also some other business rules, such as what happens when a user enters text for the size or rate column when creating a new record, and client display hints, such as the column type, so that clients display text columns wider and number columns narrower. This satisfies the requirement that "future functionality enhancement" be supported with "minimal disruption to the users". A future enhancement, for example adding a new info column for contractors, would only need to be added to this object. The client code does not need to change at all, so there is no code distribution and no disruption in this case.
Good luck.
Bing
[ May 12, 2004: Message edited by: Bing Yuen ]
Richard Everhart
Ranch Hand

Joined: Nov 19, 2003
Posts: 54
I'm thinking that using the name and location is a good choice for the primary key. However, I'm using an integer, which is the place in the database file where the record appears. I just finished implementing updateRecord(), which throws the DuplicateKeyException, and reading this posting has made me think that perhaps the 'key' in this case really should be the name and location. I say this because my instructions also say that records are to be considered identical if two records share the same name and location. Darn! There goes my design so far.
Okay, well not quite. I'm not changing things after getting this far. I would like to comment on one other thing. Another poster mentioned that they actually remove a record when they delete a record from the database. I don't delete a record in this manner. My instructions state in createRecord() that I can reuse a deleted record. So, what I do is set a record to be deleted (deleted = '1', your instructions may vary). Then when I need to create a record I search for the first such deleted record and indicate that its space should be reused. Anyway, that's what I'm doing. I hope this makes sense.
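A minimal sketch of flag-based deletion and slot reuse, assuming a made-up fixed-width layout: one flag byte followed by 8 data bytes, with '1' meaning deleted. The layout, sizes, and the class name `FlaggedRecords` are assumptions for illustration; a byte array stands in for the data section of the file.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical layout: 1 flag byte, then RECORD_LEN data bytes per slot.
class FlaggedRecords {
    static final int RECORD_LEN = 8;          // assumed payload size
    static final int SLOT_LEN = 1 + RECORD_LEN;
    static final byte VALID = '0';            // assumed flag values
    static final byte DELETED = '1';

    private byte[] image = new byte[0];       // stands in for the file contents

    // Marks the slot as deleted; the data bytes stay where they are.
    void delete(int recNo) {
        image[recNo * SLOT_LEN] = DELETED;
    }

    // Reuses the first slot flagged as deleted, or grows the image by one slot.
    int create(String data) {
        int count = image.length / SLOT_LEN;
        int recNo = count;
        for (int i = 0; i < count; i++) {
            if (image[i * SLOT_LEN] == DELETED) {
                recNo = i;
                break;
            }
        }
        if (recNo == count) {
            image = Arrays.copyOf(image, image.length + SLOT_LEN);
        }
        // Space-pad (or truncate) the payload to the fixed record length.
        byte[] raw = data.getBytes(StandardCharsets.US_ASCII);
        byte[] payload = new byte[RECORD_LEN];
        Arrays.fill(payload, (byte) ' ');
        System.arraycopy(raw, 0, payload, 0, Math.min(raw.length, RECORD_LEN));
        image[recNo * SLOT_LEN] = VALID;
        System.arraycopy(payload, 0, image, recNo * SLOT_LEN + 1, RECORD_LEN);
        return recNo;
    }

    boolean isDeleted(int recNo) {
        return image[recNo * SLOT_LEN] == DELETED;
    }

    String read(int recNo) {
        return new String(image, recNo * SLOT_LEN + 1, RECORD_LEN,
                StandardCharsets.US_ASCII).trim();
    }
}
```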
Rich
Hanna Habashy
Ranch Hand

Joined: Aug 20, 2003
Posts: 532
hi guys:
I agree with you all that name and location should be used as the primary key. However, that works only if you hard code the position and/or the names of those fields. If the database changes, or new fields are added, the design will fail.
In my design, I don't hard code anything, not even the number of fields. My schema object reads the database schema at runtime, then stores the information in instance variables.
I state in my instructions that if the number of fields is == 1, then the first field is used as the primary key, and if the number of fields is >= 2, the first and second fields are used as the primary key.
I have to add a case for when the number of fields is 0, just to be able to handle all possible values.
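The rule above could be sketched roughly like this. The class and method names are hypothetical, and treating a zero-field schema as "no key, so nothing collides" is an assumption, since the post does not say how that case is handled.

```java
// Picks key field indexes from the field count alone, as described in the post.
class RuntimeKeyChooser {
    static int[] keyIndexes(int fieldCount) {
        if (fieldCount <= 0) return new int[0];       // no fields, no key
        if (fieldCount == 1) return new int[] {0};    // the only field is the key
        return new int[] {0, 1};                      // first and second fields
    }

    // Two records have the same key when every key field matches.
    static boolean sameKey(String[] a, String[] b, int[] keyIndexes) {
        if (keyIndexes.length == 0) return false;     // assumption: no key, no collision
        for (int i : keyIndexes) {
            if (!a[i].equals(b[i])) return false;
        }
        return true;
    }
}
```

With a four-field schema, "Martin" "Developer" "NYC" "$40.00" and "Martin" "Developer" "NYC" "$30" share the key (fields 0 and 1), so the second would count as a duplicate.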
Richard Everhart
Ranch Hand

Joined: Nov 19, 2003
Posts: 54
Hanna,

You make some good points. I'm using an integer as the primary key, as I mentioned before, but if I were in your situation I'd create some kind of PrimaryKey interface or abstract class and then create a concrete implementation specifically for B&S. This means that if the number of fields changes then the PrimaryKey implementation would have to change also. But, in this way you'd be isolating the changes to just one part of your code. Of course, make sure you override equals() and hashCode() if you do this. I hope this helps.
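Something along these lines, as a rough sketch. `PrimaryKey` follows the name used above; `NameLocationKey` and the assumption that name and location are fields 0 and 1 are hypothetical.

```java
import java.util.Objects;

// Marker type; the rest of the data layer would depend only on this.
interface PrimaryKey { }

// Concrete key for B&S, assuming name and location identify a record.
final class NameLocationKey implements PrimaryKey {
    private final String name;
    private final String location;

    NameLocationKey(String name, String location) {
        this.name = Objects.requireNonNull(name);
        this.location = Objects.requireNonNull(location);
    }

    // Assumes name is field 0 and location is field 1 of the record.
    static NameLocationKey of(String[] record) {
        return new NameLocationKey(record[0].trim(), record[1].trim());
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof NameLocationKey)) return false;
        NameLocationKey that = (NameLocationKey) other;
        return name.equals(that.name) && location.equals(that.location);
    }

    // Must agree with equals() so keys behave correctly in hash-based collections.
    @Override
    public int hashCode() {
        return Objects.hash(name, location);
    }
}
```

With equals() and hashCode() in place, a `HashSet<PrimaryKey>` of existing keys is enough to detect a duplicate on create.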


Rich
Vishwa Kumba
Ranch Hand

Joined: Aug 27, 2003
Posts: 1064
Originally posted by Hanna Habashy:
In my design, I don't hard code anything, not even the number of fields. My schema object reads the database schema at runtime, then stores the information in instance variables. I state in my instructions that if the number of fields is == 1, then the first field is used as the primary key, and if the number of fields is >= 2, the first and second fields are used as the primary key. I have to add a case for when the number of fields is 0, just to be able to handle all possible values.


Umm... a different way... but would you do something like that in a real-life application? With your approach, the selected fields do not always form a primary key. My view is that a primary key integrity constraint is chosen only after a substantive functional analysis of the table's columns.
Martin Rea
Greenhorn

Joined: May 04, 2004
Posts: 12
Hi all,
Thanks - there goes my design.....
Anyway I thank you - it is a very interesting discussion and I can just wait some more time before submitting.

I thought I was done with my implementation, but I will lay out my reasoning here, and hopefully someone will comment on whether it is usable or whether I have to change my design (and implementation):

1. My current application only supports search and book
2. Nothing in my assignment states that records with the same name/location are identical - therefore I assume that a record is a duplicate only if all fields are equal
3. According to my assignment, I may assume that only 1 program accesses the DB at a time (nobody uses the data access class at the same time as my clients - no standalone mode usage when running networked etc.)
4. Therefore a delete can be safely done - no clients are using the application at that time - so the situation Greg refers to can never happen
5. I am aware that in the future, when the application must support delete / update in general, another approach might have to be implemented
6. I use notifications to tell clients when a record is booked from another client
7. 2 clients cannot book the same record - I check whether the record is already booked after taking the lock for the record. If it is, the user is notified.
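Point 7 can be sketched as follows, assuming a hypothetical `BookableRecord` class where an empty owner means "not booked". Java's built-in `synchronized` stands in here for the assignment's own lock/unlock methods, which are not shown in this thread.

```java
// Hypothetical record wrapper; an empty owner means the record is not booked.
class BookableRecord {
    private String owner = "";

    // Returns true if this call booked the record, false if it was already booked.
    synchronized boolean book(String customerId) {
        // The check happens only while holding the lock, so two clients
        // cannot both see the record as free.
        if (!owner.isEmpty()) {
            return false;     // caller should notify the user
        }
        owner = customerId;
        return true;
    }

    synchronized String owner() {
        return owner;
    }
}
```

The important part is the order: take the lock first, then check the booked state, then write.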

How does this reasoning sound to you?
Ben Zung
Ranch Hand

Joined: Mar 25, 2004
Posts: 109
In my design, I don't hard code anything, not even the number of fields. My schema object reads the database schema at runtime, then stores the information in instance variables. I state in my instructions that if the number of fields is == 1, then the first field is used as the primary key, and if the number of fields is >= 2, the first and second fields are used as the primary key.


Sorry, Hanna. To me this still sounds like hard coding. What if the "owner" column later gets moved to be the first field? Your primary key would then be wrong.

As I mentioned in my earlier message, central control of this kind of thing within the data access system should be a better option. If you use RMI, it only means adding one or two more methods to get the information from the control object.

Besides, I don't think "hard-coding" is totally a bad thing in all cases.
In the central control class, it should be fine to code "name" and "location" as the primary key columns. For a future enhancement, the only place you need to accommodate the changes would be this one object. To me this sounds not bad.
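A minimal sketch of that idea, assuming the schema's column names are available as a list at runtime. Only the class name `DataControl` comes from my earlier message; the `keyIndexes` method and its signature are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Central control object: the only place that names the key columns.
class DataControl {
    private static final List<String> KEY_COLUMNS = Arrays.asList("name", "location");

    // Resolves the key column names to indexes using the schema read at runtime,
    // so moving a column does not change which fields form the key.
    static int[] keyIndexes(List<String> schemaColumns) {
        int[] indexes = new int[KEY_COLUMNS.size()];
        for (int i = 0; i < indexes.length; i++) {
            int pos = schemaColumns.indexOf(KEY_COLUMNS.get(i));
            if (pos < 0) {
                throw new IllegalArgumentException("missing key column: " + KEY_COLUMNS.get(i));
            }
            indexes[i] = pos;
        }
        return indexes;
    }
}
```

If "owner" were moved to the front of the schema, the key would still resolve to the name and location columns, just at different indexes.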


Just my own thought.

bing
[ May 13, 2004: Message edited by: Bing Yuen ]
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11481
    

Hi everyone,

It is possible that, for the data you have, there is no logical primary key. In that case you might decide to declare that you throw the DuplicateException but not actually throw it in your code. Just leave the declaration in place for a later enhancement.
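As a rough sketch of what that could look like: the interface and class names below (`RecordStore`, `SimpleRecordStore`) and the exception class are hypothetical stand-ins for whatever your assignment supplies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the exception named in the assignment.
class DuplicateKeyException extends Exception {
    DuplicateKeyException(String message) { super(message); }
}

// Hypothetical stand-in for the supplied interface.
interface RecordStore {
    int create(String[] data) throws DuplicateKeyException;
}

class SimpleRecordStore implements RecordStore {
    private final List<String[]> records = new ArrayList<>();

    // The throws clause is kept to match the interface and to leave room for a
    // later enhancement; this version has no logical key and never throws it.
    @Override
    public int create(String[] data) throws DuplicateKeyException {
        records.add(data.clone());
        return records.size() - 1;
    }
}
```

Callers still have to handle the exception, so adding a real key check later does not change any calling code.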

As long as you describe what you did and why you did it in your design decisions document, you should be safe with that.

Regards, Andrew


The Sun Certified Java Developer Exam with J2SE 5: paper version from Amazon, PDF from Apress, Online reference: Books 24x7 Personal blog
Hanna Habashy
Ranch Hand

Joined: Aug 20, 2003
Posts: 532
hi guys:
I see that everyone has a good point. If it were a real project, it would be easier to construct a primary key. However, the designer of the database doesn't tell you the structure of its records. This information is put in the schema section. Hence, the number of fields in a record can be n, where n is a positive integer. If I choose field k to be the primary key, what will happen if someone passes in a database that contains fewer than k fields?
The application won't even function. Choosing a primary key at runtime allows the same application to function with different database files that have different record schemas. Of course, one has to state the logic behind the choice of the primary key, so that the database builder knows where to put the distinctive fields.
I agree, in real life one won't do that. But in real life, a developer will get the architecture of the database, which should not change dramatically.
 