File APIs for Java Developers
Manipulate DOC, XLS, PPT, PDF and many others from your application.
http://aspose.com/file-tools
The moose likes JDBC and the fly likes What are the pros and cons of Referential Integrity Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Databases » JDBC
Bookmark "What are the pros and cons of Referential Integrity" Watch "What are the pros and cons of Referential Integrity" New topic
Author

What are the pros and cons of Referential Integrity

Dale DeMott
Ranch Hand

Joined: Nov 02, 2000
Posts: 515
Now I've been coding for years... and while I do like RI at times, I find that sometimes it can really get in the way. A friend of mine was coding at the department of justice, and he had to convert a very large database over to a new system. The first thing he did was rip out the RI. After that he ended up putting the restrictions in the code. Why did he do it that way. I believe it was because of the dirty data. Also it was much easier to work w/rather than going back to change the data every time he had to change something from above. I'd like to hear your comments about why it is or isn't a good idea to use referential integrity in a database.


By failing to prepare, you are preparing to fail.<br />Benjamin Franklin (1706 - 1790)
Jeanne Boyarsky
internet detective
Marshal

Joined: May 26, 2003
Posts: 30537
    
150

Dale,
Personally, I like the database enforcing RI. It prevents things from getting messed up by someone who accesses the database directly rather than going through the program. It also helps when you have many different programs going against the same set of tables. If even one of them forgets to code the rules in, the data is out of synch. And if the rules changes, you have to update in multiple places.

Some cons are that it is harder to "componentize" if you are relying on the database to do something rather than your code.


[Blog] [JavaRanch FAQ] [How To Ask Questions The Smart Way] [Book Promos]
Blogging on Certs: SCEA Part 1, Part 2 & 3, Core Spring 3, OCAJP, OCPJP beta, TOGAF part 1 and part 2
Kevin Mangold
Greenhorn

Joined: Jan 13, 2005
Posts: 18
I never add referential integrity unless I have to. And the "have to" depends on if other developers will be updating the database also. I find it easier to do my own checks within my code than to let the database management system do that for me, because there are some times where I want to perform some action that would make the DBMS complain.

It is good design though to add checks, triggers, and all that jazz. Because, sadly, not everyone is perfect and it might get messed up somewhere down the line.

I've been asked this question before and the first thing I do, before I give an answer, is ask questions about the situation.
Paul Sturrock
Bartender

Joined: Apr 14, 2004
Posts: 10336


It is good design though to add ... triggers

I'd suggest its bad design. My experience is triggers tend to get added as workarounds. Triggers often equal application logic, and that has no place in the data model (wit ha small amount of special cases). For one thing you are obscuring what the data model represents. For another you are increasing the maintenance required in the database and (possibly - depending on how you write them) making transfering a schema from one database type to another a much bigger deal. You make debugging your application harder. Spreading the applcation logic round more tiers in the application can also make it much harder to make infrastructure changes in response to performance problems.

Jeanne's point about the rules being in one place being a good thing is my experience too. If you have (like we do) a plethora of applications written over the last eleven years is a huge variety of technologies all connecting to the same schema you really need the data model well defined and properly enforced using referential integrity. To my mind, its a very special case that you might put data in a relational database that does not use the basic building blocks of relational data modelling. If you don't want your data to behave in a relational way, why use an RDBMS at all?


JavaRanch FAQ HowToAskQuestionsOnJavaRanch
steve souza
Ranch Hand

Joined: Jun 26, 2002
Posts: 861
<< If you have (like we do) a plethora of applications written over the last eleven years is a huge variety of technologies all connecting to the same schema you really need the data model well defined and properly enforced using referential integrity. To my mind, its a very special case that you might put data in a relational database that does not use the basic building blocks of relational data modelling.>>
Here, here. I completely agree.

I've seen many databases that don't have referential integrity enforced by the db, but by the application code itself. One application had 100,000 "psuedo duplicate" records. I say "psuedo duplicate" as some values in the columns were different, but not all. This made it impossible to identify what was the real data. Not what you want to have to explain to your users.

You can't guarantee that down the road someone doesn't want to get at your code via a batch process, or ruby, or python, or another java application. Also by coding RI in the client you (and other developers) must be diligent when updating a table from multiple spots in your code. If you put the RI in the database the rules are always enforced.

In my opinion DB enforced RI is a 'mandatory' best practice if you don't want bad data to creep into your database.
[ April 20, 2007: Message edited by: steve souza ]

http://www.jamonapi.com/ - a fast, free open source performance tuning api.
JavaRanch Performance FAQ
Jeanne Boyarsky
internet detective
Marshal

Joined: May 26, 2003
Posts: 30537
    
150

Originally posted by Kevin Mangold:
I never add referential integrity unless I have to. ... It is good design though to add checks, triggers, and all that jazz. Because, sadly, not everyone is perfect and it might get messed up somewhere down the line.

What would these triggers do that the database can't enforce through referential integrity?
Dale DeMott
Ranch Hand

Joined: Nov 02, 2000
Posts: 515
I can see your points in added RI into the database rather than putting your rules scattered throughout the multiple programs that access it. It seems to be a clean way to code and to keep all of the basic data rules in one place. I do also see another issue that comes up. I see where people start coding business logic into the database rather than putting it into the business logic layer. {This is straying a bit from RI.} Continuing the discussion, when would it be a good idea to use a trigger and when is it a good idea to pull the logic out and put it into a business logic layer? I DO use triggers to do certain things like update a MODIFIED_TIMESTAMP field or things of a similar nature. Other than that, I typically stay away from triggers and stored procedures.
steve souza
Ranch Hand

Joined: Jun 26, 2002
Posts: 861
Typically I have used triggers mostly for things like RI, or updating a timestamp etc. Pretty generic things, and avoided putting too much related to business rules in them. Mainly this is due to the fact that most people won't think to look in them to see what business rules are being implemented. I have less of a concern with stored procs as their execution is explicit.

Stored procs have pros and cons. They are easy to test, and can be called from any client language easily (powerbuilder, vb, batch processes, java, ruby etc). So they are nicely independent in that way. However, the language tools are typically only sql which isn't nearly as rich as java. A downside of them is that they are RDBMS dependent, so if you use them you should abstract them in java code. They are particularly good when temp tables or other complicated processing has to be done before returning a particular resultset. Returning these intermediate results to the client tier and then manipulating them can be difficult. Some people seem to have gone to the point where stored procs are evil. I think they are great when used and abstracted properly.
Jeanne Boyarsky
internet detective
Marshal

Joined: May 26, 2003
Posts: 30537
    
150

I understand what the trigger discussion was about now. RI and timestamps make sense as they are database level concepts.

Whether to use stored procs depend heavily on your situation. For example, how many clients are there, how many queries are needed for a logical operation, would temp tables be of use, are you likely to change databases, etc.
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: What are the pros and cons of Referential Integrity