This week's book giveaway is in the Jobs Discussion forum.
We're giving away four copies of Soft Skills and have John Sonmez on-line!
See this thread for details.
The moose likes JDBC and the fly likes Nulls in columns or not?  Frequent DB design standards issue. Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of Soft Skills this week in the Jobs Discussion forum!
JavaRanch » Java Forums » Databases » JDBC
Bookmark "Nulls in columns or not?  Frequent DB design standards issue." Watch "Nulls in columns or not?  Frequent DB design standards issue." New topic
Author

Nulls in columns or not? Frequent DB design standards issue.

Charles McGuire
Ranch Hand

Joined: Jan 18, 2005
Posts: 99
Welcome, Lynn!

Does the book (or you) have any position on allowing null column values when designing database tables? There always seems to be a difference of opinion whether they should be allowed in the database or whether the columns should be set to a default value.

At my shop, our DB guideline is to allow nulls only on columns defines as date data types since a default date of all zeros is not allowed in ANSI SQL. All other non-date data type columns require a default. It sure makes queries a lot easier to code when you don't have to worry about nulls, but once in a great while I run into a situation where the default value of 0 in a numeric field is a problem since it could also represent valid data.

What is your opinion on this issue?

Thank you kindly
[ October 16, 2007: Message edited by: Charles McGuire ]

There's no place like 127.0.0.1
Lynn Beighley
author
Ranch Hand

Joined: Sep 12, 2007
Posts: 82
I would say I'm strongly anti-null, and the book does bias in that direction.


Lynn Beighley<br />Author, Head First SQL
Charles McGuire
Ranch Hand

Joined: Jan 18, 2005
Posts: 99
Oh cool. A fellow anti-nullian!

If you also have a bias against ORM bloat... it's true love.

;)
Bear Bibeault
Author and ninkuma
Marshal

Joined: Jan 10, 2002
Posts: 61752
    
  67

Originally posted by Lynn Beighley:
I would say I'm strongly anti-null
Lynn, can you explain why? I've often heard this viewpoint and have no reason to doubt its wisdom, but it'd be nice to know the logic behind the wisdom.


[Asking smart questions] [Bear's FrontMan] [About Bear] [Books by Bear]
Corey McGlone
Ranch Hand

Joined: Dec 20, 2001
Posts: 3271
Originally posted by Bear Bibeault:
Lynn, can you explain why? I've often heard this viewpoint and have no reason to doubt its wisdom, but it'd be nice to know the logic behind the wisdom.


And also, what do you do when there's a difference between a default value and data simply not being provided? I believe an example was given that you could use 0 as a default value for a numeric field, but what happens if that default value could be a valid bit of data. Then, do you end up with code constantly checking for various "default values" to see if a field has been supplied, or not? With null, you at least know that there's nothing there - you don't have to worry about if what's there is the default or something else.

Perhaps I haven't phrased that well, but I would like to know more about your reasoning on being generally "anti-null". I'm no database expert so I probably couldn't argue either way, but I'd like to hear your stance on it.


SCJP Tipline, etc.
Lynn Beighley
author
Ranch Hand

Joined: Sep 12, 2007
Posts: 82
Sure, I'll expand on that. I will agree there are instances when using NULL is justifiable, but in general I avoid them. As you suggest, if 0 is a viable value for a column you can't very well use it for a default. That being said:

My biggest problem with using NULL is that it means "I have no value." You can't compare one NULL to anonther, making it impossible to compare two NULL values in a column.

I think allowing NULLs also leads to lazy data entry. I'm all for rigidly enforcing values when there is only a finite set that can fit into a field. It's not necessarily that *you* as the database designer would enter bad data, but once your little table goes out into the wild, you need to put as many constraints on it as reasonably possible. People can and will abuse your database if you let them.
Charles McGuire
Ranch Hand

Joined: Jan 18, 2005
Posts: 99
I'd also like to hear Lynn's reasoning, but from my perspective.

Null values totally screw up queries. In our DBMS you can't test if something is equal to null.

That is not a valid statement. Something can't be equal to null. I can write it the following way:

That will work, but now I have a completely different operator. Some query tools also puke on nulls. What happens when I map those elements to a program? I have a different test, I can't assign one field to another if the field is null. It will blow up. That means more compound conditional statements.

Life is much easier without nulls, IMO.
Charles McGuire
Ranch Hand

Joined: Jan 18, 2005
Posts: 99
Oops... Lynn replied while I was composing. Didn't mean to step on you.
Corey McGlone
Ranch Hand

Joined: Dec 20, 2001
Posts: 3271
Originally posted by Lynn Beighley:
I think allowing NULLs also leads to lazy data entry. I'm all for rigidly enforcing values when there is only a finite set that can fit into a field. It's not necessarily that *you* as the database designer would enter bad data, but once your little table goes out into the wild, you need to put as many constraints on it as reasonably possible. People can and will abuse your database if you let them.


Okay, so let me toss out an example and tell me what you think of it and how you might handle this.

I'm working on an application in which physicians will be entering reviews of various hospitalizations. Quite honestly, doing the entire review process can take an hour or more (not because the data entry is particularly overwhelming, but because they have to look into multiple other systems to find the information they're after).

Now, near the bottom of that page is a drop down field which is populated with options that come from a database table. It's a required field, but, due to the length of time in which it may take to complete a review, the physicians can't always be expected to complete a review from start to finish without being interrupted and having to leave. To that end, the physicians can "Save" their work at any time to the database so that they can come back to it later.

Now, let's say a physician fills out half the information and leave the drop-down list in question blank. That person has to leave so they do a quick save and away they go. What would you populate into the database field for that drop down selection? In my application, it remains NULL until the user actually selects something.

I agree that NULL values can make life harder (and I have written code just to handle nulls, which I don't care for), but I don't see a cleaner option in this case. Is there a cleaner option that I'm simply not seeing?

Thanks.
Bear Bibeault
Author and ninkuma
Marshal

Joined: Jan 10, 2002
Posts: 61752
    
  67

In this case, I don't see the need for a NULL. There are a handful of valid values that the dropdown can take, choosing another one to serve as "unanswered" (even if it's a blank string) doesn't seem like an issue.

Where I think issues come up more is with numeric fields where setting aside an otherwise valid "magic number" to mean "empty" uses up that value. Similiarly with date values.
[ October 16, 2007: Message edited by: Bear Bibeault ]
Charles McGuire
Ranch Hand

Joined: Jan 18, 2005
Posts: 99
With date values, at least those that are represented by SQL date data types, you really have no choice. You have to use NULL, unless you want to use 0001/01/01 which really is repugnant.

For char/varchar/string types, the no-null mantra is really much easier. Empty set is preferable, or in the doctor case above, something like "unanswered".

Numerics are a bit tougher. One application had a numeric field for temperature setting in a refer container. Since they were shipping dry goods, the refer unit was not to be turned on. The default for a numeric data type is zero, and that's what they thought the temperature for the refer to be because they were depending on that field for the temperature and an indicator for refer setting. By using the one field, they accidentally froze dry goods and ruined the whole load.

The fix for that was to have a second column to indicate the requirement (Dry/Refer). The applications looked to that flag for guidance, not the temperature. It sounds like a hassle, but it was no more work than allowing nulls 'cuz it had to be tested either way.

In other words, in the rare cases where data is missing or that zero cannot represent missing, I've found it better to indicate that condition with something more explicit than NULL.
Bear Bibeault
Author and ninkuma
Marshal

Joined: Jan 10, 2002
Posts: 61752
    
  67

Originally posted by Charles McGuire:
I've found it better to indicate that condition with something more explicit than NULL.

Very good point. I think all too often coders rely on implicit information (this applies to values outside of DBs as well) rather than more explicit indicators. (My most horrifying example: "I want to make a conditional decision based on the name of the JSP".... ye gods!)

Of course, if you're not careful you could trip up normalization...

If only there were a book that covered this sort of thing. :p Lynn?
[ October 16, 2007: Message edited by: Bear Bibeault ]
Jeanne Boyarsky
author & internet detective
Marshal

Joined: May 26, 2003
Posts: 31050
    
162

Corey,
That's an interesting scenario. My first instinct is to allow the nulls and have a status field to show whether the row is completed. Then I started thinking about whether I really have an "in progress" table and a "completed" table where the "in progress" allows nulls and the "completed" doesn't. Does this sound more complicated than necessary.


[Blog] [JavaRanch FAQ] [How To Ask Questions The Smart Way] [Book Promos]
Blogging on Certs: SCEA Part 1, Part 2 & 3, Core Spring 3, OCAJP, OCPJP beta, TOGAF part 1 and part 2
Charles McGuire
Ranch Hand

Joined: Jan 18, 2005
Posts: 99
... or Corey can take the author's guidance and avoid nulls in his design.

Q1: Assume one row has two columns (COL03 and COL05) that contain null values. What is returned in the result set with the select statement

Q2: Assume two columns of an integer data type, COL03 (nulls) and COL04 (no-nulls). When COL03 is NULL, What is returned in MYRESULT by the statement

In DB2, the answers are:

Q1: Nothing. Nulls can't be compared, so a null != null.
Q2: MYRESULT is null.

I don't know if these results are SQL standard or not, but it partly why I avoid nulls. It complicates things.
[ October 17, 2007: Message edited by: Charles McGuire ]
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: Nulls in columns or not? Frequent DB design standards issue.