This week's book giveaway is in the Mac OS forum. We're giving away four copies of a choice of "Take Control of Upgrading to Yosemite" or "Take Control of Automating Your Mac" and have Joe Kissell on-line! See this thread for details.
I may have asked this before, but can't seem to find my post if I did. So I will just ask again. Say I have a table called departments_table and then I have another table called employee_table. Employees belong to departments. So employee_table will have a column called, you guessed it, department. What I am wondering is, is it better to store the actual department name in the employee_table or should I store an ID from the department_table as a foreign key? Or does it really matter? Also, if I am using something like Hibernate, is this going to make a difference in terms of what Hibernate might prefer. (This part of the question might be better asked in the ORM forum, but...) Thanks.
Gregg, The general rule that I use is, if you know absolutely that there is only one of each department in the db, then go with the department name. This is to say that no one administering the db enters a department like
and then one like
. These are obviously the same department, but casing is changed. If you are not sure, and there is the possibility of data redundancy, go with the departmentID PK/FK.
What I am wondering is, is it better to store the actual department name in the employee_table or should I store an ID from the department_table as a foreign key?
I would definitely go with storing the ID for a couple main reasons: 1. Space savings - it would take more space in the table to store the dept name than just a simple integer (ID). For example, what if you have a dept name of "Information Technology". That would mean the "department" column in your employee table would have to be at least a char(22) or larger. Moreover, if you end up needing department info in other tables too, then you have to keep adding large char columns. 2. Chances are that your department table is going to have other columns other than just dept name so I'm sure at some point you will be running queries that join one or more tables to department_table. Usually, a join is faster on an integer type column rather than a char.
Blake Minghelli<br />SCWCD<br /> <br />"I'd put a quote here but I'm a non-conformist"
I would also go with the foreign key being a separate id. In addition to the benefits posted by others above, you can handle change better. If a department decides to change its name, it is much easier to have that piece of data in one place.
Thanks for all the info! I really appreciate it. I think I will be going with the department ID in the employee_table for now. Another question about this... Does it matter in terms of the db schema and how obscure it can become to have foreign keys filling up a single table rather than actual data? How is the performance in terms of having to get data from 2 or more tables rather than getting all the data from a single table? Thanks.
Joined: Apr 25, 2003
Both of the above I agree with....and the majority of the time, your table will have additional columns. However, if it is a simple lookup table, there is no problem with using the department name. The thing is that, most of the time, your program is not the only one accessing the table. In this case, some other program may want to add additional fields to the table. In this case, you are required to use a departmentID as not to disturb other program access. Most businesses assign unique department ID's anyway, so most likely you are going to go with departmentID rather than department name.
You normalize a database in order to ensure data consistency and stability, to minimize data redundancy, and to ensure consistent updatability and maintainability of the data, and avoid update and delete anomalies that result in ambiguous data or inconsistent results.
Basically, it addresses various ways in which we may look for repeating data values in a table. There are several levels of the Normal Form, and each level requires that the previous level be satisfied. Right now I believe there are at least 333 rules, but most database designers should try to at least adhere to the first 3 rules or 3N, inorder for a relational database to be considered "normal". First Normal form: Reduce tables to first normal form (1NF) by removing repeating or multivalued columns to another, child table. 1: Entries in columns are single-valued. 2: Entries in columns are of the same kind. 3: Each row is unique. 4: Sequence of columns is insignificant. 5: Sequence of rows is insignificant. 6: Each column has a unique name
Second Normal Form: Reduce first normal form tables to second normal form (2NF) by removing columns that are not dependent on the whole primary key. Make sure each column is kept with the table that it describes.
Third Normal Form Reduce second normal form tables to third normal form (3NF) by removing columns that depend on other, nonkey columns (other than alternative keys). This basically means that we shouldn't store any data that can either be derived from other columns or belong in another table. I hope this helps. Craig
author & internet detective