• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
Sheriffs:
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
Bartenders:
  • Mikalai Zaikin

Design Question

 
Ranch Hand
Posts: 15304
6
Mac OS X IntelliJ IDE Chrome
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I may have asked this before, but can't seem to find my post if I did. So I will just ask again.
Say I have a table called departments_table and then I have another table called employee_table.
Employees belong to departments. So employee_table will have a column called, you guessed it, department.
What I am wondering is, is it better to store the actual department name in the employee_table or should I store an ID from the department_table as a foreign key?
Or does it really matter?
Also, if I am using something like Hibernate, is this going to make a difference in terms of what Hibernate might prefer. (This part of the question might be better asked in the ORM forum, but...)
Thanks.
 
Ranch Hand
Posts: 100
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Gregg,
The general rule that I use is, if you know absolutely that there is only one of each department in the db, then go with the department name. This is to say that no one administering the db enters a department like

Accounting

and then one like

accounting

. These are obviously the same department, but casing is changed. If you are not sure, and there is the possibility of data redundancy, go with the departmentID PK/FK.

Best of Luck!
 
Ranch Hand
Posts: 331
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

What I am wondering is, is it better to store the actual department name in the employee_table or should I store an ID from the department_table as a foreign key?


I would definitely go with storing the ID for a couple main reasons:
1. Space savings - it would take more space in the table to store the dept name than just a simple integer (ID). For example, what if you have a dept name of "Information Technology". That would mean the "department" column in your employee table would have to be at least a char(22) or larger. Moreover, if you end up needing department info in other tables too, then you have to keep adding large char columns.
2. Chances are that your department table is going to have other columns other than just dept name so I'm sure at some point you will be running queries that join one or more tables to department_table. Usually, a join is faster on an integer type column rather than a char.
 
author & internet detective
Posts: 41860
908
Eclipse IDE VI Editor Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I would also go with the foreign key being a separate id. In addition to the benefits posted by others above, you can handle change better. If a department decides to change its name, it is much easier to have that piece of data in one place.
 
Gregg Bolinger
Ranch Hand
Posts: 15304
6
Mac OS X IntelliJ IDE Chrome
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Thanks for all the info! I really appreciate it. I think I will be going with the department ID in the employee_table for now.
Another question about this...
Does it matter in terms of the db schema and how obscure it can become to have foreign keys filling up a single table rather than actual data? How is the performance in terms of having to get data from 2 or more tables rather than getting all the data from a single table?
Thanks.
 
Jason Steele
Ranch Hand
Posts: 100
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Both of the above I agree with....and the majority of the time, your table will have additional columns. However, if it is a simple lookup table, there is no problem with using the department name. The thing is that, most of the time, your program is not the only one accessing the table. In this case, some other program may want to add additional fields to the table. In this case, you are required to use a departmentID as not to disturb other program access. Most businesses assign unique department ID's anyway, so most likely you are going to go with departmentID rather than department name.
 
Ranch Hand
Posts: 405
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
My $0.02.

You normalize a database in order to ensure data consistency and stability, to minimize data redundancy, and to ensure consistent updatability and maintainability of the data, and avoid update and delete anomalies that result in ambiguous data or inconsistent results.


Basically, it addresses various ways in which we may look for repeating data values in a table. There are several levels of the Normal Form, and each level requires that the previous level be satisfied.
Right now I believe there are at least 333 rules, but most database designers should try to at least adhere to the first 3 rules or 3N, inorder for a relational database to be considered "normal".
First Normal form:
Reduce tables to first normal form (1NF) by removing repeating or multivalued columns to another, child table.
1: Entries in columns are single-valued.
2: Entries in columns are of the same kind.
3: Each row is unique.
4: Sequence of columns is insignificant.
5: Sequence of rows is insignificant.
6: Each column has a unique name

Second Normal Form:
Reduce first normal form tables to second normal form (2NF) by removing columns that are not dependent on the whole primary key.
Make sure each column is kept with the table that it describes.

Third Normal Form
Reduce second normal form tables to third normal form (3NF) by removing columns that depend on other, nonkey columns (other than alternative keys).
This basically means that we shouldn't store any data that can either be derived from other columns or belong in another table.
I hope this helps.
Craig
 
Jeanne Boyarsky
author & internet detective
Posts: 41860
908
Eclipse IDE VI Editor Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Gregg,
Performance is comparable assuming you have an index on the primary key (which most databases give you by default.
 
Don't get me started about those stupid light bulbs.
reply
    Bookmark Topic Watch Topic
  • New Topic