File APIs for Java Developers
Manipulate DOC, XLS, PPT, PDF and many others from your application.
http://aspose.com/file-tools
The moose likes JDBC and the fly likes What is your general practice to generate employee IDs? Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Databases » JDBC
Bookmark "What is your general practice to generate employee IDs?" Watch "What is your general practice to generate employee IDs?" New topic
Author

What is your general practice to generate employee IDs?

Jacky Luk
Ranch Hand

Joined: Aug 02, 2012
Posts: 598
1) Create a stored procedure and getting the maximum number out of it, and set it back into the database

Having the employeeID field auto-increment, this throws an SQL exception...


2) Let it be auto-increment.
Have to worry about this, this may turn out to be a complex mix of alphabets and numbers

Which way do you prefer?

Thanks
Jack
Martin Vajsar
Sheriff

Joined: Aug 22, 2010
Posts: 3610
    
  60

You should definitely use autoincrement (or a sequence in databases that use them instead of autoincrement). It is scalable, robust, and most portable way of doing it.

On the other hand, obtaining current maximum is non-scalable (requires some sort of locking), less effective (requires one more roundtrip to the database), fragile and not readily portable (in some databases you need to use correct isolation level, in other databases, you even need to explicitly lock to avoid duplicates).
Jacky Luk
Ranch Hand

Joined: Aug 02, 2012
Posts: 598
Martin Vajsar wrote:You should definitely use autoincrement (or a sequence in databases that use them instead of autoincrement). It is scalable, robust, and most portable way of doing it.

On the other hand, obtaining current maximum is non-scalable (requires some sort of locking), less effective (requires one more roundtrip to the database), fragile and not readily portable (in some databases you need to use correct isolation level, in other databases, you even need to explicitly lock to avoid duplicates).


If the more effective way is using auto-increment, how do I customize that ID such as one that will combine with
Serial No, Creation Date, dept etc....
0001-20130821-A1
Thanks
Jack
James Boswell
Bartender

Joined: Nov 09, 2011
Posts: 1021
    
    5

what database are you using? MySQL?
Martin Vajsar
Sheriff

Joined: Aug 22, 2010
Posts: 3610
    
  60

You don't.

The sole purpose of a primary key is to uniquely identify a row. In some rare cases, you can use a natural primary key (that is, you'll use a key which has a specific meaning as a primary key), but only if you're really sure it is immutable. More often than not, the users will eventually find out they need to modify the primary key (just in this exceptional case, of course ), and the can of worms of mutable primary keys is open - good luck. Using record IDs as primary keys is one of the usual ways to mutable primary keys.

If you need to generate some sort of a record ID, you can do it "on the fly" by storing individual parts of the key in the table (or even in other related tables) and just concatenating them to form the desired record ID.

And one more note: sequences and autoincrement IDs will have gaps (skipped numbers). There isn't a practical way to generate gapless sequences. Every at least slightly scalable gapless generation scheme can be thwarted by a rollback. Every possible scheme can be defeated by deletions.
Jacky Luk
Ranch Hand

Joined: Aug 02, 2012
Posts: 598
Martin Vajsar wrote:You don't.

The sole purpose of a primary key is to uniquely identify a row. In some rare cases, you can use a natural primary key (that is, you'll use a key which has a specific meaning as a primary key), but only if you're really sure it is immutable. More often than not, the users will eventually find out they need to modify the primary key (just in this exceptional case, of course ), and the can of worms of mutable primary keys is open - good luck. Using record IDs as primary keys is one of the usual ways to mutable primary keys.

If you need to generate some sort of a record ID, you can do it "on the fly" by storing individual parts of the key in the table (or even in other related tables) and just concatenating them to form the desired record ID.

And one more note: sequences and autoincrement IDs will have gaps (skipped numbers). There isn't a practical way to generate gapless sequences. Every at least slightly scalable gapless generation scheme can be thwarted by a rollback. Every possible scheme can be defeated by deletions.


Should I "store" this transient transaction ids? for easy lookups?
Say the employee really wants to look up an id named 0001-20130821-A10, so should I just break down the request and gather the information as a final piece?
Thanks
Jack
Jacky Luk
Ranch Hand

Joined: Aug 02, 2012
Posts: 598
James Boswell wrote:what database are you using? MySQL?


MariaDB
Martin Vajsar
Sheriff

Joined: Aug 22, 2010
Posts: 3610
    
  60

Jacky Luk wrote:Should I "store" this transient transaction ids? for easy lookups?
Say the employee really wants to look up an id named 0001-20130821-A10, so should I just break down the request and gather the information as a final piece?

Some databases can create index on expressions (in Oracle, for example, it is called funcion-based index). That would allow you to implement quick lookups based on such an index. The index would actually contain the IDs, but the database takes care of updating it whenever necessary, which is great.

Some databases support virtual columns, and perhaps can create indexes over them. This is just a slightly bit different way of implementing the previous point. I don't know whether MariaDB supports any of these, though.

All databases support views. You can create a view and construct the ID in it. Using such view may not be efficient for looking up individual IDs though (due to nonexistent indexes over that column). If this is the case, and you need to implement efficient lookup, then you have at least two possibilities:

1) Index individual columns that are part of the ID, and when a concrete ID is requested, break it up into individual columns and construct appropriate WHERE clause. This will probably perform pretty well and doesn't require any maintenance when individual columns that form the ID change. It would be my preferred solution, probably. Implementing general range lookups (from-to) on the IDs might be slightly more complicated in this scheme, though.

2) Store and index the ID in the table. This is actually the imitation of the function-based index or virtual column, and performs equally well. The disadvantage is that you need to update the ID if the underlying columns change, which is easy to forget about and makes your code more messy. Still, I wouldn't make the ID a primary key. You never know when someone will want to have the ID updated, and if you make it a primary key, you'll have to cascade update everything. Though MariaDB probably supports it out of the box, you can imagine that this is not an efficient or scalable operation.
Jacky Luk
Ranch Hand

Joined: Aug 02, 2012
Posts: 598
As MariaDB is derived from mySQL, I think the answer is negative.

http://use-the-index-luke.com/sql/where-clause/functions?dbtype=mysql#overview_function_based_index_support
chris webster
Bartender

Joined: Mar 01, 2009
Posts: 1716
    
  14

As Martin says, use an auto-increment (in MySQL, MariaDB etc, or use a sequence in Oracle, PostgreSQL) column such as EMPLOYEE_ID to hold an internal numeric ID that is guaranteed to be unique. Define this as your primary key. This internal ID is known as a "surrogate key" because it is used instead of what your business users think is the key. You use it within your system instead of the business key because you can be absolutely sure the surrogate key is unique and will not change. The business users need never see this value.

Then have a separate column such as EMPLOYEE_CODE for your "business key" e.g. "0001-20130821-A10". Make this mandatory (NOT NULL) and apply a UNIQUE KEY constraint on it. This will cause a unique index to be constructed on the business key column, which will also speed up searching.

If you have foreign keys pointing back to your Employees table, they should use the surrogate key EMPLOYEE_ID as the foreign key value, because you know this will always be reliable, immutable and unique. This is because, as Martin says, one day your users will come along and say "Remember how we said an Employee ID like 0001-20130821-A10 would always be unique and would never change? Well, we've changed our minds...". This way, the surrogate key will still be OK, even if the rules for the business key change.

When users query for an Employee using the business key, it's easy to write the SQL to do something like "SELECT employee_id, employee_code, first_name, last_name FROM employees WHERE employee_code = '0001-20130821-A10'". If you always return the surrogate Employee ID as well, then you can always use it inside your programs as the real primary key.

Incidentally, if your business key for an Employee is really '0001-20130821-A10', then you might want to think about breaking it up into the constituent elements - 0001 and 20130821 and A10 - and storing these in separate columns if these elements have meaning individually. Holding formatted string values like '0001-20130821-A10' in a single column means you always have to check the format e.g. what if users type in '000120130821A10', '0001/20130821/A10' or '0001 20130821 A10' instead? Or you can break out your Regular Expressions handbook...


No more Blub for me, thank you, Vicar.
Jacky Luk
Ranch Hand

Joined: Aug 02, 2012
Posts: 598


What is the equivalence for this for sqlite?

update, i solved it by myself

But what if the database initially contains no rows?

from customer c
Thanks
Jack
Martin Vajsar
Sheriff

Joined: Aug 22, 2010
Posts: 3610
    
  60

A select max() from statement should always return one row. If there are no rows in the table, the selected value will be null. So you need to handle cases where MAX(id) happens to be null, by providing a default value - presumably 0 - for it using the proper function in your SQL dialect.
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: What is your general practice to generate employee IDs?