Ask the Expert

Using generated numbers for primary keys, part 2

Here is the continuation of Rudy's answer. See part 1.

    Requires Free Membership to View

A self-identifying key allows you to recognize which entity it represents. For example, Title would be a suitable primary key for a Book table because it pretty much identifies each book (unless your database might hold more than one book with the same title, in which case you'd need to include Author in a compound primary key). If you had a BookOrder table, the foreign key to Book would be Title (or Title plus Author), and you wouldn't need to join BookOrder to Book to figure out which book the order was for. Thus, a self-identifying key has the advantage of simplifying queries. A surrogate key has no meaning, and always requires joins.

But Title and Author are bulky, because realistically they would be VARCHAR(100) or something. This means all foreign keys would be bulky too. Most physical database designers frown on bulky foreign keys, especially if there are many related tables. Notice that there's nothing wrong with a bulky primary key if it has no related foreign keys!

Sometimes natural keys can repeat. It's quite common for two people to have the same name, so FirstName and LastName are usually inadequate as the primary key for an Employee table.

Sometimes natural keys can be null. A recently hired employee, prior to being issued an employee number, might still need to be entered into an employee table. It is common in this type of situation for a table to have a surrogate, generated key, so that the code number (employee number), which is not the key, can be null. So even where there's a column like employee number, which in this context would be natural (although it isn't particularly self-identifying), you'll often also have a surrogate key.

Finally, natural keys do change. When they do, the change must propagate to related tables if the natural keys were used as primary keys.

Given all of the above, are there any situations where natural keys are good primary keys? Sure, all the time. Consider a table of countries. My choice for primary key would be the two-character country code, because it's self-identifying, and because a surrogate key would be, at least to me, pretty useless. But rarely do you see an actual country table. More often, you just see the two-character country code used in other tables. It's really a foreign key, when you think about it. Even the most fanatical DBAs, who (quite rightly, in my mind) would insist on surrogate keys for every table, probably won't notice that natural country code foreign key slip past them.

For additional information about surrogate keys, see

 

For More Information


This was first published in June 2002

There are Comments. Add yours.

 
TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to: