Q

Using generated numbers for primary keys, part 2

Here is the continuation of Rudy's answer. See part 1.

A self-identifying key allows you to recognize which entity it represents. For example, Title would be a suitable primary key for a Book table because it pretty much identifies each book (unless your database might hold more than one book with the same title, in which case you'd need to include Author in a compound primary key). If you had a BookOrder table, the foreign key to Book would be Title (or Title plus Author), and you wouldn't need to join BookOrder to Book to figure out which book the order was for. Thus, a self-identifying key has the advantage of simplifying queries. A surrogate key has no meaning, and always requires joins.

But Title and Author are bulky, because realistically they would be VARCHAR(100) or something. This means all foreign keys would be bulky too. Most physical database designers frown on bulky foreign keys, especially if there are many related tables. Notice that there's nothing wrong with a bulky primary key if it has no related foreign keys!

Sometimes natural keys can repeat. It's quite common for two people to have the same name, so FirstName and LastName are usually inadequate as the primary key for an Employee table.

Sometimes natural keys can be null. A recently hired employee, prior to being issued an employee number, might still need to be entered into an employee table. It is common in this type of situation for a table to have a surrogate, generated key, so that the code number (employee number), which is not the key, can be null. So even where there's a column like employee number, which in this context would be natural (although it isn't particularly self-identifying), you'll often also have a surrogate key.

Finally, natural keys do change. When they do, the change must propagate to related tables if the natural keys were used as primary keys.

Given all of the above, are there any situations where natural keys are good primary keys? Sure, all the time. Consider a table of countries. My choice for primary key would be the two-character country code, because it's self-identifying, and because a surrogate key would be, at least to me, pretty useless. But rarely do you see an actual country table. More often, you just see the two-character country code used in other tables. It's really a foreign key, when you think about it. Even the most fanatical DBAs, who (quite rightly, in my mind) would insist on surrogate keys for every table, probably won't notice that natural country code foreign key slip past them.

For additional information about surrogate keys, see

 

For More Information


This was first published in June 2002

Dig deeper on Oracle and SQL

Pro+

Features

Enjoy the benefits of Pro+ membership, learn more and join.

Have a question for an expert?

Please add a title for your question

Get answers from a TechTarget expert on whatever's puzzling you.

You will be able to add details on the next page.

0 comments

Oldest 

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

-ADS BY GOOGLE

SearchDataManagement

SearchBusinessAnalytics

SearchSAP

SearchSQLServer

TheServerSide

SearchDataCenter

SearchContentManagement

SearchFinancialApplications

Close