I hope this isn't too basic of a question, but I am confused about Primary Keys. In a tutorial example I just read, they provided a list of fields in a BookProducts table: InventoryDate; InventoryQuantity; Title; LastName; FirstName; YearPublished; Publisher; UPC; ISBN; and RetailPrice. I selected the ISBN (International Standard Book Number) as the logical primary key, because it is unique to a book. The tutorial presented a Composite primary key that didn't include the ISBN (see below). Their explanation: Since each title and UPC can have more than one author and each author can have more than one title, you need a combination of LastName, FirstName, and UPC fields to create the unique primary key for the BookProducts Table. This totally confused me. Can you help? Why isn't the ISBN a good primary key by itself?
My second quick question is this: Are there any tables for which you don't need a primary key? I read that you need a primary key for each base table in a database. What other kinds of tables are there besides base tables, and do they need primary keys? Also, is there a good reference for understanding primary keys? My Beginning SQL Programming book is no help. Thanks for your time.
You are right, ISBN is unique for each book, but that would make it a good primary key only for the Book table, where presumably there is only one row per book. In the BookProducts table, there's obviously more than one row per book. Even LastName, FirstName, and UPC are not sufficiently unique to be the primary key, if I understand the design of the BookProducts table correctly. It seems to be possible to take inventory, i.e. count the number of copies of a book, on multiple dates. Therefore the same book will show up in the table on each inventory date that there was at least one copy in stock. So the composite primary key would have to contain the InventoryDate column as well.
The classic example of a missing primary key is the hypothetical table that holds the items on a grocery store cash register receipt. I learned of this from Joe Celko, who called it the "three cans of cat food" problem. Disregard for a moment that the three cans might have been rung up as a group (3 cans @ .98 = $2.97) and assume they are in the table each as a separate row. These three rows will have identical values in all columns that matter, unless you add something like a timestamp with precision down to a fraction of a second -- and a timestamp is an unnecessarily cumbersome field to use just so that you can declare a primary key.
I'm not sure how your "base tables" are defined, but in my mind there is no reason to declare a primary key unless you need another table to reference it via a foreign key. However, since most tables are related to other tables, primary keys become commonplace and thus perhaps they are assumed necessary. A table does not have to have a primary key. Uniqueness can be guaranteed by a constraint (often implemented by the database as a unique index).
See also my previous answer What is a primary key? Composite PK? Foreign key? Tuple? 27 February 2002.
This was first published in April 2002