Ed. note: This was submitted in response to a recent question posed to our SQL guru Rudy Limeback, Using date as...
a primary key.
Here are seven quick rules for designing robust and error-free primary keys for Oracle databases:
1) Universal: Every row has a value in that column; not null.
2) Unique: No two rows can share the same value.
3) Short as possible while preserving uniqueness: The shorter the possible, the less space it takes to store in the PK column, then the savings are multiplied by each of the foreign key (FK) references.
4) Numeric: Numeric values take roughly half the storage space as beyond-numeric values. Numeric, by definition means shorter, thus supporting Rule #3.
5) Not subject to change: If a PK value must change, then all of the FK references must change, as well. Avoidable change is a processing waste.
6) "Ours": You do not want to use someone else's PK scheme. If their scheme changes, so must your scheme, thus breaking Rule #5.
7) "Stupid": Good Primary Keys should not be intelligent. Specifically, they should not contain intelligence or have any special meaning...Meanings can change, therefore again breaking Rule #5.
Now, let's now apply these standards to Mr. Limeback's assertion that using a date as a primary key is an acceptable practice:
Rule 1: (Compliant) Every row can certainly have a date associated with it.
Rule 2: (Non-compliant) Using Oracle's one-second granularity (or even sub-second granularity), two rows can certainly have the same date/time value, thus not unique without combination with some other "tie breaker" attribute. (See "Rule 3" regarding tie-breaking combinations).
Rule 3: (Non-compliant) Using dates that are likely not unique in high-volume applications implies combination with some other tie-breaking attribute. Using Oracle's 6-byte date columns in combination with some other tie-breaking attribute certainly causes such a Primary Key scheme to produce values that are longer than necessary, certainly longer than simply using a value that a sequence generator produces.
Rule 4: (Debatable) Although Oracle stores dates in an internal-numeric format, Oracle dates occupy, at minimum, 6 bytes. That same storage allocation could hold an 11-digit numeric value. Most applications do not require an 11-digit number to ensure uniqueness. And when considering the need for combining a date field with some other column to ensure uniqueness, then the Rule 4 issue is no longer debatable...it is certainly non-compliant.
Rule 5: (Debatable) If a Quality Assurance person for an application determines that the date associated with a row is incorrect, then to correct the mistake, the Primary Key must change, thus cascading its effect across all Foreign Keys in child rows. At that point, Rule 5 is no longer debatable, it, too, is non-compliant.
Rule 6: (Debatable) Mr. Limeback asserted this debate when he presented the many different methods for recording dates, Julian, Gregorian, et. al. and issues such as Y2K, et. al. Dates are not really "Ours"...they are subject to rules beyond our control.
Rule 7: (Non-compliant) Dates certainly are not "Stupid"...They have significant intelligence upon which a developer may base logic. Using a Primary Key to store more information than simply unique identity is a dangerous proposition.
In the final analysis, Primary Keys should simply serve as unique identities for rows. There is no need to look beyond a simple sequence generator for unique values. There is certainly no need to "double up" a Primary Key to serve other purposes. To do so is a dangerous proposition.
Rudy Limeback responds:
Probably the first thing that I should point out is that, despite my column being located on the SearchOracle.com site, it is not about Oracle SQL, but rather, about SQL, the standard language.
In this context, I chose to interpret the original reader's question about using a date as a primary key, to mean a value such as 2005-11-17. In standard SQL terminology, this is a date value. You were perhaps thinking of datetime values, which include hours, minutes, seconds, and some fraction of seconds, but I was speaking only of date values.
If you would kindly re-read my column with this distinction in mind, perhaps it will make more sense. I am sorry I was not sensitive enough to the fact that so many Oracle people might be misled (you were not the first to raise an issue with this column).
If each row in the table is to represent a specific day, I can think of no better primary key than the "natural" key which is the Gregorian calendar date. I would like to emphasize once again that I am talking about a date, not a datetime.
I am not intimately familiar with Oracle but if there is no distinction in Oracle between a date and a datetime (there isn't in Microsoft SQL Server, either, all you can use is DATETIME), then I suppose you could still use Oracle's DATE format, provided that the time portion was set to, say, midnight.
In other words, my point is that if you want one value per date, the Gregorian calendar date is a really good PK for this, and there is no need, nor even the slightest hint of desirability, to substitute some other key.