What you don't know about denormalization can hurt you, Part I

Better to know more about denormalization than less. See why.

Predictably, I received some dissenting responses to my previous column on normalization and performance. Not only

do they prove my point that there is logical-physical confusion among practitioners, but they also raise yet another issue poorly understood in the industry, namely, database correctness. In response to my argument in an email that without adherence to relational principles such as normalization, database correctness cannot be guaranteed by the DBMS, one reader stated as follows:

I've been in the computer business for over 20 years and fully realize the fallacy in the idea that any database system and/or computer system can "guarantee correctness." It is beyond the function of any database system to guarantee correctness.That being said, the issue comes down to correcting errors in the database. Now we get to the down and dirty, the consequence of errors in the database and the complexity of finding and correcting consequential errors in the database.

In order to assess the validity of my claim it is important, of course, to define correctness in a database context. This I will do in this article. Next month I will demonstrate, by using an example and issuing a challenge, how prohibitive denormalization is from an integrity perspective.

My first article in this series concludes as follows:

DBMS and tool vendors, database professionals and users, desire accurate answers from databases. Yet the vast majority are unaware that, as Hugh Darwen states, a database is a set of axioms; the response to a query is a theorem; the process of deriving the theorem from the axioms is a proof; a proof is made by manipulating symbols according to agreed mathematical rules. The proof, of course, can only be as sound as the rules are. That makes the DBMS a deductive logic system: it derives new facts (query results) from a set of user asserted facts (the database). The derived assertions are true (query results are correct) only if (a) the initial assertions are true and (b) the derivation rules are logically sound. Neither are most practitioners aware that the truth of the initial assertions--the correctness of the database--must be ensured by the DBMS's integrity function, and the correctness of the derivations (query results) by its manipulation function, only if the design of both databases and DBMSs adheres to the sound and fundamental principles of logic.

Assertions of fact are propositions in logic; databases represent propositions about the real world. The representation must, of course, be correct. The correctness of a database consists, therefore, of two components:

  1. the truth of the propositions it represents in the real world (that is, they must, indeed, be facts)
  2. the consistency of row values (rows represent the propositions) with the business rules in effect

Obviously, no DBMS can know whether the propositions are true or not in the real world, which is, I think, what the above reader implies and what Chris Date means when he says that "a DBMS can and should enforce consistency, but can't enforce 100% correctness."

However, it should also be obvious that database values which violate any applicable business rule cannot possibly represent true propositions and, thus, a database containing such values is not correct. Business rules are represented in the database by integrity constraints. It follows that the least and the best a DBMS can do to guarantee database correctness is to enforce the integrity constraints declared to it. Date's consistency is, therefore, for all practical purposes, DBMS-enforced correctness. And it is in this sense that the theoretical foundation--predicate logic and set mathematics--underlying RDBMSs and relationally designed databases guarantees database correctness. Conversely, DBMSs and databases lacking a proper foundation in logic cannot guarantee such correctness (see Chapter 2 in my "Practical Issues in Database Management" book).

Stay tuned for Part II.

About the Author

Fabian Pascal has a national and international reputation as an independent database technology analyst, industry critic, consultant, author and lecturer. For more than 13 years he held various analytical, and management positions in the private and public sectors, was affiliated with Codd & Date, has taught and lectured at the business and academic levels, and advised vendor and user organizations on database technology and implementation. He is co-founder and editor of Database Debunkings, a web site dedicated to dispelling prevailing fallacies and misconceptions in the database industry, with C.J. Date as senior contributor. He has contributed extensively to most trade publications, and his third book, Practical Issues in Database Management--A Guide for the Thinking Practitioner, was recently published by Addison Wesley.

For More Information


This was first published in April 2001

Dig deeper on Oracle DBA jobs, training and certification

Pro+

Features

Enjoy the benefits of Pro+ membership, learn more and join.

0 comments

Oldest 

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

SearchDataManagement

SearchBusinessAnalytics

SearchSAP

SearchSQLServer

TheServerSide

SearchDataCenter

SearchContentManagement

SearchFinancialApplications

Close