What you don't know about denormalization can hurt you, Part II

This is the second part of a series on why you should learn about denormalization.

In Part I, I argued that the least and the best a DBMS can do to ensure database correctness is to enforce the integrity constraints representing the business rules in effect in the real world--and that it is relational theory that guarantees such correctness. While few practitioners deny that databases and query results must be logically correct, many dismiss relational principles such as normalization as "just theory" and, therefore, not practical. This is, of course, an inconsistent position in itself. What is more, consider what such dismissals are frequently based on, as demonstrated by the following exchange:

I love the sneering critiques written by those crazed Relational Theorists at Database Debunkings. They are worth reading.

I don't. Although some of the quotes are truly hysterical--and thus justify my hourly rate whenever questioned--I don't feel comfortable with the implication that I am somehow a lesser entity because I don't understand all the mathematics involved in relational theory. I'm sure therapy and some remedial set theory would help.

I don't know if therapy would help--probably not--but nothing on our site implies that practitioners without a mastery of the mathematics of relations are "lesser entities." What it does demonstrate is that those who flout relational principles such as normalization in their work are lesser database practitioners. While some familiarity with logic and mathematics would not hurt, it's DBMS designers who must master the theory if they are to construct DBMSs that guarantee correctness--and judging by current products, it does not look as if they do, because practitioners don't demand it.

Let me now demonstrate the practical importance of normalized databases for correctness--and put your grasp of critical database fundamentals to the test--with an example and a challenge. In Figure 1, table A and tables B1 and B2 are two representations of assignments of employees to projects and to project activities. Assume the following business rules are in effect in the real world:

  • an employee can be assigned to any specified project and to all of the specified activities
  • an employee can be assigned to any number of projects and any number of activities
  • projects and activities are independent of each other: no matter what project an employee is assigned to, the activity assignments are the same
  • a given project or activity can have any number of employees assigned to it

This means that the relationships between employees, projects, and activities are each many-to-many (M:M) and that assignments to projects and to activities are independent.

Figure 1: Less than 4NF (A) and fully normalized (B1, B2) representations

Tables B1 and B2 are the correct, fully normalized (5NF) representation. Fully normalized designs have no redundancy due to column (a) dependencies on non-key columns, (b) indirect dependencies on key columns, or (c) dependencies on parts of composite key columns (why?). In the absence of other types of redundancy, DBMS enforcement of key constraints is therefore sufficient to ensure correctness. In other words, given a relational DBMS, with B1 and B2 you would declare keys at table definition time and would be done.
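
To make "declare keys and be done" concrete, here is a minimal sketch using Python's standard sqlite3 module, with SQLite standing in for any SQL DBMS. The table and column names (b1, b2, emp, project, activity) and the sample rows are assumptions for illustration only; the actual headings and data in Figure 1 may differ.

```python
import sqlite3

# Hypothetical schema: names and rows are illustrative assumptions,
# not taken from Figure 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b1 (
        emp     TEXT NOT NULL,
        project TEXT NOT NULL,
        PRIMARY KEY (emp, project)      -- the only constraint declared on b1
    );
    CREATE TABLE b2 (
        emp      TEXT NOT NULL,
        activity TEXT NOT NULL,
        PRIMARY KEY (emp, activity)     -- the only constraint declared on b2
    );
""")

conn.executemany("INSERT INTO b1 VALUES (?, ?)", [("e1", "p1"), ("e1", "p2")])
conn.executemany("INSERT INTO b2 VALUES (?, ?)", [("e1", "a1"), ("e1", "a2")])


def try_insert(sql, row):
    """Return True if the DBMS accepts the row, False if a key rejects it."""
    try:
        conn.execute(sql, row)
        return True
    except sqlite3.IntegrityError:
        return False


# A repeated assignment violates the key and is rejected by the DBMS.
duplicate_rejected = not try_insert("INSERT INTO b1 VALUES (?, ?)", ("e1", "p1"))

# Assigning e1 to a new project is one row in b1; b2 is not touched at all.
new_project_accepted = try_insert("INSERT INTO b1 VALUES (?, ?)", ("e1", "p3"))
b2_rows = conn.execute("SELECT COUNT(*) FROM b2").fetchone()[0]
```

Under these assumptions, a new project assignment is a single-row insert that cannot disturb the activity assignments, because the two facts live in separate tables.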

Denormalization proponents, however, would prefer one table, A, which is in 3NF, but not in 4NF. Being undernormalized, table A suffers from (highlighted) redundancy (why?) and requires an integrity constraint in addition to the (composite) key constraint to ensure correctness (why?).
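
The following sketch, again using Python's sqlite3 module with hypothetical names and data (table a, columns emp, project, activity), suggests why the composite key alone may not be enough for table A. The `independent` helper is an assumption of mine: a procedural check in application code that every employee's rows cover all combinations of that employee's projects and activities. It is an illustration of the independence rule, not a declarative constraint.

```python
import sqlite3
from itertools import product

# Hypothetical single-table design; names and rows are illustrative
# assumptions, not taken from Figure 1.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE a (
        emp      TEXT NOT NULL,
        project  TEXT NOT NULL,
        activity TEXT NOT NULL,
        PRIMARY KEY (emp, project, activity)
    )
""")

# Employee e1 on projects p1, p2 with activities a1, a2: four rows are
# needed to record what b1 and b2 would record in two rows each.
rows = [("e1", p, act) for p, act in product(["p1", "p2"], ["a1", "a2"])]
conn.executemany("INSERT INTO a VALUES (?, ?, ?)", rows)


def independent(conn):
    """True if each employee has every combination of their projects and activities."""
    table = set(conn.execute("SELECT emp, project, activity FROM a"))
    for emp in {r[0] for r in table}:
        own = {r for r in table if r[0] == emp}
        projects = {r[1] for r in own}
        activities = {r[2] for r in own}
        expected = {(emp, p, act) for p in projects for act in activities}
        if own != expected:
            return False
    return True


consistent_before = independent(conn)

# The key accepts this row: (e1, p3, a1) duplicates no existing key value.
conn.execute("INSERT INTO a VALUES ('e1', 'p3', 'a1')")

# But e1 now has activity a2 on p1 and p2 and not on p3, which contradicts
# the rule that activity assignments do not depend on the project.
consistent_after = independent(conn)
```

In this sketch the table starts out consistent and ends up inconsistent after an insert that the key constraint accepts without complaint.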

The challenge is to specify the additional constraint and express it in SQL.

About the Author

Fabian Pascal has a national and international reputation as an independent database technology analyst, industry critic, consultant, author and lecturer. For more than 13 years he held various analytical and management positions in the private and public sectors, was affiliated with Codd & Date, has taught and lectured at the business and academic levels, and advised vendor and user organizations on database technology and implementation. He is co-founder and editor of Database Debunkings, a web site dedicated to dispelling prevailing fallacies and misconceptions in the database industry, with C.J. Date as senior contributor. He has contributed extensively to most trade publications, and his third book, Practical Issues in Database Management--A Guide for the Thinking Practitioner, was recently published by Addison Wesley.
