Continued from Part I.
At Database Debunkings I recently posted an exchange on whether or not MySQL is a DBMS, let alone relational, and invited readers to test themselves on their ability to see through the prevailing fallacies and misconceptions, including those by designers of DBMS products. I am debunking the exchange in this month's column. In what follows, HT is Heikki Tuuri, author of InnoBase, a product he refers to as InnoDB and describes as a "back-end which provides transactions to MySQL" (a concept which, by itself, reveals a misconception, can you figure it out?). LF is Lee Fesperman.
DW: MySQL doesn't offer views, unless there has been an update that I'm not aware of.
Quite interestingly, there was no follow-up to this comment on views. Views, or virtual tables, are the relational mechanism for logical data independence, which relieves users from the burden of modifying applications when databases undergo nonloss changes at the logical level -- a critical relational benefit.
As practitioners can't fail noticing, SQL DBMSs do not permit updates of certain types of views, the most prominent type being multitable views, of which join-views are one example. This restriction is due to SQL products' inability to guarantee the correct propagation of such view updates to the underlying base tables. This limitation is due, in turn, directly to SQL's initial lack of support for database integrity in general, and keys in particular (the very approach taken by MySQL and InnoDB!). SQL DBMSs cannot, therefore, support integrity inheritance -- the ability to infer constraints on derived tables such as views. Derived constraints on views are inherited and can be inferred from (a) the constraints on the base tables underlying the views and (b) the table operations used to derive the views. Since SQL did not support base constraints, it could not infer derived constraints on views. The practical implication is, of course, that base tables must be updated directly, rather than via the views and, therefore, applications accessing them must also be changed whenever the base tables change, a loss of logical data independence.
The Point: Full support of views is an essential relational requirement, without which logical data independence and the practical benefits it confers fail to materialize. View updatability depends on another essential relational requirement -- database integrity.
HT: This is semantics, but I think Codd and Date in the 70's meant by a relational database something which has a relational query language. Thus any database which has a query language somewhat similar to the relational algebra or SQL can be called a relational database.
LF: That may be what you think, but it has no basis in reality. MySQL is not a DBMS; it is not relational and the query language it supports is not even 'somewhat' similar to SQL.
HT: What do you demand of a DBMS? Is IBM's IMS a DBMS? Is Oracle 9i a DBMS? I found the following definition:
relational data model (or "relational model"): A data model introduced by E.F. Codd in 1970, particularly well suited for business data management. In this model, data are organized in tables. The set of names of the columns is called the "schema" of the table. Here is an example table with the schema (account number, amount) and 3 lines.account number amount -------------- --------- 12343243546456 +30000.00 23149875245824 +2345.33 18479827492874 -123.25
The data can be manipulated using a [sic] relational algebra. SQL is a standard language for talking to a database built on the relational model (a "relational database"). ["A relational model for large shared data banks" Communications of ACM 136, pp 377-387].
So, obviously MySQL is a relational database. If not, what is it? An operating system?
LF: Yes, IMS is a DBMS (though not relational at all, but hierarchic). Oracle 9i is a DBMS (albeit overgrown). Oracle is not really relational. I might call Oracle semi-relational (recognizing that as an oxymoron); Date would call Oracle a 'SQL DBMS.' That is an abbreviated definition for college students. Someone creating a DBMS shouldn't rely on that for the design. You should at least do some reading of Codd and Date. Not so obviously. I do agree the term has been used loosely, for instance, by the dBASE camp. How about an application program?
HT: I have read at least one book by Date. MySQL is a 'relational DBMS' because (1) It has a query language which covers most of the basic relational algebra, and also some extensions like aggregate functions. More complex operations you can do with very fast temporary tables. This covers the word 'relational' (2) It offers transactions, rollback, crash recovery, security, backup utilities, front-ends, etc. It will offer referential integrity constraints in September. This covers the acronym 'DBMS.' MySQL is not an 'active' database because it does not currently offer stored procedures or triggers. Sinisa from MySQL AB wrote to the mailing list lately that 'we intend to support' stored procedures by the end of year.
This part of the exchange speaks for itself and requires little comment. To reiterate and conclude:
- A relational data language is but one requirement for a RDBMS. There are database requirements (support of some sound and complete data model, security, recovery, concurrency control and so on), in other words to be a RDBMS, a product must be a DBMS in the first place. Then there are relational requirements (domains, R-tables, R-integrity, set-level R-operations, data independence); the data language is but a concretization of these features, of which the data model consists. The definition of the relational data model above ignores most of them, and should not be imparted to students. But then, as I demonstrate in my writings, neither schools nor universities know data fundamentals any better and are too busy certifying for vendors, instead of teaching concepts and principles, which is their true function (see, for example, The Myth of Market-based Education. Is there any wonder, under these circumstances, that misconceptions abound and persist?
- IMS is a DBMS, but a hierarchic, not a relational one. Oracle is not a truly and fully relational DBMS, only a SQL one. MySQL is essentially a file manager.
- DBMS designers should educate themselves on database fundamentals and relational concepts and should be concerned with the deficiencies of flouting them, rather than succumb to superficial and misleading marketing claims.
About the author
Fabian Pascal has a national and international reputation as an independent technology analyst, consultant, author and lecturer specializing in data management. He was affiliated with Codd & Date and for more than 15 years held various analytical and management positions in the private and public sectors, has taught and lectured at the business and academic levels, and advised vendor and user organizations on database technology, strategy and implementation. Clients include IBM, Census Bureau, CIA, Apple, Borland, Cognos, UCSF, IRS. He is founder and editor of Database Debunkings, a web site dedicated to dispelling prevailing fallacies and misconceptions in the database industry, where C.J. Date is a senior contributor. He has contributed extensively to most trade publications, including Database Programming and Design, DBMS, DataBased Advisor, Byte, Infoworld and Computerworld. His third book, "Practical issues in database management" (Addison Wesley, June 2000), serves as text for a seminar bearing the same name. He can be contacted at email@example.com.
For More Information
- What do you think about this column? E-mail the Editor at firstname.lastname@example.org with your feedback.
- The Best Web Links on the relational model
- Post your technical database questions--or help out your peers by answering them--in our live discussion forums.
- Ask the Experts! Our database design, SQL, Oracle, DB2, SQL Server, metadata, and data warehousing gurus will answer your toughest questions.