"Well, it's really a judgment call and I think a lot of experience comes into it. It's a little bit like building a shack. Say you want to build a skyscraper, and you started out building a shack and you just keep trying to add onto it. After a while you have this severe structural problem ... So there is a fallacy to the build-upon-a-simple structure approach. Sometimes you get up to three stories and you have to do some major structural changes, and I just accept that." --Wayne Ratliff, developer of dBase
"Client Servers were a tremendous mistake. And we are sorry that we sold it to you. Instead of applications running on the desktop and data sitting on the server, everything will be Internet based. The only things running on the desktop will be a browser and a word processor. What people want is simple, inexpensive hardware that functions as a window on to the Net. The PC was ludicrously complex with stacks of manuals, helplines and IT support needed to make it function. Client server was supposed to alleviate this problem, but it was a step in the wrong direction. We are paying through the nose to be ignorant." --Larry Ellison, CEO, Oracle Corp.
On Fashion and Cookbooks
The computer industry--its database sector in particular--is much like the fashion industry: it is driven by fads. And more often than not it profits from the accelerated obsolescence on which fads are predicated. DBMS vendors (Oracle's CEO in particular) were "wrong" more than once before. But it's the users, not the vendors, who paid through the nose, because the industry, with help from the trade media, hypes "new" fads and obscures their lack of soundness and the serious deficiencies that ensue. Frequently these fads are nothing but old, already discarded concepts under different labels. The Internet and the browser are just the latest fad in a long series, and are as much a panacea for information management as the PC, SQL, client/server, object orientation, "universal" and multidimensional DBMSs, data warehouses, and data mining, all preached with equal fervor, were before them.
The fact is that sound database technology and practices are prerequisites for effective and efficient information management, whether Internet-based or not. Sadly, however, the database field is in disarray, with Internet practices being not lesser, but actually worse, offenders than their predecessor panaceas. While this is, to a degree, true of computing in general, in the database field the problems are so acute that--claims to the contrary notwithstanding--knowledge and, therefore, technology are actually regressing!
Even a cursory inspection of problems encountered in real-world database practice reveals that most are due to the persistent failure by both DBMS vendors and users--including DBAs, application developers, and managers--to educate themselves and to base their practices on a sound foundation. Indeed, it is lack of proper education that makes fads and accelerating obsolescence possible in the first place! As Chris Date explains in the Foreword to my book:
"SQL [DBMS] deficiencies are, it seems to me, directly due to the widespread lack of understanding (not least on the part of vendors), of fundamental database principles. Certainly it is undeniable that they flout those principles in numerous ways. And the practical consequences are all too obvious: First, users must understand where the deficiencies lie; second, they have to understand just why they are deficiencies; third, they have to understand how to work around them; and fourth, they have to devote time and effort in persuading the vendors to remedy them. The trouble is, of course, users too tend to be unaware of those same fundamental principles and, hence, find themselves unable to carry out their side of the "contract" (a "contract" that should not have been allowed, or agreed to in the first place, of course). It's a vicious cycle. What is more, this sad state of affairs is not likely to change, given the apparent lack of interest on the part of the trade press--itself ignorant of those same principles--in trying to improve matters."
Consider, for example, the following two cases, one raised by a novice:
"I need to store 40 pieces of unrelated information. Is it better to create [one] table w[ith one] record [and] 40 fields, or create [one] table w[ith] 40 records [and one] field?"
The other is raised by a consultant assessing a database constructed, supposedly, by experienced professionals:
" ... finished testing a--gasp, choke--COBOL program for a software company whose main product is a well-known government contract accounting system ... Now th[e expletive deleted] database ... is replete with repeating groups, redundant fields, etc. On top of all that, because it is one of the central files to the entire system, there are literally hundreds of rules and relationships, all of which must be enforced by the dozens of subprograms that access it. I found so many violations of so many of these rules in this new subprogram, that I filled five single-spaced pages with comments and suggestions. And I probably missed [the more obscure problems]. Several [such problems], perhaps."
These cases are not only indicative of how database work is approached these days and with what results, but quite representative. What should be obvious is that:
- The problems involve database (not application!) issues, and fundamental issues at that.
- These issues underlie any and all databases, regardless of nature and purpose.
- The consequences of not addressing them correctly are hardly theoretical and quite severe.
- No amount of expertise in any DBMS product or development tool, on any hardware platform, can, in itself, address them.
Yet it is practically impossible to get the attention of database practitioners for anything other than product-specific recipes, essentially a cookbook approach. Examples:
"I polled our [user group] membership last night about future topics. For the foreseeable future, we prefer to focus on Microsoft SQL Server 2000 topics exclusively."
"I don't disagree with your statement that the "lack of attention to database issues can cause horrendous problems". However, I've found that the members of the group also do not exclusively focus on products, but actually pay a lot of attention to the database, understanding that each affects the other. The major problem, as I see it, is that [database orient]ation ... is not what the user group is about. Yes, database design and use is definitely a part of our world, but our focus is on Sybase's development tools, such as PowerBuilder, PowerJ, Enterprise Application Server, etc."
Education vs. Training
The fad-driven, tool-focused, cookbook approach to data management is due in large part to the business culture in general, and the way in which information technology professionals are inducted into the field in particular. The vast majority are self-taught and start their database involvement by working with some specific DBMS software (e.g. Oracle, Access, SQL Server) and tools (frequently imposed on them by their employer). Never having been exposed to general database concepts, principles and methods, practitioners are either unaware of the field's fundamentals, assume that they are acquired implicitly by learning or working with the software or, most commonly, deem them "theory" and, therefore, without practical value. These fallacious perceptions are exacerbated by a growing generation of Internet practitioners who know little beyond HTML, Java and XML (not even DBMS or tool software), and who, therefore, think that's all there is to know.
But we should not expect anything different. The sole technical qualification for practically all database positions is experience with some DBMS software and development tools (mainly programming) on specific platforms (hardware and operating systems). Nothing else. Examples:
Title: Senior Database Architect
Qualifications: Minimum of 3 years with Oracle on Solaris. Working knowledge of Tuxedo. Use of database design tools such as ER/Win. Perl and scripting. Familiarity with Oracle 8, Oracle Parallel Server, Sun Clusters, C. At least 3 years of relevant experience.
Title: Database Analyst III
Not only is foundational knowledge--as distinct from sheer experience with tools--not a job requirement, but more often than not it is actually a liability. Functions such as requirements analysis and logical database design are bundled together with database administration, application development and physical implementation and assigned to the (mythical) position of "programmer/analyst", with no recognition that they require fundamentally different skill and knowledge sets, which are rarely found in one person because they inherently interfere with one another (particularly with currently flawed DBMS products). If you wanted to build a house, would you hire a building contractor to design it?
In fact, under industry pressure there is little database education to be had. Product-specific training reigns supreme and even academic computer science programs are becoming increasingly vocational in character. Examples:
"We are very interested in additional Oracle instructors, if that is something you can teach."
"Does (the course) cover accessing a database via CGI, i.e. VB, Java, Perl, C++ access to SQL Server or Access DB? We're a CS dept, so not so interested in the user-developer side of things."
An analogy can serve to drive home the perils of this state of affairs. Suppose you must select a personal physician and have two candidates: one educated in, among other things, some anatomy, biology and chemistry, and one trained in a "cookbook" approach: identifying symptoms from a list and matching treatments from another. Chances are that you, like most people, will opt for the former rather than the latter, and for a very good reason: in the absence of knowledge and understanding of some health fundamentals, serious problems can be expected. This is generally clear in all applied fields with a scientific foundation except, it seems, database management.
Is there any wonder that practitioners, seasoned ones included, can't offer a useful definition of a database? That neither DBMS designers, nor technically proficient users have heard of crucial concepts such as data independence? That many believe that not only should duplicates not be prohibited, but that they are actually essential?
The consequences are visible all over the business--and Internet--world and are horrendous. A vast majority of databases and practically all DBMS products are riddled with flaws and unnecessary complications. Examples:
"You might ask what is wrong? Well, it is a client/server application, using a Sybase database (SQL Anywhere). The database server has a single login user DBA--using the default password. Every application user connects to the database via this login level, and security is handled by the front end - despite the fact that any semi-aware user could use MS Access to destroy any data. There are also about 300 tables in the database, with no indexes! Agreed there are primary key indexes created automatically by the database, but still... The front end is Visual Basic, which for me is OK, but there are at least three different data access methodologies, from ODBC-API to the latest ADO. But what is killing me, is that I seem to have been hired as a "bug-fixer", to me different than an engineer. They are in a position where release schedules are forcing a continual maintenance mode, rather than an admittedly necessary rebuild of some components."
"In the short term you have two options a) disable referential integrity checking and make the change (not recommended unless you're willing to assume total responsibility for the data consistency checking yourself; and you have to ensure you have exclusive access to the database when you're doing this) b) use our [DBMS's] triggers and stored procedures to implement the referential integrity procedurally."
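The second quote treats referential integrity as something to be switched off or re-implemented procedurally. For contrast, here is a minimal sketch of the declarative alternative, assuming SQLite through Python's standard sqlite3 module. The `department` and `employee` tables and the helper names are hypothetical, invented only for illustration.

```python
import sqlite3

def build_db():
    """Create a throwaway in-memory database with a declared foreign key."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this is enabled per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        """CREATE TABLE employee (
               emp_id  INTEGER PRIMARY KEY,
               name    TEXT NOT NULL,
               dept_id INTEGER NOT NULL REFERENCES department(dept_id)
           )"""
    )
    return conn

def try_insert_employee(conn, emp_id, name, dept_id):
    """Return True if the DBMS accepts the row, False if it rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO employee VALUES (?, ?, ?)", (emp_id, name, dept_id)
            )
        return True
    except sqlite3.IntegrityError:
        return False

conn = build_db()
with conn:
    conn.execute("INSERT INTO department VALUES (1, 'Accounting')")

print(try_insert_employee(conn, 10, "Ada", 1))   # department 1 exists
print(try_insert_employee(conn, 11, "Bob", 99))  # department 99 does not
```

The rule is stated once, in the table definition, and the DBMS enforces it for every program that touches the table. None of the "dozens of subprograms" has to remember it.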
A Vicious Cycle
Correcting this sad state of affairs is a nontrivial proposition, because it is a deep-seated, vicious cycle that is cultural and systemic and, therefore, extremely difficult to break. It is much easier (and more profitable) to go with the flow than to swim against it, so the vast majority of trade magazines, books, web sites, conferences, and education programs ignore fundamentals, rely exclusively on vendor sources and focus completely on product-specific "recipes", reinforcing, rather than combating, the cookbook approach.
DBMS and tool vendors, database professionals and users all desire accurate answers from databases. Yet the vast majority are unaware that, as Hugh Darwen states:
- a database is a set of axioms;
- the response to a query is a theorem;
- the process of deriving the theorem from the axioms is a proof;
- a proof is made by manipulating symbols according to agreed mathematical rules.
The proof, of course, can only be as sound and consistent as the rules are. That makes the DBMS a deductive logic system: it derives new facts (query results) from a set of user asserted facts (the database). The derived facts are true (query results are correct) if and only if
- The initial assertions are true
- The derivation rules are (logically) sound and consistent
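The two conditions above can be illustrated with a small sketch in plain Python. It is only an illustration under stated assumptions: the relations `works_in` and `located_in`, their contents, and the function name are all hypothetical, and a natural join stands in for the derivation rule.

```python
# Axioms: user-asserted facts, each relation modeled as a set of tuples.
works_in = {("Ada", "Accounting"), ("Bob", "Shipping")}          # (employee, department)
located_in = {("Accounting", "Boston"), ("Shipping", "Denver")}  # (department, city)

def join_employee_city(works_in, located_in):
    """Derive (employee, city) facts by joining the two relations on department."""
    return {
        (emp, city)
        for emp, dept in works_in
        for dept2, city in located_in
        if dept == dept2
    }

# Theorem: facts derived from true axioms by a sound rule.
derived = join_employee_city(works_in, located_in)
print(sorted(derived))

# Add one false assertion and the same sound rule derives a false "fact".
bad_works_in = works_in | {("Bob", "Accounting")}
print(sorted(join_employee_city(bad_works_in, located_in)))
```

The rule is identical in both runs. Only the truth of the assertions differs, which is why a sound derivation alone cannot guarantee correct answers.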
Nor are most practitioners explicitly aware that the truth of the initial assertions (the correctness of the database) must be ensured by the DBMS's integrity function, and the correctness of the derivations (query executions) by its manipulation function, and that both are possible only if the design of databases and DBMSs alike adheres to sound, fundamental principles of logic.
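A minimal sketch of how the integrity function bears on query results, assuming SQLite through Python's standard sqlite3 module. The two payment tables and the helper functions are hypothetical: one table declares no constraints, the other declares a key and a check constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical payment facts: one table with no constraints, one with a key and a check.
conn.execute("CREATE TABLE payment_nokey (invoice_no INTEGER, amount INTEGER)")
conn.execute(
    """CREATE TABLE payment_keyed (
           invoice_no INTEGER PRIMARY KEY,
           amount     INTEGER NOT NULL CHECK (amount > 0)
       )"""
)

def assert_fact(table, invoice_no, amount):
    """Try to record a fact; return True if the DBMS accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(f"INSERT INTO {table} VALUES (?, ?)", (invoice_no, amount))
        return True
    except sqlite3.IntegrityError:
        return False

def total(table):
    """Derive a new fact (the total paid) from the recorded assertions."""
    return conn.execute(f"SELECT COALESCE(SUM(amount), 0) FROM {table}").fetchone()[0]

for table in ("payment_nokey", "payment_keyed"):
    assert_fact(table, 1001, 250)
    assert_fact(table, 1001, 250)  # the same fact asserted a second time
    assert_fact(table, 1002, -40)  # an assertion that cannot be true

print(total("payment_nokey"))  # 460
print(total("payment_keyed"))  # 250
```

The query is the same for both tables. The unconstrained table accepts the duplicate and the impossible amount, so its derived total is wrong, while the constrained table rejects both before they can reach any query.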
Because they have been socialized into, and rewarded for, discounting fundamental principles as "theory" without practical value, practitioners are largely unaware that the tools they employ, and the practices those tools induce, fail at these functions and, therefore, cannot and do not guarantee the accuracy and efficiency they desire. The result, as I have amply demonstrated and documented in my writings, is that a lot of what is being said, written, and especially done in the database management field--or whatever is left of it--is increasingly confused, irrelevant, misleading, or outright wrong.
About the Author
Fabian Pascal has a national and international reputation as an independent database technology analyst, industry critic, consultant, author and lecturer. For more than 13 years he held various analytical and management positions in the private and public sectors, was affiliated with Codd & Date, has taught and lectured at the business and academic levels, and advised vendor and user organizations on database technology and implementation. He is co-founder and editor of Database Debunkings, a web site dedicated to dispelling prevailing fallacies and misconceptions in the database industry, with C.J. Date as senior contributor. He has contributed extensively to most trade publications, and his third book, Practical Issues in Database Management--A Guide for the Thinking Practitioner, was recently published by Addison Wesley.