[Continued from Part I]
This is Part II of a critique of the regressive trend to application-based data management originating in the Java/object programming world, using quotes from a representative trade article called "How to Store Java Objects."
We describe implementations of persistent Java objects in various database management systems (Oracle 7.2 RDBMS, Poet 5.0 ODBMS, and ObjectStore 2.0 ODBMS). Because technical requirements and constraints are highly domain- and environment-dependent, there is clearly no single persistence strategy that would work equally well for all applications.
If by "persistence strategy" they mean the database technology/product to be employed, the authors expose one of the fundamental flaws of object data management. The whole point of databases is to be application-neutral, in order to serve multiple applications well; DBMSs are general purpose. Object databases, on the other hand, are application-biased, a sort of contradiction in terms. Products sold as ODBMSs are, in fact, application-specific "DBMS building kits", so to speak (Oracle is, of course, a SQL DBMS, not a true RDBMS).
Connecting Java applications to relational database systems requires much more work; enabling access to an RDBMS is only the first. The second step is the mapping of objects to relational schema. This can be accomplished with available commercial tools or with a development of mapping mechanisms between the object model and a relational model. Because our goal was not only to determine performance degradation, but also to evaluate the efforts required to complete the step, we have implemented the mappings ourselves ... A simplified object model is presented in Figure 1 ... The object model in Figure 1 presents only the most important attributes and methods of classes for understanding the problem ...

Figure 1 validates my contention that object orientation fails to provide a data model analogous to the relational model and, thus, is inadequate for data management. Clearly, the "object model" is nothing but an enterprise-specific representation, not a general data model. I dare object proponents to explain how the model in Figure 1 differs from what a conceptual model would be if the relational approach were used, and what the "mapping" the authors refer to would be. Labeling things objects and methods (instead of entities and operations) adds nothing of value, except, perhaps, more confusion: the object world fails to distinguish between an object type/class and an object -- an instance of that type -- a distinction that is fundamental (see Date's and Darwen's The third manifesto).
A true RDBMS with full support of domains (as defined in Chapter 1 of my book) and an object-oriented programming language (OOPL) compatible with it -- be it Java or another -- would reduce, not increase, the effort.
Making an object persistent using JDBC/RDBMS requires much more development work. The first, and the most difficult, step is to map Java objects to relational tables using available SQL data types. This process could be very simple if we had to implement a class that has no inheritance and includes only simple data types (e.g., strings, integers, etc.). However, if a class includes inheritance and complex Java data types (e.g., hashtables, vectors, etc.) the process becomes quite difficult.
Here the fuzziness of, and confusions in, object thinking become rampant. SQL data types are "strings, integers, etc." -- clearly logical. The equivalents in an object environment would be object types or classes. But the Java data types referred to are "hashtables, vectors, etc." -- clearly physical constructs! The logical-physical confusion raises its ugly head again: mapping "Java objects to relational tables" is not a logical endeavor, but rather the exposure of physical implementation details to users, a loss of data independence which brings nothing but trouble (incidentally, while object classes encapsulate, relational tables do not, so referring to mapping the former to the latter is misleading).
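One way to make the logical-physical distinction concrete is the following minimal Java sketch. It assumes a hypothetical relation PAYMENT_ITEM(payment_id, item), which does not appear in the quoted article, and compares the rows obtained when the application happens to hold the same items in a Vector or in a Hashtable. The class and method names are illustrative only.

```java
import java.util.Hashtable;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.Vector;

class ContainerIndependenceSketch {

    // Rows of the hypothetical relation PAYMENT_ITEM(payment_id, item),
    // derived from items held in a Vector.
    static SortedSet<String> rowsFromVector(int paymentId, Vector<String> items) {
        SortedSet<String> rows = new TreeSet<>();
        for (String item : items) {
            rows.add(paymentId + "," + item);
        }
        return rows;
    }

    // The same relation, derived from items held as the keys of a Hashtable.
    static SortedSet<String> rowsFromHashtable(int paymentId, Hashtable<String, Boolean> items) {
        SortedSet<String> rows = new TreeSet<>();
        for (String item : items.keySet()) {
            rows.add(paymentId + "," + item);
        }
        return rows;
    }

    public static void main(String[] args) {
        Vector<String> asVector = new Vector<>();
        asVector.add("book");
        asVector.add("pen");

        Hashtable<String, Boolean> asHashtable = new Hashtable<>();
        asHashtable.put("pen", Boolean.TRUE);
        asHashtable.put("book", Boolean.TRUE);

        // Compare the two row sets; the container type plays no part in the comparison.
        System.out.println(rowsFromVector(1, asVector).equals(rowsFromHashtable(1, asHashtable)));
    }
}
```

Under these assumptions the set of rows depends only on which items belong to the payment, not on the container the program uses to hold them, which is one reading of why the container is a physical rather than a logical matter.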
In our example, making class Payment persistent requires resolution of inheritance in the relational schema. There are at least two possible solutions to the general mapping issue (to store the entire class in a single table and have a new table for each new class or to develop a hierarchy in the database and store class attributes in more than one table -- for each level of inheritance we have a new table), but each requires ease-of-development or performance trade-offs depending on different contexts. For our demonstration we decided to collapse multiple inheritance levels into a single table for each class. The main reason for this design decision is to minimize the database interaction required. Rather than having to retrieve pieces of information from multiple tables, every persistent class in our system is represented by a single primary database table. The main drawback of this decision is that each class must store all the data members it contains plus all the attributes of parent classes. For example, all the objects displayed in a CreditCardPayment derive from the Payment class. Among other things, the base Payment class provides a private data member amount, which simply stores the amount of payment. Using our table, mapping the amount field must be duplicated in different tables used to store all the classes derived from the Payment class. In some cases, additional tables would be necessary to support complex Java objects. In our example, we have simplified the system in a way that no support was needed for complex objects.
There is nothing much to say about this long and tedious paragraph, except to point out how the absence of formal objective design guidelines (such as normalization principles) in the object world makes database design an ad-hoc, arbitrary endeavor, contaminated by physical considerations.
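For readers who want to see what the two mapping options in the quoted paragraph amount to, here is a minimal Java sketch that simply builds the table definitions each option implies. Payment, CreditCardPayment and amount come from the quoted article; the id and card_number columns, the SQL types, and the class and method names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class InheritanceMappingSketch {

    // Option 1: one table per class. The inherited amount column is
    // repeated in the table of every class derived from Payment.
    static List<String> tablePerClass() {
        List<String> ddl = new ArrayList<>();
        ddl.add("CREATE TABLE payment (id INTEGER PRIMARY KEY, amount DECIMAL(12,2) NOT NULL)");
        ddl.add("CREATE TABLE credit_card_payment (id INTEGER PRIMARY KEY, "
                + "amount DECIMAL(12,2) NOT NULL, card_number VARCHAR(19) NOT NULL)");
        return ddl;
    }

    // Option 2: one table per inheritance level. The subclass table holds
    // only its own attributes and references the parent row by key.
    static List<String> tablePerLevel() {
        List<String> ddl = new ArrayList<>();
        ddl.add("CREATE TABLE payment (id INTEGER PRIMARY KEY, amount DECIMAL(12,2) NOT NULL)");
        ddl.add("CREATE TABLE credit_card_payment (id INTEGER PRIMARY KEY REFERENCES payment(id), "
                + "card_number VARCHAR(19) NOT NULL)");
        return ddl;
    }

    // Counts how many table definitions declare the given column name.
    static long countColumn(List<String> ddl, String column) {
        return ddl.stream().filter(s -> s.contains(" " + column + " ")).count();
    }

    public static void main(String[] args) {
        System.out.println("amount declared in (per class): " + countColumn(tablePerClass(), "amount"));
        System.out.println("amount declared in (per level): " + countColumn(tablePerLevel(), "amount"));
    }
}
```

In this sketch the first option declares amount in both tables, while the second declares it once and requires a join on id to assemble a CreditCardPayment. That is the redundancy-versus-retrieval trade-off the quoted authors settle on performance grounds.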
Object and relational technologies are very distinctive. So we have chosen to test performance on the object model level.
If the "object model" is not a physical model (Figure 1 establishes that), how can performance be tested "at the object level"? Besides, rushing to performance tests without any consideration of database functions is perplexing, to say the least.
Building state-of-the-art applications often require that cutting-edge technology be used. The selection of a database system on which to build an organization's applications is a complex process in which diverse and sometimes competing sets of business and technology criteria must be satisfied. Making the right choice is ultimately a strategic business decision that has a profound effect on the near- and long-term success of any project ... Our tests confirmed that only by using object database systems can the full potential of object paradigm be exploited. We exposed performance degradation of relational systems in the case of complex data models.
It is precisely because such decisions have profound effects that they should not be made, as they so often are, without a good grasp of data fundamentals. ODBMSs are not "cutting-edge technology" by any stretch of the imagination. OO has been around for as long as relational technology, and ODBMSs were hyped as early as 1993. To quote "respected analysts": "Although it is now certain that the next generation of databases will be object databases, we cannot predict with any confidence which the dominant products will be ... one thing we can be sure of: They won't be relational at the physical level (sic)." Yet hype notwithstanding, ODBMSs fizzled in the marketplace. In part that is because they are in a certain sense regressive, a throwback to data management with poor data independence and even pointers.
There is nothing in the relational model to limit the kinds and complexity of data types supported; in fact, this is an aspect orthogonal to (independent of) the data model employed. Lack of such support by SQL DBMSs is not a relational weakness: it is due to SQL's failure to implement relational domains. As I explain in Chapter 1 of my Practical issues in database management, inherent in such support are some nontrivial (to put it mildly) implementation complications that no DBMS, ODBMSs included, has any superiority in addressing.
To quote from "The third manifesto," "... we acknowledge the desirability of supporting certain features that are commonly regarded as aspects of object orientation ... the relational model needs no extension, no correction, no subsumption -- and above all, no perversion! -- in order to support [those features]" (emphasis theirs).
About the author
Fabian Pascal has an international reputation as an independent technology analyst, consultant, author and lecturer specializing in data management. He was affiliated with Codd & Date and for more than 15 years held various analytical and management positions in the private and public sectors, has taught and lectured at the business and academic levels, and advised vendor and user organizations on database technology, strategy and implementation. Clients include IBM, Census Bureau, CIA, Apple, Borland, Cognos, UCSF, IRS. He is founder and editor of Database Debunkings, a web site dedicated to dispelling prevailing fallacies and misconceptions in the database industry, where C.J. Date is a senior contributor. He has contributed extensively to most trade publications, including Database Programming and Design, DBMS, DataBased Advisor, Byte, Infoworld and Computerworld. His third book, Practical issues in database management (Addison Wesley, June 2000), serves as text for a seminar bearing the same name. He can be contacted at editor@dbdebunk.com.
For More Information