Last time, we described the most commonly implemented type of database: those based on the relational model. However,...
all DBA's should know about the five alternatives that exist, as reviewed in this excerpt from Ryan Stephens & Ronald Plew's new book Teach Yourself Database Design (Sams, 2000):
Flat-File Database Model
Before vendors such as Oracle and Microsoft started developing database management systems, many companies stored their data in flat files on a host computer. The use of flat files to store data was predominant in the mainframe era. A flat-file database consists of one or more readable files, normally stored in a text format. Information in these files is stored as fields, the fields having either a constant length or a variable length separated by some character (delimiter).
Following is an example of a flat file with constant length fields:
|1234||Ernest Hemingway||For Whom the Bell Tolls|
|5678||Charles Dickens||Great Expectations|
|4321||Ernest Hemingway||A Farewell to Arms|
|4523||Jack London||Call of the Wild|
|3456||Mark Twain||Adventures of Huckleberry Finn|
In this example, there are obviously three fields: an identification number, the author name, and the title of the book. Each field has a constant length because the identification number always starts in column #1 and ends in column #4, the author name starts in column #6 and ends in column #25, and so on.
The following is an example of a flat file, with variable length fields separated by a given delimiter:
1234:Ernest Hemingway:For Whom the Bell Tolls 5678:Charles Dickens:Great Expectations 4321:Ernest Hemingway:A Farewell to Arms 8765:Jack London:White Fang 4523:Jack London:Call of the Wild 3456:Mark Twain:Adventures of Huckleberry Finn
There are also three fields in this example. Each field is separated by a colon. The fields are not a constant length in the example. When using field separators, you should make sure that the field separator is not a character that can be found in the data.
Note - Sometimes flat files are created and used to migrate data from one database implementation to another, particularly with relational databases.
Every flat-file database system is different because companies store different data and companies have different needs. After a flat-file system has been created and the data has been stored in these files, a method must be devised in order to retrieve the data, create new records, update records, or delete records. For instance, if you wanted to get a list of each one of the titles authored by Jack London, you would have to find every record that had an occurrence of "Jack London". In addition to finding each record with the string "Jack London" in the second field, you would want to filter the data so that only the title of the book, which is the third field, is retrieved.
The problem of accessing the data requires a collection of programs to be written that access the information stored in the flat files—unless you expect the user or customer to search the flat files themselves in order to access the data, which would be quite unacceptable. One of the main problems with a flat-file database system is that not only do you have to understand the structure of the files, but also you must know exactly where data is physically stored. Additionally, your database will probably require numerous flat files, which might have data related to other data stored in some other file. Managing data relationships is a difficult task in the flat-file database environment.
The following is an overview of the drawbacks of a flat-file database:
- Flat files do not promote a structure in which data can easily be related.
- It is difficult to manage data effectively and to ensure accuracy.
- It is usually necessary to store redundant data, which causes more work to accurately maintain the data.
- The physical location of the data field within the file must be known.
- A program must be developed to manage the data.
Hierarchical Database Model
A hierarchical database is a step above that of a flat-file database, mainly because of the ability to establish and maintain relationships between groups of data. The architecture of a hierarchical database is based on the concept of parent/child relationships. In a hierarchical database, a root table, or parent table, resides at the top of the structure, which points to child tables containing related data. The structure of a hierarchical database model appears as an inverted tree, as shown in this figure:
In this figure, Publishers is the root table. Publishers has two child tables: Authors and Bookstores. A publisher has many authors to whom it contracts, as well as many bookstores to which it supplies books. Authors is a parent table to the Titles table, as Bookstores is to Inventory. Titles is a child table of Authors, as Inventory and Orders are child tables of Bookstores. One of the problems with this hierarchical layout is that redundant book title information would have to be stored in the Inventory table because there is no direct association between Authors and Bookstores.
Note - As previously mentioned, it is a good practice to store as little redundant data in a database as possible. One of the implications of having redundant data is that data will have to be modified multiple times, depending on the level of redundancy. Processes must be in place to ensure that all appropriate data is updated when necessary to avoid inconsistent data in the database.
A parent table can have many child tables, but a child table can have only one parent table. To get to a child table, the parent table must first be accessed. Related tables within the hierarchical structure are linked by a pointer, which points to the physical location of a child record.
Benefits of the hierarchical model over the flat-file model:
Data can be quickly retrieved.
Data integrity is easier to manage.
Drawbacks of the hierarchical model:
- Users must be very familiar with the database structure.
- Redundant data is stored.
Network Database Model
Improvements were made to the hierarchical database model in order to derive the network model. As in the hierarchical model, tables are related to one another. One of the main advantages of the network model is the capability of parent tables to share relationships with child tables. This means that a child table can have multiple parent tables. Additionally, a user can access data by starting with any table in the structure, navigating either up or down in the tree. The user is not required to access a root table first to get to child tables.
The relationship between tables in the network model is called a set structure, where one table is the owner and another table is a member. This is the same basic concept as the parent/child relationship discussed earlier. Set structures can represent a one-to-many relationship between tables. Application programs that access the network database use set structures to navigate to different parts of the database; therefore if a set structure is modified, application programs that access the database must also be modified. This Figure illustrates set structures:
In this figure, the Publisher table owns two tables: Authors and Bookstores. Authors and Bookstores are each members of the Publisher table. Publishers contract work to authors and supply finished books to bookstores. The set structure between Authors and Publishers is called Contract.
This Figure illustrates how child tables, or members, can be shared by parent tables:
In this figure, the Titles table is owned by both Authors and Bookstores. Both Authors and Bookstores require a relationship with Titles. Although two set structures can be used to access the Titles table, book title information is only stored in one table, thus reducing data redundancy.
The benefits of the network database model are as follows:
- Data is accessed very quickly.
- Users can access data starting with any table.
- It is easier to model more complex databases.
- It is easier to develop complex queries to retrieve data.
The drawbacks of the network database model are as follows:
- The structure of the database is not easily modified.
- Changes to the database structure definitely affect application programs that access the database.
- The user has to understand the structure of the database.
Object-Oriented (OO) Database Model
During the last few years, object-oriented programming has become popular with languages such as C++, Visual Basic, and Java. An OO programming language allows the programmer to work with objects to define an application that interacts with a relational database (since most companies now use the relational database model). For example, elements within a program or database application are visually represented as objects. These objects have assigned properties, which can be modified, and can also be inherited from other objects. Related types of objects are assigned various properties that can be adjusted to define the particular object and determine how the object will act. With these OO programming tools, applications are now easier to develop and maintain. Many mundane programming tasks can be automated by an OO programming tool, thus reducing the amount of time it takes to develop an application, increasing overall productivity.
A problem with the relational database as OO programming technology advances is that developers must understand both the relational database language (SQL) as well as the OO programming language (Java, for example) that is to be used to design the application. It is important for developers to understand relational database concepts in order for the application to access the data. It can be confusing for the developer to switch modes of thinking between relational and OO.
An object-oriented database is a database in which data can be defined, stored, and accessed using an OO programming approach. For an OO database, a select OO programming language is used to define the structure of the database as well as create an application through which to interact with the database.
Note - The object query language is used to manage data in an OO database. The object query language, OQL, is based on the standard relational language SQL, but has additional features that allow data to be stored and accessed as objects with properties.
This Figure illustrates an example of an OO database and its implementation.
The figure shows that the OO database is defined using an OO programming language such as Java. The end-user application is also created using the OO language. The OO code is compiled, and a compatible database and application are produced. An object database management system is used to link the database and the application.
The two basic structures in an OO database are as follows:
Objects are structures that have identifiers through which an object can be associated with other objects. Literals are values associated with objects, and have no identifiers. Objects and literals are organized by types, where all elements of a given type have the same set of properties, which can be modified for each individual object. A class is the equivalent of a table in a relational database. Operations are used to retrieve values from other classes, to add values, and to remove values. This Figure illustrates how data is related in an OO database:
Benefits of the object-oriented model are as follows:
- The programmer need only understand OO concepts as opposed to the combination of OO concepts and relational database storage.
- Objects can inherit property settings from other objects.
- Much of the application program process is automated.
- It is theoretically easier to manage objects.
- OO data model is more compatible with OO programming tools.
Drawbacks of the object-oriented model are as follows:
- Users must learn OO concepts because the OO database does not work with traditional programming methods.
- Standards have not been completely established for the evolving database model.
- Stability is a concern because OO databases have not been around for long.
Object-Relational (OR) Database Model
Although some rough seams exist between the object-oriented and relational models, the object-relational model was developed with the objective of combining the concepts of the relational database model with object-oriented programming style. The OR model is supposed to represent the best of both worlds (relational and OO), although the OR model is still early in development. As we speak, vendors are implementing OR concepts into their relational databases, as the International Standards Organization (ISO) has integrated OR concepts into the new SQL standard, referred to as SQL3. SQL3 is also referred to as SQL99.
This Figure illustrates an example OR implementation in the Oracle8 relational database management system (RDBMS):
Two user-defined types have been created: PERSON and ADDRESS. Each type has columns that define specific data for a column in the base table, providing a 3D effect for the data. For example, the EMP_INFO column in the EMP table has a type of PERSON. PERSON is broken down into the specific categories LAST_NAME, FIRST_NAME, MID_INIT, and SSN.
- The relational database has more of a 3D architecture.
- User-defined types can be created.
Drawbacks of the object-relational model are as follows:
- The user must understand both object-oriented and relational concepts.
- Some vendors that have implemented OR concepts do not support object inheritance.
For More Information
- You can purchase Teach Yourself Database Design (Sams, 2000) here.
- The Best Database Design Web Links: tips, tutorials, scripts, and more.
- Have a database design to offer your fellow DBA's? The best tips submitted will receive a cool prize--submit your tip today!
- Ask your technical database design questions--or help out your peers by answering them--in our live discussion forums.
- Check out our new Ask the Experts feature: Our database design gurus are waiting to answer your toughest questions.