The definition of a good database is relative to the requirements of each customer because every situation is different....
A good database is determined as seen through the eyes of the customer, the end user, the database administration team, and management. If all parties are happy with the database, the allocation of resources to design a new database might be unnecessary. On the other hand, if some of the parties involved are unhappy about a few things, it might be worthwhile to begin designing a new database. Keep in mind that there might not be an existing database in place. If a database is not currently in use, it is still important to understand the key principles of a "good" database before thinking about designing one.
Although there are many hallmarks of a good database, this excerpt from Teach Yourself Database Design (Sams, 2000) discusses the most common, including:
More on Oracle databases
Learn about Oracle can help with database consolidation
Read up on how to get better database performance with Oracle
Understand whether SAD databases can compete with Oracle
- Data storage needs having been met
- Data is readily available to the end-user
- Data being protected through database security
- Data being accurate and easy to manage
- Overall database performance being acceptable
- Having a minimized amount of redundant data stored
Storage Needs Met
The foremost objective of a database is to store data. In order to determine if data storage is adequate, the following questions might be of use:
- Have all storage needs been met for the database?
- Has all data been stored effectively?
- Is the database model used capable of handling the complexity of business relationships?
- Is the database model used capable of handling the estimated volume of data for the proposed database?
- Is the hardware adequate for storage needs?
- Does the database software meet the storage needs?
- What data is stored offline as opposed to online storage?
- How easy is it to access offline data storage?
- Has all unnecessary online data been purged or archived into an offline storage device?
Some of the factors that affect data storage include the design of the database, the database software, the hardware and operating system on which the database runs, and the types of data stored. As you will learn throughout the book, it is imperative to carefully plan the design of the database so that all storage needs are met. A database software program must be used to implement a database that has been designed. Some vendors provide features that others do not with their database software. Although the hardware and disk space on which the database runs may have been adequate initially, the database may have grown beyond the physical limits of the hardware. Finally, the type of data stored must be considered. Online data is data that is readily available to the end user and is stored in the database. Offline data is data that is archived and not stored in the database. If no data is ever archived, it can appear that storage needs are not being met.
Data Is Available
What good is a database if data is not readily available to the end-user? Data must be available as requested by the end-user, during all hours of business operation. Many organizations are referred to as 24X7 shops. This means that data must be available twenty-four hours a day, seven days a week, to satisfy the needs of many groups of users that have a need to potentially access the database from many sites in different time zones around the world. Database down time is often necessary, but must be scheduled around hours of peak user activity. Database down time should not occur if it has not been scheduled. However, down time sometimes occurs from time to time based on factors that are related to the design of the database. If data is not available when expected, then the database is not fulfilling its purpose.
Data availability is also related to the user's expected and perceived performance of the database. Consider this: A user starts an application and performs some function that requests data. Depending on the nature of the request, it may take several minutes or seconds for data to become available, or it might appear instantly on the screen. In the production environment, perception is reality, and if a user feels an application or database is slow, it will create the perception of poor performance. A database environment may be simply stellar in its design and implementation, but expectations might overtake the possible realistic performance of the database and application. In some cases, small changes to the application may help, assuming the database itself has been well-designed. For example, an OLE database call into a Visual Basic record set object may take several seconds. When the screen is painted, an additional few seconds is required. Suppose the screen is painted while the record set is being retrieved from the database. The few seconds it takes to paint the screen so that the screen from the application and the data from the database pop together may be worth it. In this example, the users perceived the database to be slow. A simple change in perception solved the issue.
Data Is Protected
After the data is stored in the database, it is important to ensure that the data is well protected. Database security should be established to protect the data from unauthorized users. Some questions that might be asked to determine how well the data is protected include the following:
- Does security exist in the database?
- Is the data protected from outside users?
- Is the data protected from internal users?
- How easy is it for unauthorized users to access the data?
- Have there been any security breaches since the database implementation?
- How easy is it to limit the access to various groups of users within the scope of the database?
- How easy is it to grant and revoke data access to various groups of users?
Without database security, the database can easily become corrupted, whether intentional or not. It is important to restrict access to the database from individuals not requiring access. As a general rule, you should be able to limit the access to the data at a very low level. Many databases are designed with little or no security. Although a lack of security does not justify the complete redesign of a database, it is a definite setback and can cause problems that leave decision makers with the impression that they have a poor database.
Data Is Accurate
Suppose that storage is sufficient and security is established. Security will stop unauthorized users from entering the database, but what is there to protect the data from users with access? Mainly, constraints should be used to control the allowed data. Constraints are normally applied at the column level, verifying data entered for a particular column. Constraints, however, cannot completely eliminate the chance for inaccurate data. It is the main responsibility of the end user to properly manage the data that is entered.
The following questions will help determine if precautions have been taken to ensure that data is as accurate as possible:
- Has referential integrity been applied (primary key and foreign key constraints)?
- What other constraints have been established to check the uniqueness or validity of data?
- Are data relationships easily maintained within the database?
- How easy is it for the end user to enter invalid data into the database?
- Are there edits and constraints in the application interface that supplement those in the database?
- Have code tables been established for common data?
Is data consistent within the database? Consistency is related, but not the same as accurate data. Users should be forced to enter data in a consistent manner. A database becomes corrupted if the end user is allowed to enter data in a fashion that is inconsistent with the way other users are entering data. This affects the integrity of the data.
Performance is a major issue for any database. All parties are worried about performance, from the end user to the database administrator to management. If the end user is not happy, the customer is not happy because the job is not getting done. If the customer is not happy, management is not happy. When management is not happy, nobody is happysort of like "when Mom isn't happy, nobody's happy." The following questions can be raised to determine whether database performance is acceptable:
- What is the expected response time for transactions and small queries?
- How does the database perform overall according to the end user?
- How does the database perform during high peak times of transactional activity?
- How does the database perform during batch operations, such as massive data loads and queries against large amounts of data?
- If performance problems exist, are these problems related to the design of the database, the application interface, network problems, or hardware?
There is often a grand difference between performance as expected by the end user versus realistic performance. Users often expect instantaneous response time. Usually, an acceptable response time to retrieve a record from the database or to perform a relatively small transaction is somewhere between one and five seconds. This type of response time is often obtainable, depending on factors such as the design of the database, the design of the application, the speed of the hardware, and the speed of the network. If performance is not acceptable to the customer and end user, it is important to evaluate performance carefully to identify the point of degradation. Also, keep in mind that more time must be allowed for batch loads or queries against large amounts of data because more data is being processed.
Redundant Data Is Minimized
One of the main goals when storing data in a database is to reduce or eliminate redundant information. Data should be stored one time in the database if possible. If an occurrence of data is stored multiple times in the database, the data must be updated in many places when changes are required. Likewise, the data might have to be added or removed in multiple places. Redundant data can lead to both inconsistent and inaccurate data. Inaccurate data is caused in the sense that the data might be wrong altogether when entered in a duplicate location. Inconsistent data might be derived in the sense that the data might be entered in different ways in different locations.
For example, if an address is stored in multiple locations, the chances increase for an end user to enter data inconsistently according to other data stored in the database.
address entered as "123 Elm Street"
address entered as "123 ELM ST."
There are two things to consider in this example. First, street was abbreviated in the second location. Second, the entire address was entered in uppercase in the second location, where it was entered in mixed case in the first location. With these two simple inconsistencies, it would be more difficult to compare the two values than if the address was stored only one time in the database. Data inconsistency because of redundant data can cause great confusion to the end user when retrieving and modifying data; thus, a major concern when designing a database. Redundant data is minimized through a process called normalization.
For More Information
- You can purchase Teach Yourself Database Design (Sams, 2000) here.
- What do you think about this tip? E-mail us at [email protected] with your feedback.
- Have a database design to offer your fellow DBA's? The best tips submitted will receive a cool prize--submit your tip today!
- Ask your technical database design questions--or help out your peers by answering them--in our live discussion forums.
- Check out our Ask the Experts feature: Our database design gurus are waiting to answer your toughest questions.