So, how do we monitor and catch these things? First of all, you need good performance management tools. It is a good idea to acquire tools that manage the performance of the DBMS itself, the database objects created within the DBMS, and the SQL and application code written against the databases. These are usually three different toolsets, but some vendors (such as BMC Software) provide an integrated interface across all three.
Each facet of database performance management is important. To deliver performance, the DBA must be able to monitor and tune each of these components. This is easier said than done.
The system (or DBMS) component consists of the system software and hardware required for the application to provide service. This includes the computer itself, its disk subsystems, network connections and all peripherals. From a software perspective, the system includes the operating system, the file system, the DBMS itself, networking protocols and any related middleware such as transaction processors or message queues.
To deliver system performance, the DBA must have the resources to monitor, manage and optimize the performance of these disparate pieces of hardware and software. Some of the tasks required for system tuning include the proper allocation and management of memory structures (e.g., buffer pools, program cache area and so on), storage management, integration of the DBMS with other system software, proper usage of database logs and coordination of the operating system resources used by the DBMS. Additionally, the DBA must control the installation, configuration and migration of the DBMS software. If the system is not performing properly, everything that uses the system will perform poorly. In other words, a poorly performing system impacts every database application.
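One of those memory-management tasks can be sketched in miniature. Here SQLite's page cache stands in for a server DBMS buffer pool, and the 8 MiB figure is an illustrative assumption, not a tuning recommendation:

```python
import sqlite3

# Sketch: adjusting a DBMS memory structure. SQLite's page cache is a
# stand-in for a buffer pool; a real server DBMS exposes analogous knobs
# (buffer pool sizes, sort heaps, program cache areas).
conn = sqlite3.connect(":memory:")

# In SQLite, a negative cache_size means "use this many KiB of memory."
conn.execute("PRAGMA cache_size = -8192")   # roughly an 8 MiB page cache

# Read the setting back to confirm the allocation took effect.
(cache_pages,) = conn.execute("PRAGMA cache_size").fetchone()
print(cache_pages)   # -8192
```

The point is not the specific number but the practice: the DBA sets these memory allocations deliberately and verifies them, rather than accepting defaults.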
The second component is the database. The database stores the data that is used by the application. When the application needs to access data, it does so through the DBMS. If the database is not optimally organized or stored, the data it contains will be difficult or slow to access, and the performance of every application that requires this data will suffer.
Over time, as data is modified and updated, the DBMS may have to move the data around within the database. Such activity causes the data to become fragmented and inefficiently ordered. The longer the database remains online and the more changes made to the data, the more inefficient database access can become. To overcome disorganized and fragmented databases, the DBA can run a reorganization utility to refresh the data and make the database efficient once again. But the key to successful reorganization is to reorganize only when the database requires it. Some companies instead over-reorganize, scheduling regular reorganization jobs to run whether the database is fragmented or not, which wastes valuable CPU cycles.
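The reorganize-only-when-needed policy can be sketched with SQLite standing in for an enterprise DBMS: its VACUUM command plays the role of a reorganization utility, freelist pages play the role of fragmentation, and the 10% threshold is an assumed policy, not vendor guidance:

```python
import os
import sqlite3
import tempfile

# Build a small database, then delete most rows to leave free pages behind.
path = os.path.join(tempfile.mkdtemp(), "frag.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO t (payload) VALUES (?)", [("x" * 500,)] * 2000)
conn.commit()
conn.execute("DELETE FROM t WHERE id > 200")   # frees most of the pages
conn.commit()

# Measure fragmentation before deciding to reorganize.
(free,) = conn.execute("PRAGMA freelist_count").fetchone()
(total,) = conn.execute("PRAGMA page_count").fetchone()

if total and free / total > 0.10:   # reorganize only past an assumed threshold
    conn.execute("VACUUM")          # rebuilds the file compactly

(free_after,) = conn.execute("PRAGMA freelist_count").fetchone()
print(free_after)   # 0 once the rebuild has reclaimed the free pages
```

A scheduled job that measures first and reorganizes second spends CPU only when the database actually needs the work.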
But reorganization is only one of many database performance tasks performed by the DBA. Others include data set placement, partitioning for parallel access, managing free space and assuring optimal compression.
The third, and final, component of database performance is the application itself. Indeed, as much as 80% of all database performance problems are caused by inefficient application code. The application code consists of two parts: the SQL code and the host language code in which the SQL is embedded.
SQL is simple to learn and easy to start using. But SQL tuning and optimization is an art that takes years to master. Every DBMS provides a method of inspecting the actual access paths that will be used to satisfy SQL requests. The DBA must be an expert at understanding the different types of access paths, as well as which ones are best in which situation. Furthermore, the DBA must be able to interpret the output of the access path explanation produced by the DBMS, since it is often encoded and cryptic.
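As a small illustration of such inspection, SQLite's EXPLAIN QUERY PLAN can stand in for DB2's EXPLAIN or Oracle's EXPLAIN PLAN; the table and column names are made up for the example:

```python
import sqlite3

# Sketch: asking the DBMS which access path it will use for a query.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (empno INTEGER PRIMARY KEY, deptno INTEGER, name TEXT)"
)

# With no index on deptno, the only available access path is a table scan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM emp WHERE deptno = 10"
).fetchall()

# Each plan row ends with a terse "detail" string the DBA must interpret,
# e.g. "SCAN emp" -- meaning every row will be read to answer the query.
plan_detail = plan[0][3]
print(plan_detail)
```

The output is exactly the kind of cryptic, encoded explanation the DBA must learn to read: a single word like SCAN versus SEARCH can be the difference between an acceptable query and a runaway one.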
Host language code refers to the application programs written in C, COBOL, Java, Visual Basic or the programming language du jour. It is quite possible to have finely tuned SQL embedded inside inefficient host language code. And, of course, that would cause a performance problem.
One more prerequisite: make sure you have a firm definition of the term "database performance." Five factors influence it: workload, throughput, resources, optimization and contention.
The workload that is requested of the DBMS defines the demand. It is a combination of online transactions, batch jobs, ad hoc queries, data warehousing analysis and system commands directed through the system at any given time. Workload can fluctuate drastically from day to day, hour to hour and even minute to minute. Sometimes workload can be predicted (such as heavy month-end processing of payroll or very light access after 7:00 p.m., when most users have left for the day), but at other times it is unpredictable. The overall workload has a major impact on database performance.
Throughput defines the overall capability of the computer to process data. It is a composite of I/O speed, CPU speed, the parallel capabilities of the machine, and the efficiency of the operating system and system software. The hardware and software tools at the system's disposal are known as its resources; examples include the database kernel, disk space, cache controllers and microcode.
The fourth defining element of database performance is optimization. All types of systems can be optimized, but relational databases are unique in that query optimization is primarily accomplished internal to the DBMS. However, there are many other factors that need to be optimized (SQL formulation, database parameters, database organization and so on) to enable the database optimizer to create the most efficient access paths.
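How those other factors steer the optimizer can be shown concretely: changing the database design (here, adding an index) changes the access path the optimizer chooses. SQLite again stands in for a full DBMS, and the schema is illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (empno INTEGER PRIMARY KEY, deptno INTEGER, name TEXT)"
)
query = "EXPLAIN QUERY PLAN SELECT name FROM emp WHERE deptno = 10"

# Before: no index on deptno, so the optimizer can only scan the table.
before = conn.execute(query).fetchall()[0][3]
print(before)   # e.g. "SCAN emp"

# Change a factor the optimizer depends on: the database organization.
conn.execute("CREATE INDEX emp_dept_ix ON emp (deptno)")

# After: the same query, but the optimizer now picks an index access path.
after = conn.execute(query).fetchall()[0][3]
print(after)    # e.g. "SEARCH emp USING INDEX emp_dept_ix (deptno=?)"
```

The SQL text never changed; the optimizer simply had a better option to choose from. That is why database parameters and organization are tuning targets in their own right.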
When the demand (workload) for a particular resource is high, contention can result. Contention is the condition in which two or more components of the workload are attempting to use a single resource in a conflicting way (for example, dual updates to the same piece of data). As contention increases, throughput decreases.
Therefore, database performance can be defined as the optimization of resource use to increase throughput and minimize contention, enabling the largest possible workload to be processed.
Finally, I do not advocate managing database performance in a vacuum. Applications and other system software regularly communicate with database systems, so all components of the IT infrastructure must be factored into your organization's performance planning. But it is wise to limit the DBA's actual tuning responsibility to the scope of this definition: if a task is not described above, it probably requires expertise outside database administration. Performance management tasks not covered here should therefore be handled by someone other than the DBA -- or, at minimum, shared between the DBA and other technicians.
This was first published in February 2005