The following is the sixth part of a 12-part series on Oracle10g CBO internals and SQL tuning optimization. Each tip is excerpted from the not-yet-released Rampant TechPress book, "Oracle SQL and index internals," by Kimberly Floss. Check back to the main series page for upcoming installments.
Know thy database
When writing efficient SQL, it is imperative to know the demographics and shape of the objects your code will bump up against. For most databases, all the information you will ever need can be found in the data dictionary. But when querying the dictionary for object statistics, make sure you are looking at accurate information. While some databases such as SQL Server 2000 have automatic updating of object statistics into the dictionary, other RDBMS engines require you to manually (and periodically) refresh the database catalog with up-to-date object data. Fortunately, most make it pretty easy to accomplish. Oracle, for example, now offers packages to assist with the updating of objects. The dbms_utility package contains several procedures to help database professionals update their schema objects. To update the dictionary for a single schema, you can use the dbms_utility.analyze_schema procedure. Recently introduced is the dbms_utility.analyze_database. Just be careful when executing such a procedure up against a monolithic database like Oracle's applications!
However you choose to update your objects, you should make a practice of keeping data in the dictionary current, especially for databases that are very dynamic in nature. Scheduling the object updates in a nightly maintenance plan for such a database would likely be a good thing to do, especially if you use a cost-based optimizer in your RDBMS. Fresh statistics help the optimizer make more informed choices of what path to follow when routing queries to the data. Obviously, if your database thinks you only have 100 rows in a table that actually contains a million, the map used by the optimizer might not be the right one and your response times will show it.
So let's say you know (at least now) to keep your object statistics up to date. When tuning SQL, what are some things you should be looking for with respect to the objects used in your query to help make the right coding choices? Although this list is certainly not exhaustive, for tables you can start by eyeballing these items:
- Row counts. No heavy explanation needed for why you should be looking at this statistic. You will want to avoid full scans on beefy tables with large row counts. Proper index placement becomes quite important on such tables. Other reasons for reviewing row counts include physical redesign decisions. Perhaps a table has grown larger than anticipated and is now eligible for partitioning? Scanning a single partition in a table is a lot less work than running through the entire table.
- Chained row counts. Row chaining and migration can be a thorn in the side of an otherwise wellwritten SQL statement. Chained rows are usually the result of invalid page or block size choices (rows for a wide table will not fit on a single page or block). Migration is caused when a row expands beyond the remaining size boundary of the original block it was placed into. The database is forced to move the row to another block and leaves a pointer behind to indicate its new location. While chaining and migration are different, they have one thing in common: extra I/O is needed to retrieve the row that is either chained or migrated. Knowing how many chained or migrated rows your table has can help you determine if an object reorganization (or in extreme cases, a database rebuild with a larger block size) is needed.
- Space extents. For some databases, objects that have moved into multiple extents can be slower to access than same-size objects that are contained within a single contiguous extent of space. Later versions of Oracle, however, do not suffer from this multi-extent problem anymore, especially when objects are placed into new, locally managed tablespaces.
- High water marks. Tables that experience significant insert and delete activity can be special problem children. Oracle will always scan up to a table's "high water mark," which is the last block of space it "thinks" there is data. For example, a table that previously contained a million rows but now only has 100 rows may be scanned like it still has a million. You can determine if you need to reload a table (usually done by a reorg or truncate and load) by checking the high water marks of tables to see if they still are set to abnormally high values.
- Miscellaneous properties. There are several other performance-boosting properties that you may want to set for tables. For instance, large tables that are being scanned may benefit from having parallelism enabled so the table can be scanned (hopefully) much quicker. Small lookup tables may benefit from being permanently cached in memory to speed access times. In Oracle, this can be done by placing them into the KEEP buffer pool. The cache parameter can also be used, although it is not as permanent a fix as the KEEP buffer pool option.
- Selectivity/unique keys. Indexes by their nature normally work best when selectivity is high; in other words, the numbers of unique values are many. The exception to this rule is the bitmap index, which is designed to work on columns with very low cardinality (such as a Yes/No column). The selectivity of indexes should be examined periodically to see if an index that previously contained many unique values is now one that is losing its uniqueness rank. Complete unique-only indexes, of course, will not have this problem.
- Depth. The tree depth of an index will tell you if the index has undergone a lot of splits and other destructive activity. Typically, indexes with tree depths greater than three or four are good candidates for rebuilds, an activity that hopefully will improve access speed.
- Deleted row counts. Indexes that suffer from high parent table maintenance may contain a lot of "dead air" in the form of deleted rows in the leaf pages. Again, a rebuild may be in order for indexes with high counts of deleted leaf rows.
Understanding the current state and shape of the objects being used in the queries that you are trying to tune can unlock clues as to how you may want to restructure your SQL code. For example, you may realize that you have not indexed critical foreign keys that are used over and over again in various sets of join operations. Or, you might spot that your millionrow table is a perfect candidate for a bitmap index given the current WHERE predicate.
About the author
Kimberly Floss is one of the most-respected Oracle database administrators in the U.S., and is president of the International Oracle Users Group (IOUG). With more than a decade of experience, Kimberly specializes in Oracle performance tuning and is a respected expert in SQL tuning techniques. She is an active member of the Chicago Oracle Users Group, and the Midwest Oracle Users Group, in addition to the IOUG. Kimberly Floss has over 15 years of experience in the information technology industry, with specific focus on relational database technology, including Oracle, DB2, Microsoft SQL Server and Sybase. She holds a bachelor's of science degree in computer information systems from Purdue University, specializing in systems analysis and design, and has an MBA with emphasis in management information systems from Loyola University.