Oracle9i introduced the new dynamic sampling method for gathering run-time schema statistics, and it is now enabled by default in Oracle10g. But you may want to turn it off to remove unnecessary overhead in some circumstances. This excerpt, taken from Oracle Silver Bullets by Donald K. Burleson, explains what those situations are. Click here to view the entire chapter.
Inside Oracle10g Dynamic Sampling
One of the greatest problems with the Oracle CBO was the failure of the Oracle DBA to gather accurate schema statistics. Even with the dbms_stats package, the schema statistics were often stale and the DBA did not always create histograms for skewed data columns and data columns that are used to estimate the size of SQL intermediate result sets.
This resulted in a bum rap for Oracle's CBO, and beginner DBA's often falsely accused the CBO of failing to generate optimal execution plans when the real cause of the sub-optimal execution plan was the DBA's failure to collect complete schema statistics.
Using the enhanced dbms_stats package, Oracle will automatically estimate the sample size, detect skewed columns that would benefit from histograms, and refresh the schema statistics when they become stale. This automates a very important DBA task and ensures that Oracle always has the statistics that it needs to make good execution plan choices.
However, there was always a nagging problem with the CBO. Even with good statistics, the CBO would sometimes determine a sub-optimal table-join order causing unnecessarily large intermediate result sets.
Even with the best schema statistics, it can be impossible to predict the optimal table join order, the table join order that has the smallest intermediate baggage. As can be expected, reducing the size of the intermediate row sets can greatly improve the speed of the query.
If one were to assume that there is a three-way table join against tables that all contain over 10,000 rows each. This database has 50,000 student rows, 10,000 course rows and 5,000 professor rows as shown in Figure 10.2.
If the number of rows in the table determined the best table join order, the expectation would be that any 3-way table join would start by joining the professor and course tables, and then join the RESULT set to the student table.
However, whenever there is a WHERE clause, the total number of rows in each table does not matter if index access is being used. The following is the query:
select student_name from professor natural join course natural join student where professor = 'jones' and course = 'anthropology 610'; Stan Nowakowski Bob Crane James Bakke Patty O'Furniture 4 Rows selected.Despite the huge numbers of rows in each table, the final result set will only be four rows. If the CBO can predict the size of the final result, sampling techniques can be used to examine the WHERE clause of the query and determine which two tables should be joined first.