Problem solve Get help with specific problems with your technologies, process and projects.

Improving performance with histograms

Improve performance using histograms.

When dealing with a table that has highly skewed data distribution, histograms are an excellent way to improve performance by getting the optimizer to choose your index.

Recently I was tuning some SQL that used a column to flag whether a row should be processed or not. For example the table had a column called JOB_ID, 99% of the million rows in the table had a value of 0, and the other 1% of the rows had a JOB_ID of the PID of the job processing the data. There was an index built on JOB_ID, but the optimizer wouldn't use the index, because the cardinality of the index was so poor. In this case a histogram helped to get the optimizer to use the index. Histograms are created using the "FOR COLUMNS" option of the analyze command. For example, I used:

  ANALYZE TABLE INVENTORY_COST ESTIMATE STATISTICS SAMPLE 20 PERCENT FOR
 COLUMNS JOB_ID SIZE 10;

Determining where histograms are used in your database is a little tricky because Oracle considers all cost-based statistics as histograms, so if you look in DBA_HISTOGRAMS, you will find 2 rows for every table in the database. I have been using the following query to determine what table have histograms:

  select 
    distinct TABLE_NAME, COLUMN_NAME 
  from 
    dba_histograms 
  where 
    endpoint_number not in (1,0) 
  order by table_name, column_name ;

Follow-up note

After this tip was published, member Zach Friese offered this feedback: "Histograms are indeed very useful for addressing the problem of getting the optimizer to use a low cardinality index when the data distribution is skewed. However, this tip neglected a very important detail: the fact that histograms can only be used by the optimizer when literal variables are used in the sql, and not when bound variables are used. Given that many DBA's go to great lengths to encourage the use of bound variables due to the benefits of sharing sql in the shared pool, they also need to be aware of this one time when bound variables are not your friends.

About the Author

James Giordano is an Oracle database administrator. He has been working with Oracle for about seven years, and also has experience with UNIX and PeopleSoft/Oracle financials.

For More Information

  • What do you think about this tip? E-mail us at editor@searchDatabase.com with your feedback.
  • The Best SQL Web Links: tips, tutorials, scripts, and more.
  • Have an SQL tip to offer your fellow DBA's? The best tips submitted will receive a cool prize--submit your tip today!
  • Ask your technical SQL questions--or help out your peers by answering them--in our live discussion forums.
  • Check out our Ask the Experts feature: Our SQL, Database Design, Oracle, SQL Server, and DB2 gurus are waiting to answer your toughest questions.

Dig Deeper on Oracle database performance problems and tuning

Start the conversation

Send me notifications when other members comment.

Please create a username to comment.

-ADS BY GOOGLE

SearchDataManagement

SearchBusinessAnalytics

SearchSAP

SearchSQLServer

TheServerSide.com

SearchDataCenter

SearchContentManagement

SearchHRSoftware

Close