I have a critical performance issue caused by the large volume of data for one specific customer. We have a table, Cust-orders, and a customer, XYZ, with more than 50 lakh (5 million) orders stored in that table. Besides XYZ, 10,000 other customers have orders in the table, but none of them has more than 5,000 orders.
The problem is that whenever customer XYZ's data is fetched, the query takes about half an hour to execute, whereas it returns in seconds for every other customer. I have analyzed and rebuilt the indexes, and I have ordered the tables and WHERE conditions in my query according to the usual tuning rules of thumb, but the problem persists.
Is there any way to handle this kind of scenario? How can I enhance the performance of the query so it takes less time while fetching the data for Customer XYZ?
You don't say so explicitly, but I am assuming that you have an index on the table's customer ID column.
First of all, are you using the DBMS_STATS package to gather statistics on your tables? If not, you should be. The ANALYZE TABLE command does not collect histogram statistics for all columns. Without proper column statistics, the optimizer cannot tell that one value of an indexed column accounts for a high proportion of the rows, so it may use the index to retrieve data for that value when a table scan or another access path would be more efficient. In Oracle 10g, statistics gathering is automated unless it has been disabled. If it is not running automatically, use DBMS_STATS.GATHER_SCHEMA_STATS to collect statistics on all of your schema's tables and indexes. Make sure the method_opt parameter is set to 'for all columns size auto' so that Oracle collects histogram statistics on your columns.
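As a rough sketch of what that call might look like from SQL*Plus, assuming you are connected as the owner of the schema and that the orders table is actually named CUST_ORDERS with a CUSTOMER_ID column (both names are placeholders, since the question does not give the exact identifiers):

```sql
-- Gather statistics for every table and index in the current schema.
-- ownname => USER assumes you are connected as the schema owner.
BEGIN
  DBMS_STATS.GATHER_SCHEMA_STATS(
    ownname    => USER,
    method_opt => 'FOR ALL COLUMNS SIZE AUTO',  -- let Oracle decide which columns get histograms
    cascade    => TRUE                          -- gather index statistics as well
  );
END;
/

-- Check whether a histogram now exists on the customer ID column.
-- CUST_ORDERS and CUSTOMER_ID are assumed names, not taken from the question.
SELECT column_name, num_distinct, num_buckets, histogram
FROM   user_tab_col_statistics
WHERE  table_name  = 'CUST_ORDERS'
AND    column_name = 'CUSTOMER_ID';
```

If the HISTOGRAM column comes back as NONE for the customer ID column, the optimizer has no information about how skewed the data is and will cost customer XYZ like any other customer.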
If your statistics are good, then start a SQL trace of your session using the following ALTER SESSION command, which starts a Level 8 trace so that wait events are captured.
alter session set events '10046 trace name context forever, level 8';
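A couple of optional statements around that command can make the trace easier to work with. This is only a sketch: the identifier string 'xyz_orders' is an arbitrary label chosen for illustration, and querying v$parameter assumes your account has been granted access to that view.

```sql
-- Before starting the trace: tag the trace file name so it is easy to spot later.
-- 'xyz_orders' is an arbitrary label, not anything Oracle requires.
ALTER SESSION SET tracefile_identifier = 'xyz_orders';

-- Find the directory the trace file will be written to (the UDUMP directory).
SELECT value
FROM   v$parameter
WHERE  name = 'user_dump_dest';

-- ... start the trace and run the queries under test here ...

-- When finished: switch the trace off without disconnecting.
ALTER SESSION SET EVENTS '10046 trace name context off';

-- Then, from the operating system prompt on the database server (not SQL*Plus),
-- format the trace file; the file names and credentials are placeholders:
--   tkprof <tracefile>.trc report.txt explain=<user>/<password> sys=no
```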
Next, run the query that selects customer XYZ, along with queries that select other customers. After you have run these queries, disconnect from the database or turn off the trace, and go to the database's UDUMP directory on the database server. Find the trace file you just created, then run the TKPROF utility with the EXPLAIN option to turn that trace file into an easier-to-read report.
What access path are the queries taking? Are they the same? Also look at the wait events for each query to see which one accounts for the most waiting time. The access path, together with the wait event information, should let you determine a solution.
If all the queries are using the same customer ID index, then my first guess would be that the problem occurs because an index is being used to select customer XYZ when a table scan or another access method might be more efficient. If so, try using a hint in this query to force the optimizer to use a table scan and see if that helps.
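A minimal sketch of such a hinted query follows. It assumes the table is named CUST_ORDERS, the indexed column is CUSTOMER_ID, and the customer is identified by the literal 'XYZ'. All three are placeholders, because the actual query is not shown in the question.

```sql
-- Ask the optimizer for a full table scan on the orders table.
-- The hint must name the alias (o) because the query uses one.
EXPLAIN PLAN FOR
SELECT /*+ FULL(o) */ o.*
FROM   cust_orders o            -- assumed table name
WHERE  o.customer_id = 'XYZ';   -- assumed column name and value

-- Display the plan to confirm the access path changed to TABLE ACCESS FULL.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Comparing the run time of the hinted query against the original for customer XYZ should show whether the index access path is the real culprit. A hint like this is best treated as a diagnostic step; once the histogram statistics are in place, the optimizer may choose the table scan for XYZ without a hint.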
This was first published in June 2007