|
First, I applaud you for your efforts! "Bad" SQL is the predominant cause
of performance problems in most databases. Second, I'd suggest to your
development team that there is NEVER one way to write a SQL statement and it
is imperative that statements be written and tested in various ways to
insure optimal performance. You've obviously found a better way to write
the statement so that performance is significantly enhanced. Why argue with
success?!?!?
Now, as to understanding why the difference is so great, you have to
understand a bit about Oracle's optimizer and how it chooses an execution
path/plan for your SQL statement. The optimizer looks at various scenarios
for how to accomplish what your SQL statement is requesting and costs each
one. This cost is basically a number that the optimzer uses to determine
how expensive a particular execution path is. The higher the number, the
less likely the optimizer is to choose that path for executing your
statement. Out of all possible scenarios it reviews, the optimizer will
choose the path which has the lowest cost.
What you give the optimizer to work with is the key. If you pass the
optimizer a poorly constructed SQL statement, then you handicap your efforts
right from the start. In your case, let's look at the two statements and
see why one would be more costly (and thus take more time to execute) than
the other.
What the first statement
is telling the optimizer to do in order to resolve the
statement is this:
For each row that is being deleted, issue the select statement (in parens)
and see if it returns an answer. In other words, if the select returns 1 or
more rows, then it exists and the condition of the where clause is met so
the row should be deleted. Imagine this occuring where the table you are
deleting from has 1000 rows. For each row individually, the sub-select in
the where clause must be executed (that's 1000 executions). Now imagine if
the table had 1 million rows! Ouch! The interior query will ALWAYS return
the same answer the way it is written, so why in the world would you want to
execute it more than once? Even if the optimizer can execute the interior
select very very fast, it still has to execute it numerous times... that's
bad for performance.
The reason why your rewritten statement executes so much faster than the first is that
the interior select only executes once. The answer to the sub-select is
returned once at the beginning, and the values it returns are compared to the
value in the column of the table from which you wish to delete. In the case
of 1000 rows, you just eliminated 999 executions of the interior
select... in the case of 1 million rows, you eliminate 999,999 executions
and so on.
Now to determine what the optimizer did with each, you can use EXPLAIN PLAN
or AUTOTRACE in SQL*Plus. For AUTOTRACE, simply issue the command SET
AUTOTRACE ON and then execute the statement. You will see the execution
path the optimizer chose as well as the cost it assigned to the statement.
Do this for both statements and you should see the difference. A word of
caution... just because you see a glaring difference in execution time for
the two statements doesn't mean that there will be a great cost difference.
One could be costed at 10 and the other at 12. Who knows? The point is
that you should concentrate on what the optimizer shows you it has to do in
order to resolve the query.
I'd suggest that you get a good SQL tuning book and have your developers
read it and/or keep it handy so that you can have more ready information the
next time you need to re-write a statement. Some good options are books by
Guy Harrison (Oracle SQL High-Performance Tuning) and Don Burleson (High
Performance SQL Tuning). There are more out there, but I've found both of
these to be top-notch.
|