The following is an excerpt from Chapter 6 of The Art of SQL by Stephane Faroult with Peter Robson, published by O'Reilly Media Inc. in 2006.
Stephane Faroult first discovered relational databases and the SQL language in 1983. He joined Oracle France in their early days and soon developed an interest in performance and tuning topics. He founded RoughSea Ltd., a database consultancy, in 1998.
Peter Robson has worked with databases since 1977, relational databases since 1981 and Oracle since 1985. He has lectured widely in the U.K. on geological aspects of databases and has specialized in aspects of the SQL system as well as data modelling from the corporate architecture down to the departmental level. He is currently a director on the board of the U.K. Oracle Users Group.
The nine situations
Any SQL statement that we execute has to examine some amount of data before identifying a result set that must be either returned or changed. The way that we have to attack that data depends on the circumstances and conditions under which we have to fight the battle. Our attack will depend on the amount of data from which we retrieve our result set and on our forces (the filtering criteria), together with the volume of data to be retrieved.
Any large, complicated query can be divided into a succession of simpler steps, some of which can be executed in parallel, much as a complex battle is often a combination of multiple engagements between various distinct enemy units. The outcome of these different fights may be quite variable. But what matters is the final, overall result.
When we come down to the simpler steps, even when we do not reach a level of detail as small as the individual steps in the execution plan of a query, the number of possibilities is not much greater than the individual moves of pieces in a chess game. But as in a chess game, combinations can indeed be very complicated.
This chapter examines common situations encountered when accessing data in a properly normalized database. Although I refer to queries in this chapter, these example situations apply equally to updates and deletes whenever a where clause is specified, because data must be retrieved before it can be changed. When filtering data, whether it is for a simple query or to update or delete some rows, the following are the most typical situations (I call them the nine situations) that you will encounter:
- Small result set from a few tables with specific criteria applied to those tables.
- Small result set based on criteria applied to tables other than the data source tables.
- Small result set based on the intersection of several broad criteria.
- Small result set from one table, determined by broad selection criteria applied to two or more additional tables.
- Large result set.
- Result set obtained by self-joining on one table.
- Result set obtained on the basis of aggregate function(s).
- Result set obtained by simple searching or by range searching on dates.
- Result set predicated on the absence of other data.
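The remark before this list, that an update or delete with a where clause must first locate its rows just as a query does, can be sketched with a small in-memory SQLite database. The `orders` table, its columns, and the sample rows are hypothetical and are not taken from the book; the point is only that the same filtering criteria identify the same rows whether they are returned or changed.

```python
import sqlite3

# Hypothetical table and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer TEXT NOT NULL,
        status   TEXT NOT NULL
    );
    INSERT INTO orders (customer, status) VALUES
        ('acme',   'open'),
        ('acme',   'shipped'),
        ('globex', 'open'),
        ('acme',   'open');
""")

# One set of filtering criteria, shared by the query and the update.
criteria = "customer = ? AND status = ?"
params = ("acme", "open")

# The query: rows are located, then returned.
selected_ids = [row[0] for row in conn.execute(
    f"SELECT order_id FROM orders WHERE {criteria} ORDER BY order_id", params)]

# The update: the same rows are located, then changed.
cursor = conn.execute(
    f"UPDATE orders SET status = 'cancelled' WHERE {criteria}", params)
updated_count = cursor.rowcount

# Check which rows the update actually touched.
cancelled_ids = [row[0] for row in conn.execute(
    "SELECT order_id FROM orders WHERE status = 'cancelled' ORDER BY order_id")]

print(selected_ids, updated_count, cancelled_ids)
```

The rows touched by the update are exactly those the query returned, which is why the nine situations can be discussed in terms of queries alone.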
This chapter deals with each of these situations in turn and illustrates them with either simple, specific examples or with more complex real-life examples collected from different programs. Real-life examples are not always basic, textbook, one- or two-table affairs. But the overall pattern is usually fairly recognizable.
As a general rule, what we require when executing a query is the filtering out of any data that does not belong in our final result set as soon as possible; this means that we must apply the most efficient of our search criteria as soon as possible. Deciding which criterion to apply first is normally the job of the optimizer. But, as I discuss in Chapter 4, the optimizer must take into account a number of variable conditions, from the physical implementation of tables to the manner in which we have written a query. Optimizers do not always "get it right," and there are things we can do to facilitate performance in each of our nine situations.
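The general rule of applying the most efficient criterion first can be illustrated with a toy model. This is not how any real optimizer works; it is a minimal sketch that counts predicate evaluations over an in-memory list, and the data, the two criteria, and the helper `count_checks` are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Predicate = Callable[[Row], bool]


def count_checks(rows: List[Row], predicates: List[Predicate]) -> Tuple[List[Row], int]:
    """Apply predicates in the given order, stopping at the first failure.

    Returns the matching rows and the total number of predicate evaluations.
    """
    checks = 0
    matches = []
    for row in rows:
        keep = True
        for predicate in predicates:
            checks += 1
            if not predicate(row):
                keep = False
                break
        if keep:
            matches.append(row)
    return matches, checks


# Hypothetical data: 1000 rows, half of them 'FR', one in a hundred flagged vip.
rows = [
    {"id": i, "country": "FR" if i % 2 == 0 else "UK", "vip": i % 100 == 0}
    for i in range(1000)
]


def is_french(row: Row) -> bool:
    return row["country"] == "FR"   # broad criterion: keeps 500 rows


def is_vip(row: Row) -> bool:
    return bool(row["vip"])         # selective criterion: keeps 10 rows


broad_first, checks_broad_first = count_checks(rows, [is_french, is_vip])
selective_first, checks_selective_first = count_checks(rows, [is_vip, is_french])

print(len(broad_first), checks_broad_first)
print(len(selective_first), checks_selective_first)
```

Both orderings return the same rows, but in this model the selective criterion applied first discards most rows after a single check, so fewer evaluations are needed overall. A real database has further options, such as indexes, that this sketch deliberately ignores.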