OCA Oracle Database 11g: SQL Fundamentals I Exam Guide
Chapter 8: Using Subqueries to Solve Problems
OCA Oracle Database 11g: SQL Fundamentals I Exam Guide, Chapter 8
Table of contents:
Using subqueries in SQL
There are many situations where you will need the result of one query as the input for another.
Use of a Subquery Result Set for Comparison Purposes
Which employees have a salary that is less than the average salary? This could be answered by two statements, or by a single statement with a subquery. The following example uses two statements:
select avg(salary) from employees; select last_name from employees where salary < result_of_previous_query ;
Alternatively, this example uses one statement with a subquery:
select last_name from employees where salary < (select avg(salary)from employees);In this example, the subquery is used to substitute a value into the WHERE clause of the parent query: it is returning a single value, used for comparison with the rows retrieved by the parent query.
The subquery could return a set of rows. For example, you could use the following to find all departments that do actually have one or more employees assigned to them:
select department_name from departments where department_id in (select distinct(department_id) from employees);
In the preceding example, the subquery is used as an alternative to a join. The same result could have been achieved with the following:
select department_name from departments inner join employees on employees.department_id = departments.department_id group by department_name;
If the subquery is going to return more than one row, then the comparison operator must be able to accept multiple values. These operators are IN, NOT IN, ANY, and ALL. If the comparison operator is EQUAL, GREATER THAN, or LESS THAN (which each can only accept one value), the parent query will fail.
An extension of the use of subqueries as an alternative to a join is to enable the star transformation often needed in data warehouse applications. Consider a large table recording sales. Each sale is marked as being of a particular product to a particular buyer through a particular channel. These attributes are identified by codes, used as foreign keys to dimension tables with rows that describe each product, buyer, and channel. To identify all sales of books to buyers in Germany through Internet orders, one could run a query like this:
select … from sales s, products p, buyers b, channels c where s.prod_code=p.prod_code and s.buy_code=b.buy_code and s.chan_code=c.chan_code and p.product='Books' and b.country='Germany' and c.channel='Internet';
This query uses the WHERE clause to join the tables and then to filter the results. The following is an alternative query that will yield the same result:
select … from sales where prod_code in (select prod_code from products where product='Books') and buy_code in (select buy_code from buyers where country='Germany') and chan_code in (select chan_code from channels where channel='Internet);
The rewrite of the first statement to the second is the star transformation. Apart from being an inherently more elegant structure (most SQL developers with any sense of aesthetics will agree with that), there are technical reasons why the database may be able to execute it more efficiently than the original query. Also, star queries are easier to maintain; it is very simple to add more dimensions to the query or to replace the single literals ('Books,' 'Germany,' and 'Internet') with lists of values.
Generate a Table from Which to SELECT
Subqueries can also be used in the FROM clause, where they are sometimes referred to as inline views. Consider another problem based on the HR schema: employees are assigned to a department, and departments have a location. Each location is in a country. How can you find the average salary of staff in a country, even though they work for different departments? Like this:
select avg(salary),country_id from (select salary,department_id,location_id,country_id from employees natural join departments natural join locations) group by country_id;
The subquery constructs a table with every employee's salary and the country in which his department is based. The parent query then addresses this table, averaging the SALARY and grouping by COUNTRY_ID.
Generate Values for Projection
The third place a subquery can go is in the SELECT list of a query. How can you identify the highest salary and the highest commission rate and thus what the maximum commission paid would be if the highest salaried employee also had the highest commission rate? Like this, with two subqueries:
select (select max(salary) from employees) * (select max(commission_pct) from employees) / 100 from dual;
In this usage, the SELECT list used to project columns is being populated with the results of the subqueries. A subquery used in this manner must be scalar, or the parent query will fail with an error.
Generate Rows to be Passed to a DML Statement
DML statements are covered in detail in Chapter 10. For now, consider these examples:
insert into sales_hist select * from sales where date > sysdate-1; update employees set salary = (select avg(salary) from employees); delete from departments where department_id not in (select department_id from employees);
The first example uses a subquery to identify a set of rows in one table that will be inserted into another. The second example uses a subquery to calculate the average salary of all employees and passes this value (a scalar quantity) to an update statement. The third example uses a subquery to retrieve all DEPARTMENT_IDs that are in use and passes the list to a DELETE command, which will remove all departments that are not in use.
Note that it is not legal to use a subquery in the VALUES clause of an insert statement; this is fine:
insert into dates select sysdate from dual; But this is not: insert into dates (date_col) values (select sysdate fom dual);EXERCISE 8-2
More Complex Subqueries
In this exercise, you will write some more complicated subqueries. Use either SQL*Plus or SQL Developer. All the queries should be run when connected to the HR schema.
- Log on to your database as user HR.
- Write a query that will identify all employees who work in departments located in the United Kingdom. This will require three levels of nested subqueries:
select last_name from employees where department_id in (select department_id from departments where location_id in (select location_id from locations where country_id = (select country_id from countries where country_name='United Kingdom') ) );
- Check that the result from step 2 is correct by running the subqueries independently.
First, find the COUNTRY_ID for the United Kingdom:
select country_id from countries where country_name='United Kingdom';
The result will be UK. Then find the corresponding locations:
select location_id from locations where country_id = 'UK';
The LOCATION_IDs returned will be 2400, 2500, and 2600. Then find the DEPARTMENT_IDs of department in these locations:
select department_id from departments where location_id in (2400,2500,2600);The result will be two departments, 40 and 80. Finally, find the relevant employees:
select last_name from employees where department_id in (40,80);
- Write a query to identify all the employees who earn more than the average and who work in any of the IT departments. This will require two subqueries, not nested:
select last_name from employees where department_id in (select department_id from departments where department_name like 'IT%') and salary > (select avg(salary) from employees);
Download the chapter "Using Subqueries to Solve Problems", which includes practice exercises and a self-test, in PDF form.
Continue to the next section: List the types of SQL subqueries
This was first published in October 2008