OUTER JOIN instead of a NOT EXISTS subquery
The following statement works for Microsoft SQL Server but not in MySQL. Could you translate it into a join select?
SELECT M.ModulId, M.Modul FROM MODULE M WHERE NOT EXISTS ( SELECT PROJEKTMODUL.IdModul FROM PROJEKTMODUL WHERE M.ModulId = PROJEKTMODUL.IdModul AND PROJEKTMODUL.IdProjekt = 23 )
Before we rewrite this query, let's analyze what it's doing. This query uses a correlated subquery. For every row in the outer query, the subquery attempts to find all related rows in another table, with a specific condition. If at least one such a related row exists, then the EXISTS condition is true, the NOT EXISTS is false, and therefore the row in the outer query is not selected.
The same result can be achieved with an OUTER JOIN that tests for unmatched rows:
select distinct M.ModulId , M.Modul from MODULE M left outer join PROJEKTMODUL on M.ModulId = PROJEKTMODUL.IdModul and PROJEKTMODUL.IdProjekt = 23 where PROJEKTMODUL.IdProjekt is null
Many people have difficulty with this particular construction the first time they see it, because it sure looks like it's joining to IdProjekt 23, but then it's testing to see if IdProjekt is NULL! What's up with that?
Remember that in a LEFT OUTER join, every row in the left table is represented in the result set, whether or not there are any matching rows from the right table. If for a particular row of the left table there is no matching row from the right table, then the columns in that row which would have come from the right table will be NULL.
Note that the condition to test for IdProjekt 23 must be written in the ON clause of the JOIN. Those are the only rows of the right table we're interested in, and we need to know when they don't exist.
The test for NULL could actually be made on any column of the right table, but when the ON conditions include specific criteria, I like to use one of those columns, perhaps just for the double-take effect. Many SQL writers will instead always code the NULL test on the primary key of the related table.
Finally, since there may be more than one row of the right table for each row of the left, we need DISTINCT so that each left row is represented in the result set only once.
For More Information
- Dozens more answers to tough SQL questions from Rudy Limeback.
- The Best SQL Web Links: tips, tutorials, scripts, and more.
- Have an SQL tip to offer your fellow DBAs and developers? The best tips submitted will receive a cool prize. Submit your tip today!
- Ask your technical SQL questions -- or help out your peers by answering them -- in our live discussion forums.
- Ask the Experts yourself: Our SQL, database design, Oracle, SQL Server, DB2, metadata, object-oriented and data warehousing gurus are waiting to answer your toughest questions.