Ask the Expert

OUTER JOIN instead of a NOT EXISTS subquery

The following statement works for Microsoft SQL Server but not in MySQL. Could you translate it into a join select?

SELECT M.ModulId, M.Modul FROM MODULE M
WHERE NOT EXISTS
( SELECT PROJEKTMODUL.IdModul 
  FROM PROJEKTMODUL 
  WHERE M.ModulId = PROJEKTMODUL.IdModul 
  AND PROJEKTMODUL.IdProjekt = 23 )

    Requires Free Membership to View

Before we rewrite this query, let's analyze what it's doing. This query uses a correlated subquery. For every row in the outer query, the subquery attempts to find all related rows in another table, with a specific condition. If at least one such a related row exists, then the EXISTS condition is true, the NOT EXISTS is false, and therefore the row in the outer query is not selected.

The same result can be achieved with an OUTER JOIN that tests for unmatched rows:

select distinct
       M.ModulId
     , M.Modul 
  from MODULE M
left outer
  join PROJEKTMODUL
    on M.ModulId = PROJEKTMODUL.IdModul
   and PROJEKTMODUL.IdProjekt = 23
 where PROJEKTMODUL.IdProjekt is null

Many people have difficulty with this particular construction the first time they see it, because it sure looks like it's joining to IdProjekt 23, but then it's testing to see if IdProjekt is NULL! What's up with that?

Remember that in a LEFT OUTER join, every row in the left table is represented in the result set, whether or not there are any matching rows from the right table. If for a particular row of the left table there is no matching row from the right table, then the columns in that row which would have come from the right table will be NULL.

Note that the condition to test for IdProjekt 23 must be written in the ON clause of the JOIN. Those are the only rows of the right table we're interested in, and we need to know when they don't exist.

The test for NULL could actually be made on any column of the right table, but when the ON conditions include specific criteria, I like to use one of those columns, perhaps just for the double-take effect. Many SQL writers will instead always code the NULL test on the primary key of the related table.

Finally, since there may be more than one row of the right table for each row of the left, we need DISTINCT so that each left row is represented in the result set only once.

For More Information


This was first published in February 2003

There are Comments. Add yours.

 
TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to: