Ask the Expert

The difference between NOT IN and NOT EXISTS

What's the difference between NOT IN and NOT EXISTS in SQL? Which is better to use in SQL queries for the performance aspects?

    Requires Free Membership to View

For the first question, let's assume we're talking about the difference in syntax between NOT IN and NOT EXISTS when both have subqueries.

Here are examples of the syntax:

select client_name
  from clients
 where client_id not in
       ( select client_id
           from clientwarnings )
select client_name
  from clients as x
 where not exists
       ( select client_id
           from clientwarnings
          where client_id = x.client_id )

In the NOT IN example, the subquery identifies a set of client_ids, against which each client_id in the clients table can be compared. In the NOT EXISTS example, the subquery is correlated and can be thought of as doing a lookup for each client_id in the clients table.

Your second question asks which is preferred for performance reasons. My answer is: it's your call. Don't trust your instinct, because it might be wrong.

My guess would be that the NOT IN query has to build a list for the subquery, which is slow, but then it can do in-memory compares, which is fast, while the NOT EXISTS query can use matching indexes, which are faster than building a memory table ... As you can see, I am easily confused.

And what about the LEFT OUTER JOIN testing for IS NULL? A perfectly good alternate solution:

select c.client_name
  from clients as c
left outer
  join clientwarnings as cw
    on c.client_id = cw.client_id
 where cw.client_id is null    

Bottom line: use EXPLAIN on your queries to look at the execution plans and optimizer costs for each query, and base your decision about performance on an understanding of what your optimizer is telling you about your data.

All things being equal—and sometimes three different syntaxes such as the above can produce the same execution plan—you might want to consider which of the above three queries is the easiest to maintain, based on how quickly a maintenance programmer can recognize what the query is supposed to be doing.

The answer is: none of them! They aren't documented! Let's add the following line of code at the beginning of each of the three queries above:

/* returns names of clients who have no warnings */

Now each of them makes more sense, right?

This was first published in September 2005

There are Comments. Add yours.

 
TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to: