Q

The difference between NOT IN and NOT EXISTS

What's the difference between NOT IN and NOT EXISTS in SQL? Which is better to use in SQL queries for the performance aspects?

What's the difference between NOT IN and NOT EXISTS in SQL? Which is better to use in SQL queries for the performance aspects?

For the first question, let's assume we're talking about the difference in syntax between NOT IN and NOT EXISTS when both have subqueries.

Here are examples of the syntax:

select client_name
  from clients
 where client_id not in
       ( select client_id
           from clientwarnings )
select client_name
  from clients as x
 where not exists
       ( select client_id
           from clientwarnings
          where client_id = x.client_id )

In the NOT IN example, the subquery identifies a set of client_ids, against which each client_id in the clients table can be compared. In the NOT EXISTS example, the subquery is correlated and can be thought of as doing a lookup for each client_id in the clients table.

Your second question asks which is preferred for performance reasons. My answer is: it's your call. Don't trust your instinct, because it might be wrong.

My guess would be that the NOT IN query has to build a list for the subquery, which is slow, but then it can do in-memory compares, which is fast, while the NOT EXISTS query can use matching indexes, which are faster than building a memory table ... As you can see, I am easily confused.

And what about the LEFT OUTER JOIN testing for IS NULL? A perfectly good alternate solution:

select c.client_name
  from clients as c
left outer
  join clientwarnings as cw
    on c.client_id = cw.client_id
 where cw.client_id is null 

Bottom line: use EXPLAIN on your queries to look at the execution plans and optimizer costs for each query, and base your decision about performance on an understanding of what your optimizer is telling you about your data.

All things being equal—and sometimes three different syntaxes such as the above can produce the same execution plan—you might want to consider which of the above three queries is the easiest to maintain, based on how quickly a maintenance programmer can recognize what the query is supposed to be doing.

The answer is: none of them! They aren't documented! Let's add the following line of code at the beginning of each of the three queries above:

/* returns names of clients who have no warnings */

Now each of them makes more sense, right?

This was first published in September 2005

Dig deeper on Oracle and SQL

Pro+

Features

Enjoy the benefits of Pro+ membership, learn more and join.

Have a question for an expert?

Please add a title for your question

Get answers from a TechTarget expert on whatever's puzzling you.

You will be able to add details on the next page.

0 comments

Oldest 

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

-ADS BY GOOGLE

SearchDataManagement

SearchBusinessAnalytics

SearchSAP

SearchSQLServer

TheServerSide

SearchDataCenter

SearchContentManagement

SearchFinancialApplications

Close