How would you change the below query to work using outer join?
SELECT t1.itm_id FROM tctg_itm_item t1 WHERE t1.itm_company_id = 40001 AND t1.itm_catalog_id = 167 AND t1.itm_next_version_id = 999999999 AND NOT EXISTS ( SELECT 1 FROM tctg_ita_item_attributes WHERE t1.itm_id = ita_item_id AND ita_next_version_id = 999999999 and ita_catalog_id = 167 AND ita_node_id = 12967 AND ita_value_string IS NOT NULL )
I want to get rid of NOT EXISTS as it is causing performance problems.
The outer join may be no faster than your correlated NOT EXISTS subquery.
Query performance depends on many factors: which database system you're using, how well it is configured, whether the join and search columns have indexes, the number of rows in each table, and how selective the indexes are (see, for example, Not All SQL Server Indexes Are Created Equal).
Performance problems fall within the purview of the database administrator, not the query writer. Trying different but logically equivalent versions of a query should be well down the checklist for improving a poorly performing query.
That said, let's do it anyway.
Two queries, logically equivalent, which return exactly the same results, should perform the same. Right? Heh.
In theory, theory and practice are the same, but in practice, they're not.
So here's your query with the correlated NOT EXISTS subquery re-written as an outer join:
select t1.itm_id from tctg_itm_item t1 left outer join tctg_ita_item_attributes on t1.itm_id = ita_item_id and ita_next_version_id = 999999999 and ita_catalog_id = 167 and ita_node_id = 12967 and ita_value_string is not null where t1.itm_company_id = 40001 and t1.itm_catalog_id = 167 and t1.itm_next_version_id = 999999999 and ita_item_id is null
The last line, in bold, accomplishes the NOT EXISTS effect.
Just for fun, try the following query, which will be easier to customize for different search criteria.
select t1.itm_id from tctg_itm_item t1 left outer join tctg_ita_item_attributes on t1.itm_id = ita_item_id and t1.itm_next_version_id = ita_next_version_id and t1.itm_catalog_id = ita_catalog_id and ita_node_id = 12967 and ita_value_string is not null where t1.itm_company_id = 40001 and t1.itm_catalog_id = 167 and t1.itm_next_version_id = 999999999 and ita_item_id is null
Note the subtle difference in the join conditions, which might (or not, eh) improve performance. Good luck!
This was first published in September 2004