Q

SQL statement gets very long very fast

I have the following SQL statement. I now need to substitute pkColumn in that statement with a composite primary key of, let's say, two integer columns. The problem, besides performance issues, is that the statement gets very long very fast and statements are limited. Is there a better solution that does not make use of a temporary table?

I have the following SQL statement:

SELECT * FROM myTable 
WHERE pkColumn in 
(pkValue1, pkValue2, pkValue3, pkValue4, pkValue5, ...)

I now need to substitute pkColumn in that statement with a composite primary key of, let's say, two integer columns.

SELECT * FROM myTable 
WHERE (pkColumn1 = pk1Value1 AND pkColumn2 = pk2Value1) 
OR (pkColumn1 = pk1Value2 AND pkColumn2 = pk2Value2) 
OR ...

The problem, besides performance issues, is that the statement gets very long very fast and statements are limited.

Is there a better solution that does not make use of a temporary table?

This is another of those sneaky questions which seem to be designed to test my knowledge of performance. Many years ago, I realized that performance is a DBA (database administrator) concern, because there are so many server parameters that can be tweaked and tuned to improve it. And since I'm not a DBA, this implied that I needn't be too concerned with performance, beyond the minimum of "good SQL hygiene" (indexing your join and search columns, writing sargable predicates, and so on).

Nevertheless, your question is a good one, and perhaps my very limited DBA knowledge will help you set your direction.

If you can pass a set of criteria into the optimizer using "hardcoded" values in a query, then that query is going to perform as fast as possible. My reason for believing this to be true is that once the WHERE clause has been parsed, all those pkColumnN=pk1ValueN expressions, together with their ANDs and ORs, will be compiled into memory. Not being a DBA, I can only imagine this process, which probably involves some sort of slick bytecode stack somewhere that is used to evaluate each row. But the key point here is that it will be in memory, which is the most efficient data store of all.

You are right that there will be a limit to the size of the SQL statement that you can construct this way, but this limit will be reasonably large, depending on which database system you're using. In MySQL, for example, you can change this limit via the MAX_ALLOWED_PACKET variable, but its initial default value is 1 megabyte, which is a pretty large SQL statement.

The only other alternative to large cumbersome SQL statements (besides a temp table) would be to see if you can simplify the SQL statement. For example, maybe you could rewrite your query like this:

SELECT * FROM myTable 
WHERE (pkColumn1, pkColumn2) 
 IN (
      (pk1Value1, pk2Value1) 
    , (pk1Value2, pk2Value2) 
    , ...
    )

Same results, but the query string is shorter.

This was first published in April 2006

Dig deeper on Oracle and SQL

Pro+

Features

Enjoy the benefits of Pro+ membership, learn more and join.

Have a question for an expert?

Please add a title for your question

Get answers from a TechTarget expert on whatever's puzzling you.

You will be able to add details on the next page.

0 comments

Oldest 

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

SearchDataManagement

SearchBusinessAnalytics

SearchSAP

SearchSQLServer

TheServerSide

SearchDataCenter

SearchContentManagement

SearchFinancialApplications

Close