Ask the Expert

SQL statement gets very long very fast

I have the following SQL statement:

SELECT * FROM myTable 
WHERE pkColumn in 
(pkValue1, pkValue2, pkValue3, pkValue4, pkValue5, ...)

I now need to substitute pkColumn in that statement with a composite primary key of, let's say, two integer columns.

SELECT * FROM myTable 
WHERE (pkColumn1 = pk1Value1 AND pkColumn2 = pk2Value1) 
OR (pkColumn1 = pk1Value2 AND pkColumn2 = pk2Value2) 
OR ...

The problem, besides performance issues, is that the statement gets very long very fast and statements are limited.

Is there a better solution that does not make use of a temporary table?

    Requires Free Membership to View

This is another of those sneaky questions which seem to be designed to test my knowledge of performance. Many years ago, I realized that performance is a DBA (database administrator) concern, because there are so many server parameters that can be tweaked and tuned to improve it. And since I'm not a DBA, this implied that I needn't be too concerned with performance, beyond the minimum of "good SQL hygiene" (indexing your join and search columns, writing sargable predicates, and so on).

Nevertheless, your question is a good one, and perhaps my very limited DBA knowledge will help you set your direction.

If you can pass a set of criteria into the optimizer using "hardcoded" values in a query, then that query is going to perform as fast as possible. My reason for believing this to be true is that once the WHERE clause has been parsed, all those pkColumnN=pk1ValueN expressions, together with their ANDs and ORs, will be compiled into memory. Not being a DBA, I can only imagine this process, which probably involves some sort of slick bytecode stack somewhere that is used to evaluate each row. But the key point here is that it will be in memory, which is the most efficient data store of all.

You are right that there will be a limit to the size of the SQL statement that you can construct this way, but this limit will be reasonably large, depending on which database system you're using. In MySQL, for example, you can change this limit via the MAX_ALLOWED_PACKET variable, but its initial default value is 1 megabyte, which is a pretty large SQL statement.

The only other alternative to large cumbersome SQL statements (besides a temp table) would be to see if you can simplify the SQL statement. For example, maybe you could rewrite your query like this:

SELECT * FROM myTable 
WHERE (pkColumn1, pkColumn2) 
 IN (
      (pk1Value1, pk2Value1) 
    , (pk1Value2, pk2Value2) 
    , ...

Same results, but the query string is shorter.

This was first published in April 2006

There are Comments. Add yours.

TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to: