|
There are many reasons why "select star" is bad.
Foremost is the harm this can do to performance.
Imagine using SELECT * to retrieve rows from a table which includes
a TEXT or BLOB column. A TEXT or BLOB column is large, and is used to store
data like the contents of a chapter in a book, or an image.
If you do not intend to display the TEXT or BLOB column,
for example if you wanted a list of the last 25 entries this week,
then it is immediately obvious
that SELECT * is bad when you display the query
results on the screen or printer. You don't see a list, you
see many kilobytes of TEXT or BLOB data. This prompts some people to post
questions like "How do I write SELECT * except for column x?" on
database forums. Answer: you can't; you have to list the columns you want.
Programming languages which can format
query results, however, allow the programmer to execute SELECT * queries,
but then decide which columns to display.
If the TEXT or BLOB column is not displayed, everything still looks okay.
But bandwidth is wasted, several kilobytes per
TEXT or BLOB column, times N rows returned, on every execution.
In addition to wasted bandwidth, performance also suffers, especially
if a sort is required, such as when DISTINCT or ORDER BY is specified.
The sad fact is that this bad situation is not immediately apparent.
In a testing environment, the programmer may have, ahem, other priorities.
During volume testing, the poor performance is noticed, but now the
programmer has to perform emergency debugging on his own code,
to rediscover which columns are actually needed by each module. And if your
organization does not do volume testing, you may not realize that
SELECT * is bad for performance until the program goes live.
That said, if you do actually need all columns, then SELECT * is fine,
provided that you never change the table. Or the query.
For instance, if you needed to change the query,
say to join to another table to get a count of related rows,
you cannot say GROUP BY * so you have to list the individual
columns anyway, even if you need all of them.
Skilled developers will always explicitly list the minimum
set of columns needed. The SELECT list thus describes the data
used by the code module, and so is self-documenting.
Bandwidth is minimized and performance is optimized, especially
considering the increased possibility that the query might be satisfied by
a covering index.
See also
Why Select * is bad for a couple of additional reasons
from an ASP perspective.
For More Information
|