Why "select star" is bad
I want to display all 27 columns in my table. I used SELECT * ... Is this right? Or should I use all column names instead? Is there any performance issue, like one way is faster? Is there any disadvantage if I use SELECT * ...?
There are many reasons why "select star" is bad. Foremost is the harm this can do to performance.
Imagine using SELECT * to retrieve rows from a table which includes a TEXT or BLOB column. A TEXT or BLOB column is large, and is used to store data like the contents of a chapter in a book, or an image.
If you do not intend to display the TEXT or BLOB column, for example if you wanted a list of the last 25 entries this week, then it is immediately obvious that SELECT * is bad when you display the query results on the screen or printer. You don't see a list, you see many kilobytes of TEXT or BLOB data. This prompts some people to post questions like "How do I write SELECT * except for column x?" on database forums. Answer: you can't; you have to list the columns you want.
Programming languages which can format query results, however, allow the programmer to execute SELECT * queries, but then decide which columns to display. If the TEXT or BLOB column is not displayed, everything still looks okay. But bandwidth is wasted, several kilobytes per TEXT or BLOB column, times N rows returned, on every execution. In addition to wasted bandwidth, performance also suffers, especially if a sort is required, such as when DISTINCT or ORDER BY is specified.
The sad fact is that this bad situation is not immediately apparent. In a testing environment, the programmer may have, ahem, other priorities. During volume testing, the poor performance is noticed, but now the programmer has to perform emergency debugging on his own code, to rediscover which columns are actually needed by each module. And if your organization does not do volume testing, you may not realize that SELECT * is bad for performance until the program goes live.
That said, if you do actually need all columns, then SELECT * is fine, provided that you never change the table. Or the query.
For instance, if you needed to change the query, say to join to another table to get a count of related rows, you cannot say GROUP BY * so you have to list the individual columns anyway, even if you need all of them.
Skilled developers will always explicitly list the minimum set of columns needed. The SELECT list thus describes the data used by the code module, and so is self-documenting. Bandwidth is minimized and performance is optimized, especially considering the increased possibility that the query might be satisfied by a covering index.
See also Why Select * is bad for a couple of additional reasons from an ASP perspective.
For More Information
- Dozens more answers to tough SQL questions from Rudy Limeback.
- The Best SQL Web Links: tips, tutorials, scripts, and more.
- Have an SQL tip to offer your fellow DBAs and developers? The best tips submitted will receive a cool prize. Submit your tip today!
- Ask your technical SQL questions -- or help out your peers by answering them -- in our live discussion forums.
- Ask the Experts yourself: Our SQL, database design, SQL Server, DB2, object-oriented and data warehousing gurus are waiting to answer your toughest questions.