NULL values are one of those things that people claim to understand, but they often write code that doesn't reflect how NULL values are actually handled. Because of this, it's all too easy to create queries that don't behave correctly, even when they might outwardly seem to be all right.
Here's an example from my own experience: Let's say we have a table named tb_user_applications, with a char column named pending that can contain a NULL value, a "P" (for "pending") or "Y" (for "yes" or "approved"). If we constructed a query that went like this:
SELECT ITEM_ID FROM
TB_USER_APPLICATIONS
WHERE PENDING IS NOT NULL
we'd end up with everything where pending was set to "P" or "Y." But if we did this:
SELECT ITEM_ID FROM
TB_USER_APPLICATIONS
WHERE PENDING<>'Y'
…then we would get everything only set to "P." The NULL values would be omitted.
This is by design because of the way DBMS regards NULL values. They are not grouped in with other distinct values; they exist in a class by themselves. NULL has to be evaluated on its own terms in all instances. This is something many people think they know, until they run across a living example of it and their code breaks.
If you are making use of NULL in a column, make sure that NULL is evaluated and selected independently of any actual values. Otherwise you'll run into problems like the one described above, where the programmer assumes that NULL values will be selected along with non-NULL values.
In some cases, it might be better to redesign the database to avoid the use of NULLs entirely (especially if you spend most of your time programming around it instead of accommodating it!). Instead of Nulls, use constraints and defaults to keep valid values in the column in question.
Serdar Yegulalp is editor of the Windows Power Users Newsletter. Check it out for the latest advice and musings on the world of Windows network administrators -- and please share your thoughts as well!