Do nulls and indexes work together?

Tom Kyte answers the frequently asked question, "Do nulls and indexes work together?"

This excerpt is from Tom Kyte's new book Expert Oracle Database Architecture: 9i and 10g Programming Techniques and Solutions, published by Apress in September 2005. Tom Kyte is Vice President of the Core Technologies Group at Oracle and has been using Oracle since 1988. For more frequently asked questions and myths about Oracle indexes, click here.

Do nulls and indexes work together?

B*Tree indexes, except in the special case of cluster B*Tree indexes, do not store completely null entries, but bitmap and cluster indexes do. This side effect can be a point of confusion, but it can actually be used to your advantage when you understand what not storing entirely null keys implies.

To see the effect of the fact that null values are not stored, consider this example:

ops$tkyte@ORA10GR1> create table t ( x int, y int );
Table created.

ops$tkyte@ORA10GR1> create unique index t_idx on t(x,y);
Index created.

ops$tkyte@ORA10GR1> insert into t values ( 1, 1 );
1 row created.

ops$tkyte@ORA10GR1> insert into t values ( 1, NULL );
1 row created.

ops$tkyte@ORA10GR1> insert into t values ( NULL, 1 );
1 row created.

ops$tkyte@ORA10GR1> insert into t values ( NULL, NULL );
1 row created.

ops$tkyte@ORA10GR1> analyze index t_idx validate structure;
Index analyzed.

ops$tkyte@ORA10GR1> select name, lf_rows from index_stats;

NAME                             LF_ROWS
------------------------------ ----------
T_IDX                                   3

The table has four rows, whereas the index only has three. The first three rows, where at least one of the index key elements was not null, are in the index. The last row with (NULL, NULL) is not in the index. One of the areas of confusion is when the index is a unique index, as just shown. Consider the effect of the following three INSERT statements:

ops$tkyte@ORA10GR1> insert into t values ( NULL, NULL );
1 row created.

ops$tkyte@ORA10GR1> insert into t values ( NULL, 1 );
insert into t values ( NULL, 1 )
*
ERROR at line 1:
ORA-00001: unique constraint (OPS$TKYTE.T_IDX) violated

ops$tkyte@ORA10GR1> insert into t values ( 1, NULL );
insert into t values ( 1, NULL )
*
ERROR at line 1:
ORA-00001: unique constraint (OPS$TKYTE.T_IDX) violated

The new (NULL, NULL) row is not considered to be the same as the old row with (NULL, NULL):

ops$tkyte@ORA10GR1> select x, y, count(*)
   2 from t
   3 group by x,y
   4 having count(*) > 1;
         X          Y        COUNT(*)
---------- ----------  --------------
                                    2

This seems impossible; our unique key isn't unique if we consider all null entries. The fact is that, in Oracle, (NULL, NULL) is not the same as (NULL, NULL) when considering uniqueness -- the SQL standard mandates this. (NULL,NULL) and (NULL,NULL) are considered the same with regard to aggregation, however. The two are unique for comparisons but are the same as far as the GROUP BY clause is concerned. That is something to consider: each unique constraint should have at least one NOT NULL column to be truly unique.

The question that comes up with regard to indexes and null values is, "Why isn't my query using the index?" The query in question is something like the following:

select * from T where x is null;

This query cannot use the index we just created -- the row (NULL, NULL) simply is not in the index, hence the use of the index would in fact return the wrong answer. Only if at least one of the columns is defined as NOT NULL can the query use an index. For example, the following shows Oracle will use an index for an X IS NULL predicate if there is an index with X on the leading edge and at least one other column in the index is NOT NULL:

ops$tkyte@ORA10GR1> create table t ( x int, y int NOT NULL );
Table created.

ops$tkyte@ORA10GR1> create unique index t_idx on t(x,y);
Index created.

ops$tkyte@ORA10GR1> insert into t values ( 1, 1 );
1 row created.

ops$tkyte@ORA10GR1> insert into t values ( NULL, 1 );
1 row created.

ops$tkyte@ORA10GR1> begin
   2 dbms_stats.gather_table_stats(user,'T');
   3 end;
   4 /
PL/SQL procedure successfully completed.

When we go to query that table this time, we'll discover this:

ops$tkyte@ORA10GR1> set autotrace on
ops$tkyte@ORA10GR1> select * from t where x is null;

         X          Y
---------- ----------
                    1

Execution Plan
----------------------------------------------------------
      0        SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=1 Bytes=5)
      1    0      INDEX (RANGE SCAN) OF 'T_IDX' (INDEX (UNIQUE)) (Cost=1 Card=1 Bytes=5)

Previously, I said that you can use to your advantage the fact that totally null entries are not stored in a B*Tree index -- here is how. Say you have a table with a column that takes exactly two values. The values are very skewed; say, 90% or more of the rows take on one value and 10% or less take on the other value. You can index this column efficiently to gain quick access to the minority rows. This comes in handy when you would like to use an index to get to the minority rows, but you want to full scan to get to the majority rows, and you want to conserve space. The solution is to use a null for majority rows and whatever value you want for minority rows or, as demonstrated earlier, use a function-based index to index only the non-null return values from a function.

Now that you know how a B*Tree will treat null values, you can use that to your advantage and take precautions with unique constraints on sets of columns that all allow nulls (be prepared to have more than one row that is all null as a possibility in this case).

To see other frequently asked questions and myths about indexes, click here.


This was first published in October 2005
This Content Component encountered an error

Pro+

Features

Enjoy the benefits of Pro+ membership, learn more and join.

0 comments

Oldest 

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

-ADS BY GOOGLE

SearchDataManagement

SearchBusinessAnalytics

SearchSAP

SearchSQLServer

TheServerSide

SearchDataCenter

SearchContentManagement

SearchFinancialApplications

Close