Indexing NULL table column values for fast SQL performance

0
 

Indexing NULL table column values for fast SQL performance


Note: Starting in Oracle 11g, there is new "create index" syntax that allows NULL values to be included in the index:

create index
    emp_ename_idx
on
   emp 
   (ename asc, 1)
;

Here, the "1" tells Oracle that to index on NULL values within the tables.


One problem with pre 11g databases (see above) is having the optional ability to index on a NULL column.  By default, relational databases ignore NULL values (because the relational model says that NULL means "not present").  Hence, Oracle indexes will not include NULL values. 

For example, this index definition would not index on "open positions", new employee positions that are stored with a NULL employee name:

create index
    emp_ename_idx
on
   emp 
   (ename)
;

Whenever a SQL query asks for the open position employee slots "where ename is NULL", there will be no index entries for NULLS in emp_name_idx and Oracle would perform an unnecessary large-table full-table scan.

Execution Plan
---------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=6) 
1 0 TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=1 Bytes=6)

To get around the optimization of SQL queries that choose NULL column values, we can create a function-based index using the null value built-in SQL function to index only on the NULL columns. 

Note that the "null value" (NVL) function replaces NULL values with the character string "null', a real value that can participate in an index:

 
-- create an FBI on ename column with NULL values
create index
    emp_null_ename_idx
on
   emp 
   (nvl(ename,'null'));
 
analyze index emp_null_ename_idx compute statistics;
 
You can also do this techniques with NULL numeric values.  This syntax replaces NULL values with a zero:
 
-- create an FBI on emp_nbr column with NULL values
create index
    emp_null_emp_nbr_idx
on
   emp 
   (nvl(ename,o));
 
analyze index emp_null_ename_idx compute statistics;

Now we can use the index and greatly improve the speed of any queries that require access to the NULL columns.  Note that we must make one of two changes:

    1- Add a hint to force the index

   2 - Change the WHERE predicate to match the function

Here is an example of using an index on NULL column values:

-- insert a NULL row
insert into emp (empno) values (999);

set autotrace traceonly explain;

-- test the index access (change predicate to use FBI)
select /*+ index(emp_null_ename_idx) */
   ename
from
   emp e
where
   nvl(ename,'null') = 'null'
;