How to Use a Function-Based Index on a Column That Contains Nulls in Oracle 10+

How to use a function-based index on a column that contains NULLs in Oracle 10+?

The index can be used, though the optimiser may have chosen not to use it for your particular example:

SQL> create table my_objects
2 as select object_id, object_name
3 from all_objects;

Table created.

SQL> select count(*) from my_objects;
2 /

COUNT(*)
----------
83783

SQL> alter table my_objects modify object_name null;

Table altered.

SQL> update my_objects
2 set object_name=null
3 where object_name like 'T%';

1305 rows updated.

SQL> create index my_objects_name on my_objects (lower(object_name));

Index created.

SQL> set autotrace traceonly

SQL> select * from my_objects
2 where lower(object_name) like 'emp%';

29 rows selected.

Execution Plan
----------------------------------------------------------

------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17 | 510 | 355 (1)|
| 1 | TABLE ACCESS BY INDEX ROWID| MY_OBJECTS | 17 | 510 | 355 (1)|
|* 2 | INDEX RANGE SCAN | MY_OBJECTS_NAME | 671 | | 6 (0)|
------------------------------------------------------------------------------------

The documentation you read was presumably pointing out that, just like any other index, all-null keys are not stored in the index.

Oracle function based indexes

Assuming they are setup correctly, Yes.

OVERVIEW:
Oracle creates a hidden column on the table with the function based index which is updated each time the columns on which the function based index are changed; assuming the values are deterministic (See oracle link for more info). You can see these hidden columns by querying the system object all_tab_cols your table on which the function based index was created. They typically contain a $ and look something like SYS_NC00005$.

They are often used when you want to handle case sensitivity or handle repeated functions calls on which the overhead of executing the function call is with such frequency that the value up front time of computing the value once and retaining it results in a performance gain.

Additional reading:

  • Ask Tom
  • Stack Overflow
  • ORACLE

Utilize a function-based index to precompute columns, while also selecting non-indexed columns

I think I can do it by simply:

  1. Selecting only the indexed columns.
  2. Joining to the non-indexed columns.


select
a.id,
a.startpoint_x,
a.startpoint_y,
b.details1,
b.details2,
b.details3
from
(
select
id,
startpoint_x(shape) as startpoint_x,
startpoint_y(shape) as startpoint_y
from
lines
where --https://stackoverflow.com/a/59581129/5576771
id is not null
and startpoint_x(shape) is not null
and startpoint_y(shape) is not null
) a
left join
lines b
on a.id = b.id

------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 219 | 10 (0)| 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 3 | 219 | 10 (0)| 00:00:01 |
|* 2 | INDEX FAST FULL SCAN | LINES_FBI_IDX | 3 | 117 | 4 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID BATCHED| LINES | 1 | 34 | 2 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | LINES_FBI_IDX | 1 | | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT --------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("INFRASTR"."STARTPOINT_X"("SHAPE") IS NOT NULL AND
"INFRASTR"."STARTPOINT_Y"("SHAPE") IS NOT NULL AND "ID" IS NOT NULL)
4 - access("ID"="B"."ID"(+))

Note
-----
- dynamic statistics used: dynamic sampling (level=2)

I had incorrectly assumed that it would be more complicated than that. I thought I might need to force Oracle to invoke the FBI when selecting the indexed columns, while also doing a table scan to get the unindexed columns. I wondered if some sort of materialized hint would be needed. But it looks like the join did what I want without needing a hint.


In general, the benefit of this overall non-spatial FBI solution is:

It works around a limitation with function-based spatial indexes:

From @DavidLapp of Oracle

A spatial index is invoked only by the WHERE clause, not the SELECT list.

Utilize function-based spatial index in SELECT list

Whereas my non-spatial FBI doesn’t have that problem. It can be invoked by the SELECT list, as long as there are IS NOT NULLs in the where clause.

Next step:

In theory, maybe the resulting numeric XY columns could be
"spatialized" after-the-fact:

A) Convert the XY to SDO_GEOMETRY on-the-fly / after-the-fact in a
query.

or B) Use GIS software to display the XYs as points in the map — using
a dynamic "layer" in the application. For example, use ArcGIS Pro's
"XY Event Layer" functionality (which is surprisingly fast:
https://gis.stackexchange.com/questions/380176/are-xy-event-layers-optimized-indexed-on-the-fly).

Source: “Regarding the idea of indexing/returning a non-spatial datatype…”

Utilize a function-based index while selecting additional columns

The solution is to add the additional column (OBJECTID) to the index — as a composite index:

create index atn_endpoint_list_idx on
my_owner.active_transportation(my_owner.endpoint_list(shape),OBJECTID);
-- br>

Now, the FBI is being used:

----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 727 | 299K| 53 (10)| 00:00:01 |
|* 1 | INDEX FAST FULL SCAN| ATN_ENDPOINT_LIST_IDX | 727 | 299K| 53 (10)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("my_owner"."ENDPOINT_LIST"("SHAPE") IS NOT NULL)

How to index a date column with null values?

"Our table has 300,000 records....
280,000 records have a null
delivered_at date. "

In other words almost the entire table satisfies a query which searches on where DELIVERED_AT is null. An index is completely inappropriate for that search. A full table scan is much the best approach.

If you have an Enterprise Edition license and you have the CPUs to spare, using a parallel query would reduce the elapsed time.

Oracle Index - For null values

In Oracle, b-tree index does not contain info about null keys. It means:

create table X (i integer, j integer);

create index X_j on X(j);

insert into X values (1, 1); -- record would be indexed
insert into X values (2, 2); -- record would be indexed
insert into X values (3, null); -- record would not be indexed

select rowid from X; -- index cannot be used
select rowid from X where j = 1; -- index can be used
select rowid from X where j is not null; -- index can be used
select rowid from X where j is null; -- index cannot be used

If indexing several columns, index would contain record if at least one column has not null value.

If index if function-based, same rule exists about expression result.

Bitmap indexes contains null keys. There is no any difference for null and not null values for them.

It does not mean you should use bitmap index if you wish to catch null values. Bitmap indexes requires exclusive access while updating, so they cannot be used into OLTP environment. If you wish to catch null values by index, use function-based index for converting null values into some not-nulls, say

create index employee_firstname_fbi on employees('x' || first_name);
select * from employees where 'x' || first_name = 'x';

Function Based Index not improving query performance

I would suggest to review your data model, the regex is really ugly. Store relevant information directly in column instead of somewhere hidden in a ticket string.

Anyway, I would propose to create a virtual column instead of view. Then you can create an index on this virtual column and it should also be used. Would be similar to this:

ALTER TABLE TTT_IMP ADD (ROOT VARCHAR2(20) GENERATED ALWAYS AS (
CAST(
CASE
WHEN regexp_like(ticker, '\s.*\s')
THEN SUBSTR(ticker, 1, instr(ticker, ' ')-1)
WHEN regexp_like(ticker, '\s')
THEN
CASE
WHEN regexp_like(SUBSTR(ticker, 1, instr(ticker, ' ')-1), '(P|C)$')
AND LENGTH(SUBSTR(ticker, 1, instr(ticker, ' ')-1)) >= 4
THEN SUBSTR(SUBSTR(ticker, 1, instr(ticker, ' ')-1), 1, LENGTH(SUBSTR(ticker, 1
, instr(ticker, ' ')-1))-3)
WHEN regexp_like(SUBSTR(ticker, 1, instr(ticker, ' ')-1), '\w\d\d\w\d$')
THEN SUBSTR(SUBSTR(ticker, 1, instr(ticker, ' ')-1), 1, LENGTH(SUBSTR(ticker, 1
, instr(ticker, ' ')-1))-5)
WHEN regexp_like(SUBSTR(ticker, 1, instr(ticker, ' ')), '\w\d\w\d$')
THEN SUBSTR(SUBSTR(ticker, 1, instr(ticker, ' ')-1), 1, LENGTH(SUBSTR(ticker, 1
, instr(ticker, ' ')-1))-4)
ELSE SUBSTR(ticker, 1, instr(ticker, ' ')-1)
END
WHEN regexp_like(ticker, '(P|C)$')
AND LENGTH(ticker) >= 4
THEN SUBSTR(ticker, 1, LENGTH(ticker)-3)
WHEN regexp_like(ticker, '\w\d\d\w\d$')
THEN SUBSTR(ticker, 1, LENGTH(ticker)-5)
WHEN regexp_like(ticker, '\w\d\w\d$')
THEN SUBSTR(ticker, 1, LENGTH(ticker)-4)
ELSE ticker
END
AS VARCHAR2(20))
) VIRTUAL);


Related Topics



Leave a reply



Submit