Case and Coalesce Short-Circuit Evaluation Works with Sequences in Pl/SQL But Not in SQL

CASE and COALESCE short-circuit evaluation works with sequences in PL/SQL but not in SQL

Explanation of why the short-circuit evaluation does not apply to sequences might be the following. What is a sequence? Putting internals aside, it's a combination of sequence definition(record in seq$ data dictionary table) and some internal SGA component, it's not a function and might be considered, although the documentation does not state it directly(but execution plan does) as row source. And every time a sequence is being referenced directly in the select list of a query, it has to be evaluated by the optimizer when it searches for optimal execution plan. During the process of forming an optimal execution plan a sequence gets incremented if nextval pseudocolumn is referenced:

SQL> create sequence seq1;
Sequence created

Here is our sequence:

SQL> select o.obj#
2 , o.name
3 , s.increment$
4 , s.minvalue
5 , s.maxvalue
6 , s.cache
7 from sys.seq$ s
8 join sys.obj$ o
9 on (o.obj# = s.obj#)
10 where o.name = 'SEQ1'
11 ;

OBJ# NAME INCREMENT$ MINVALUE MAXVALUE CACHE
---------- ------- ---------- ---------- ---------- ----------
94442 SEQ1 1 1 1E28 20

Lets trace below query, and also take a look at its execution plan

SQL> ALTER SESSION SET EVENTS '10046 trace name context forever, level 4';
Session altered

SQL> select case
2 when 1 = 1 then 1
3 when 2 = 1 then seq1.nextval
4 end as res
5 from dual;

RES
----------
1

/* sequence got incremented by 1 */

SQL> select seq1.currval from dual;

CURRVAL
----------
3

Trace file information:

STAT #1016171528 id=1 cnt=1 pid=0 pos=1 obj=94442 op='SEQUENCE SEQ1 ...

STAT #1016171528 id=2 cnt=1 pid=1 pos=1 obj=0 op='FAST DUAL ...

CLOSE #1016171528:c=0,e=12,dep=0,type=0,tim=12896600071500 /* close the cursor */

The execution plan will show us basically the same:

SQL> explain plan for select case
2 when 1 = 1 then 1
3 else seq1.nextval
4 end
5 from dual
6 /
Explained
Executed in 0 seconds

SQL> select * from table(dbms_xplan.display());
PLAN_TABLE_OUTPUT
---------------------------------------------------------------
Plan hash value: 51561390
-----------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 (0)| 00:00:01 |
| 1 | SEQUENCE | SEQ1 | | | |
| 2 | FAST DUAL | | 1 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------
9 rows selected
Executed in 0.172 seconds

In terms of evaluation, referencing a sequence directly in a query, roughly the same as including a correlated sub-query. That correlated sub-query will always be evaluated by the optimizer:

SQL> explain plan for select case
2 when 1 = 1 then 1
3 when 2 = 1 then (select 1
4 from dual)
5 end as res
6 from dual;
Explained
Executed in 0 seconds

SQL> select * from table(dbms_xplan.display());
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------
Plan hash value: 1317351201
-----------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 (0)| 00:00:01 |
| 1 | FAST DUAL | | 1 | 2 (0)| 00:00:01 |
| 2 | FAST DUAL | | 1 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------
9 rows selected
Executed in 0.063 seconds

We can see that dual table has been included in the execution plan twice.

The analogy with a sub-query was made in a rush. There are more differences than similarities, of course. Sequences are absolutely different mechanisms. But, a sequences are viewed by the optimizer as a row source, and as long as it doesn't see the nextval pseudocolumn of a sequence being directly referenced in the select list of a top-level query, it won't evaluate the sequence, otherwise sequence will be incremented, whether a short-circuit evaluation logic is being used or not. PL/SQL engine,obviously, (starting from Oracle 11g r1) has a different way to access a sequence value. Should be noted that in previous 11gR1 versions of RDBMS we should write a query to reference a sequence in PL/SQL block, which PL/SQL engine sent directly to the SQL engine.

The answer to the "why a sequence gets incremented during generating an execution plan by the optimizer" question, lies in the internal implementation of sequences.

Oracle CASE short-circuit not working in group by

Use

CASE WHEN cond = 2 
THEN SUM(case when div = 0 then 0 else num / div end)
ELSE -1
END

COALESCE - guaranteed to short-circuit?

I just had a look at the linked article and can confirm short circuiting can fail for both COALESCE and ISNULL.

It seems to fail if you have any sub-query involved, but it works fine for scalar functions and hard coded values.

For example,

DECLARE @test INT
SET @test = 1
PRINT 'test2'
SET @test = COALESCE(@test, (SELECT COUNT(*) FROM sysobjects))
SELECT 'test2', @test
-- OUCH, a scan through sysobjects

COALESCE is implemented according to the ANSI standard. It is simply a shorthand for a CASE statement. ISNULL is not part of the ANSI standard. Section 6.9 does not seem to require short circuiting explicitly, but it does imply that the first true clause in the when statement should be returned.

Here is some proof that is works for scalar based functions (I ran it on SQL Server 2005):

CREATE FUNCTION dbo.evil
(
)
RETURNS int
AS
BEGIN
-- Create an huge delay
declare @c int
select @c = count(*) from sysobjects a
join sysobjects b on 1=1
join sysobjects c on 1=1
join sysobjects d on 1=1
join sysobjects e on 1=1
join sysobjects f on 1=1
return @c / 0
END
go

select dbo.evil()
-- takes forever

select ISNULL(1, dbo.evil())
-- very fast

select COALESCE(1, dbo.evil())
-- very fast

Here is some proof that the underlying implementation with CASE will execute sub queries.

DECLARE @test INT
SET @test = 1
select
case
when @test is not null then @test
when @test = 2 then (SELECT COUNT(*) FROM sysobjects)
when 1=0 then (SELECT COUNT(*) FROM sysobjects)
else (SELECT COUNT(*) FROM sysobjects)
end
-- OUCH, two table scans. If 1=0, it does not result in a table scan.

CASE expression that does NOT have short-circuit evaluation?

If you are indeed just adding 1 to counting_column for each condition that is met, you can use a chain of n CASE statements and add the results together. Each individual CASE supplies a 0 or 1, all of which are added together and added to the current value of counting_column.

UPDATE [table] SET [counting_column] =
(
[counting_column] +
(CASE when [column1] IS NULL then 1 ELSE 0 END) +
(CASE when [column2] IS NULL then 1 ELSE 0 END) +
...
...
(CASE when [column30] IS NULL then 1 ELSE 0 END) +
(CASE when [column31] IS NULL then 1 ELSE 0 END)
)

Note that I have changed the = NULL to IS NULL. NULL is a special non-value that cannot be compared with an equality =.

I'll also point out that although [] enclosed identifiers are ok in SQLite, this is mostly for compatibility with T-SQL. The more standard approach is to use double-quotes as in UPDATE "table" SET "counting_column" = ...

How to prevent Oracle to use short circuit in PL/SQL

Try:

declare
valid boolean;
con1 boolean;
con2 boolean;
con3 boolean;
function ensure(b boolean, failure_message varchar) return boolean is begin
if not b then
dbms_output.put_line(failure_message);
return false;
else
return true;
end if;
end ensure;
begin
valid := true;
con1 := ensure(1=1, 'condition 1 failed') ;
con2 := ensure(1=2, 'condition 2 failed') ;
con3 := ensure(2=3, 'condition 3 failed');
valid := con1 AND con2 AND con3;
if not valid then
dbms_output.put_line('some conditions failed, terminate the program');
return;
end if;

dbms_output.put_line('do the work');
end;
/

Is the SQL WHERE clause short-circuit evaluated?

ANSI SQL Draft 2003 5WD-01-Framework-2003-09.pdf

6.3.3.3 Rule evaluation order

[...]

Where the precedence is not determined by the Formats or by
parentheses, effective evaluation of expressions is generally
performed from left to right. However, it is
implementation-dependent whether expressions are actually evaluated left to right, particularly when operands or operators might
cause conditions to be raised or if the results of the expressions
can be determined without completely evaluating all parts of the
expression.

SQLServer CASE expressions - short circuit evaluation?

Is that an actual or estimated plan? Sql Server builds plans based on what it expects to do based on collected statistics, and that doesn't always jibe with what specific conditions you send it for one instance of a query run.

LISTAGG is evaluated and fails in an unreachable case statement

The final evaluation of the select list columns/expressions, including the short-circuited case expression, happens after the data has been retrieved. Any grouping etc. has already been done by that point.

This effect doesn't only happen with listagg(), it can be seen with any aggregate or analytic function call in the return expression - though it's hard to spot unless there are side effects.

As a demo I created a simple package that has a function I can call from the the query:

create package p as
n number := 0;
function f return number;
end;
/

create package body p as
function f return number as
begin
n := n + 1;
return n;
end;
end;
/

This is essentially emulating a session-specific sequence; sequences also demonstrate this behaviour, but appearently for a different reason so I didn't want to use one for this.

Calling that function inside the case expression does what you'd expect; it is only called when the condition is matched:

select dep,
case
when dep = 'BAR' then
p.f
else
-1
end as id_list
from emp;

DEP ID_LIST
---------- -------
FOO -1
...
BAR 1
BAR 2
BAR 3
FOO -1

select p.f from dual;

F
----------
4

The function was only called when the condition was matched. The execution plan for that shows just a full table scan:

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 13 | 91 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| EMP | 13 | 91 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------

With an aggregate call instead:

select dep,
case
when dep = 'BAR' then
count(p.f)
else
-1
end as id_list
from emp
group by dep;

DEP ID_LIST
---------- -------
FOO -1
BAR 3

select p.f from dual;

F
----------
18

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 13 | 91 | 4 (25)| 00:00:01 |
| 1 | HASH GROUP BY | | 13 | 91 | 4 (25)| 00:00:01 |
| 2 | TABLE ACCESS FULL| EMP | 13 | 91 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------

... the function was called 13 times instead of 3; the plan shows the hash group by step, which has to have happened across all of the retrieved rows before the case is evaluated.

Similarly for an analytic version:

select dep,
case
when dep = 'BAR' then
count(p.f) over (partition by dep)
else
-1
end as id_list
from emp;

DEP ID_LIST
---------- -------
BAR 3
BAR 3
BAR 3
FOO -1
...

select p.f from dual;

F
----------
32

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 13 | 91 | 4 (25)| 00:00:01 |
| 1 | WINDOW SORT | | 13 | 91 | 4 (25)| 00:00:01 |
| 2 | TABLE ACCESS FULL| EMP | 13 | 91 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------

... again the function was called 13 times, as the window sort (and hence the analytic calculation) was done before the case expression could be evaluated.

So the issue is not really that the return expression (listagg() in your case) is being evaluated within the case expression when it shouldn't be; it's being evaluated and throwing the exception before the case expression conditions are even considered.



Related Topics



Leave a reply



Submit