Execution Order of Conditions in SQL 'Where' Clause

Execution order of conditions in SQL 'where' clause

Are you sure you "don't have the authority" to see an execution plan? What about using AUTOTRACE?

SQL> set autotrace on
SQL> select * from emp
2 join dept on dept.deptno = emp.deptno
3 where emp.ename like 'K%'
4 and dept.loc like 'l%'
5 /

no rows selected

Execution Plan
----------------------------------------------------------

----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 62 | 4 (0)|
| 1 | NESTED LOOPS | | 1 | 62 | 4 (0)|
|* 2 | TABLE ACCESS FULL | EMP | 1 | 42 | 3 (0)|
|* 3 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 20 | 1 (0)|
|* 4 | INDEX UNIQUE SCAN | SYS_C0042912 | 1 | | 0 (0)|
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("EMP"."ENAME" LIKE 'K%' AND "EMP"."DEPTNO" IS NOT NULL)
3 - filter("DEPT"."LOC" LIKE 'l%')
4 - access("DEPT"."DEPTNO"="EMP"."DEPTNO")

As you can see, that gives quite a lot of detail about how the query will be executed. It tells me that:

  • the condition "emp.ename like 'K%'" will be applied first, on the full scan of EMP
  • then the matching DEPT records will be selected via the index on dept.deptno (via the NESTED LOOPS method)
  • finally the filter "dept.loc like 'l%' will be applied.

This order of application has nothing to do with the way the predicates are ordered in the WHERE clause, as we can show with this re-ordered query:

SQL> select * from emp
2 join dept on dept.deptno = emp.deptno
3 where dept.loc like 'l%'
4 and emp.ename like 'K%';

no rows selected

Execution Plan
----------------------------------------------------------

----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 62 | 4 (0)|
| 1 | NESTED LOOPS | | 1 | 62 | 4 (0)|
|* 2 | TABLE ACCESS FULL | EMP | 1 | 42 | 3 (0)|
|* 3 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 20 | 1 (0)|
|* 4 | INDEX UNIQUE SCAN | SYS_C0042912 | 1 | | 0 (0)|
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("EMP"."ENAME" LIKE 'K%' AND "EMP"."DEPTNO" IS NOT NULL)
3 - filter("DEPT"."LOC" LIKE 'l%')
4 - access("DEPT"."DEPTNO"="EMP"."DEPTNO")

Does the order of where clauses matter in SQL?

No, that order doesn't matter (or at least: shouldn't matter).

Any decent query optimizer will look at all the parts of the WHERE clause and figure out the most efficient way to satisfy that query.

I know the SQL Server query optimizer will pick a suitable index - no matter which order you have your two conditions in. I assume other RDBMS will have similar strategies.

What does matter is whether or not you have a suitable index for this!

In the case of SQL Server, it will likely use an index if you have:

  • an index on (LastName, FirstName)
  • an index on (FirstName, LastName)
  • an index on just (LastName), or just (FirstName) (or both)

On the other hand - again for SQL Server - if you use SELECT * to grab all columns from a table, and the table is rather small, then there's a good chance the query optimizer will just do a table (or clustered index) scan instead of using an index (because the lookup into the full data page to get all other columns just gets too expensive very quickly).

SQL - Does the order of WHERE conditions matter?

No, the order of the WHERE clauses does not matter.

The optimizer reviews the query & determines the best means of getting the data based on indexes and such. Even if there were a covering index on the category_id and author columns - either would satisfy the criteria to use it (assuming there isn't something better).

Does the order of conditions in a WHERE clause affect MySQL performance?

No, the order should not make a large difference. When finding which rows match the condition, the condition as a whole (all of the sub-conditions combined via boolean logic) is examined for each row.

Some intelligent DB engines will attempt to guess which parts of the condition can be evaluated faster (for instance, things that don't use built-in functions) and evaluate those first, and more complex (estimatedly) elements get evaluated later. This is something determined by the DB engine though, not the SQL.

Execution order of WHEN clauses in a CASE statement

The value that is returned will be the value of the THEN expression for the earliest WHEN clause (textually) that matches. That does mean that if your line 2 conditions are met, the result will be A2.

But, if your THEN expressions were more complex than just literal values, some of the work to evaluate those expressions may happen even when that expression is not required.

E.g.

 WHEN r.code= '00'                        then 'A1'
WHEN r.code ='01' AND r.source = 'PXWeb' then 'A2'
WHEN r.code ='0120' then 1/0
WHEN r.code ='01' then 'A4'

could generate a division by zero error even if r.code isn't equal to 0120, and even if it's equal to 00, say. I don't know what the standard has to say on this particular issue but I know that it is true of some products.

Which performs first WHERE clause or JOIN clause

The conceptual order of query processing is:

1. FROM
2. WHERE
3. GROUP BY
4. HAVING
5. SELECT
6. ORDER BY

But this is just a conceptual order. In fact the engine may decide to rearrange clauses. Here is proof. Let's make 2 tables with 1000000 rows each:

CREATE TABLE test1 (id INT IDENTITY(1, 1), name VARCHAR(10))
CREATE TABLE test2 (id INT IDENTITY(1, 1), name VARCHAR(10))

;WITH cte AS(SELECT -1 + ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) d FROM
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t1(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t2(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t3(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t4(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t5(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t6(n))

INSERT INTO test1(name) SELECT 'a' FROM cte

Now run 2 queries:

SELECT * FROM dbo.test1 t1
JOIN dbo.test2 t2 ON t2.id = t1.id AND t2.id = 100
WHERE t1.id > 1

SELECT * FROM dbo.test1 t1
JOIN dbo.test2 t2 ON t2.id = t1.id
WHERE t1.id = 1

Notice that the first query will filter most rows out in the join condition, but the second query filters in the where condition. Look at the produced plans:

1 TableScan - Predicate:[Test].[dbo].[test2].[id] as [t2].[id]=(100)

2 TableScan - Predicate:[Test].[dbo].[test2].[id] as [t2].[id]=(1)

This means that in the first query optimized, the engine decided first to evaluate the join condition to filter out rows. In the second query, it evaluated the where clause first.

SQL Query execution sequence in WHERE clause

The answer depends on the indexes that you make available to the SQL engine. If userid is indexed but username is not, the engine is likely to start with userid; if username is indexed but userid is not, then the engine would probably start with the name; if both fields are indexed, the engine will search that index, and look up the rows by internal ids. Note that all of this is highly dependent on the RDBMS that you are using. Some engines would forego index searches altogether in favor of full table scans when the number of rows is low.



Related Topics



Leave a reply



Submit