Recursive Select

Recursive select?

This should do it:

with recursive all_posts (id, parentid, root_id) as 
(
select t1.id,
t1.parent_forum_post_id as parentid,
t1.id as root_id
from forumposts t1
where t1.parent_forum_post_id is null

union all

select c1.id,
c1.parent_forum_post_id as parentid,
p.root_id
from forumposts c1
join all_posts p on p.id = c1.parent_forum_post_id
)
select root_id, count(*)
from all_posts
group by root_id
order by root_id;

You can change the "starting" point by modifying the condition where t1.parent_forum_post_id is null.
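A minimal sketch of running this kind of per-thread count end to end, assuming an in-memory SQLite database through Python's sqlite3 module. The sample rows are made up for illustration, and the final select groups by root_id so that count(*) is computed per thread:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table forumposts (id integer primary key, parent_forum_post_id integer);
-- hypothetical sample data: two threads rooted at posts 1 and 4
insert into forumposts values (1, null), (2, 1), (3, 2), (4, null), (5, 4), (6, 1);
""")

query = """
with recursive all_posts (id, parentid, root_id) as (
    -- anchor: every post without a parent starts its own thread
    select t1.id, t1.parent_forum_post_id, t1.id
    from forumposts t1
    where t1.parent_forum_post_id is null
    union all
    -- recursive step: a reply inherits the root_id of its parent
    select c1.id, c1.parent_forum_post_id, p.root_id
    from forumposts c1
    join all_posts p on p.id = c1.parent_forum_post_id
)
select root_id, count(*)
from all_posts
group by root_id
order by root_id
"""
posts_per_thread = conn.execute(query).fetchall()
print(posts_per_thread)  # [(1, 4), (4, 2)]
```

Thread 1 contains posts 1, 2, 3 and 6, thread 4 contains posts 4 and 5.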

Recursive select in SQL

declare @T table(
Id int primary key,
Name nvarchar(255) not null,
ParentId int)

insert into @T values
(1, 'TestName1', NULL),
(2, 'TestName2', 1),
(3, 'TestName3', 2),
(4, 'TestName4', NULL),
(5, 'TestName5', 1)

declare @Id int = 1

;with cte as
(
select T.*
from @T as T
where T.Id = @Id
union all
select T.*
from @T as T
inner join cte as C
on T.ParentId = C.Id
)
select *
from cte

Result

Id          Name                 ParentId
----------- -------------------- -----------
1           TestName1            NULL
2           TestName2            1
5           TestName5            1
3           TestName3            2

Select Query with Recursion

You can use a Recursive CTE and then get the last level of "recursion" for each employee. Once you have that, you just check the manager_id of that last level to find out if it's transferred.

For example:

with
tablea as (
select 1 as uniqueId, 101 as employee_id, 102 as manager_id from dual union all
select 2 as uniqueId, 102 as employee_id, 103 as manager_id from dual union all
select 3 as uniqueId, 103 as employee_id, 104 as manager_id from dual union all
select 4 as uniqueId, 105 as employee_id, 106 as manager_id from dual union all
select 5 as uniqueId ,106 as employee_id, null from dual
),
tableb as (
select 101 as employee_id, 'first' as employee_name from dual union all
select 102 as employee_id, 'second' as employee_name from dual union all
select 103 as employee_id, 'third' as employee_name from dual union all
select 104 as employee_id, 'fourth' as employee_name from dual union all
select 105 as employee_id, 'fifth' as employee_name from dual union all
select 106 as employee_id, 'sixth' as employee_name from dual
),
n (employee_id, employee_name, lvl, manager_id) as (
select b.employee_id, b.employee_name, 1, a.manager_id
from tablea a
join tableb b on a.employee_id = b.employee_id
union all
select
n.employee_id, n.employee_name, lvl + 1, a.manager_id
from n
join tablea a on a.employee_id = n.manager_id
),
m (employee_id, max_lvl) as (
select employee_id, max(lvl) from n group by employee_id
)
select n.employee_id, n.employee_name,
case when n.manager_id is not null then 'True' else 'False' end as transferred
from n
join m on n.employee_id = m.employee_id and n.lvl = m.max_lvl
order by n.employee_id

Result:

EMPLOYEE_ID  EMPLOYEE_NAME  TRANSFERRED
-----------  -------------  -----------
101          first          True
102          second         True
103          third          True
105          fifth          False
106          sixth          False

How to select using WITH RECURSIVE clause

First of all, let us try to simplify and clarify the algorithm description given on the manual page. To simplify it, consider only union all in the with recursive clause for now (and union later):

WITH RECURSIVE pseudo-entity-name(column-names) AS (
Initial-SELECT
UNION ALL
Recursive-SELECT using pseudo-entity-name
)
Outer-SELECT using pseudo-entity-name

To clarify it, let us describe the query execution process in pseudocode:

working-recordset = result of Initial-SELECT

append working-recordset to empty outer-recordset

while( working-recordset is not empty ) begin

    new working-recordset = result of Recursive-SELECT
        taking previous working-recordset as pseudo-entity-name

    append working-recordset to outer-recordset

end

overall-result = result of Outer-SELECT
    taking outer-recordset as pseudo-entity-name

Or even shorter: the database engine executes the initial select and takes its result rows as the working set. Then it repeatedly executes the recursive select on the working set, each time replacing the contents of the working set with the query result obtained. This process ends when the recursive select returns an empty set. All result rows, given first by the initial select and then by the recursive select, are gathered and fed to the outer select, whose result becomes the overall query result.
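The loop described above can be sketched in plain Python. This is only an illustration of the described semantics, not how any particular engine is implemented. The names run_recursive and next_numbers are hypothetical, and the counter example is chosen just to have something small to iterate:

```python
def run_recursive(initial_rows, recursive_select):
    """Emulate WITH RECURSIVE ... UNION ALL on lists of tuples."""
    working = list(initial_rows)   # working-recordset = result of Initial-SELECT
    outer = list(working)          # append it to the empty outer-recordset
    while working:                 # loop until the recursive select returns nothing
        working = recursive_select(working)
        outer.extend(working)
    return outer                   # this is what the outer select gets to see


def next_numbers(working):
    # Recursive-SELECT: "select n + 1 from counter where n < 5"
    return [(n + 1,) for (n,) in working if n < 5]


rows = run_recursive([(1,)], next_numbers)
print(rows)  # [(1,), (2,), (3,), (4,), (5,)]
```

Note that the recursive step only ever sees the rows produced by the previous step, never the whole accumulated result.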

This query calculates the factorial of 3:

WITH RECURSIVE factorial(F,n) AS (
SELECT 1 F, 3 n
UNION ALL
SELECT F*n F, n-1 n from factorial where n>1
)
SELECT F from factorial where n=1

The initial select SELECT 1 F, 3 n gives us the initial values: 3 for the argument and 1 for the function value.

The recursive select SELECT F*n F, n-1 n from factorial where n>1 states that on every step we multiply the last function value by the last argument value and decrement the argument value.

The database engine executes it like this:

First of all it executes the initial select, which gives the initial state of the working recordset:

F | n
--+--
1 | 3

Then it transforms the working recordset with the recursive query and obtains its second state:

F | n
--+--
3 | 2

Then third state:

F | n
--+--
6 | 1

In the third state there is no row that satisfies the n>1 condition in the recursive select, so the fourth working set is empty and the loop exits.

The outer recordset now holds all the rows returned by the initial and recursive selects:

F | n
--+--
1 | 3
3 | 2
6 | 1

The outer select filters out all intermediate results from the outer recordset, showing only the final factorial value, which becomes the overall query result:

F 
--
6
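To watch these states yourself, here is a small sketch that runs the same factorial query on an in-memory SQLite database via Python's sqlite3 module. Using SQLite is an assumption made for convenience, and the function name factorial_rows is hypothetical. The outer select is left unfiltered on purpose so that all intermediate rows are visible:

```python
import sqlite3


def factorial_rows(n):
    """Return every (F, n) row the recursive CTE produces for the given n."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(
            """
            with recursive factorial(F, n) as (
                select 1, ?                      -- initial select
                union all
                select F * n, n - 1              -- recursive select
                from factorial
                where n > 1                      -- stop condition
            )
            select F, n from factorial
            """,
            (n,),
        ).fetchall()
    finally:
        conn.close()


rows = factorial_rows(3)
print(rows)         # [(1, 3), (3, 2), (6, 1)]
print(rows[-1][0])  # 6, the row where n = 1
```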

And now let us consider table forest(id,parent_id,name):

id | parent_id | name
---+-----------+-----------------
 1 |           | item 1
 2 |         1 | subitem 1.1
 3 |         1 | subitem 1.2
 4 |         1 | subitem 1.3
 5 |         3 | subsubitem 1.2.1
 6 |           | item 2
 7 |         6 | subitem 2.1
 8 |           | item 3

'Expanding full tree' here means sorting tree items in human-readable depth-first order while calculating their levels and (maybe) paths. Both tasks (of correct sorting and calculating level or path) are not solvable in one (or even any constant number of) SELECT without using WITH RECURSIVE clause (or Oracle CONNECT BY clause, which is not supported by PostgreSQL). But this recursive query does the job (well, almost does, see the note below):

WITH RECURSIVE fulltree(id,parent_id,level,name,path) AS (
SELECT id, parent_id, 1 as level, name, name||'' as path from forest where parent_id is null
UNION ALL
SELECT t.id, t.parent_id, ft.level+1 as level, t.name, ft.path||' / '||t.name as path
from forest t, fulltree ft where t.parent_id = ft.id
)
SELECT * from fulltree order by path

The database engine executes it like this:

First, it executes the initial select, which gives all top-level items (roots) from the forest table:

id | parent_id | level | name             | path
---+-----------+-------+------------------+----------------------------------------
 1 |           |     1 | item 1           | item 1
 8 |           |     1 | item 3           | item 3
 6 |           |     1 | item 2           | item 2

Then it executes the recursive select, which gives all 2nd level items from the forest table:

id | parent_id | level | name             | path
---+-----------+-------+------------------+----------------------------------------
 2 |         1 |     2 | subitem 1.1      | item 1 / subitem 1.1
 3 |         1 |     2 | subitem 1.2      | item 1 / subitem 1.2
 4 |         1 |     2 | subitem 1.3      | item 1 / subitem 1.3
 7 |         6 |     2 | subitem 2.1      | item 2 / subitem 2.1

Then it executes the recursive select again, retrieving the 3rd level items:

id | parent_id | level | name             | path
---+-----------+-------+------------------+----------------------------------------
 5 |         3 |     3 | subsubitem 1.2.1 | item 1 / subitem 1.2 / subsubitem 1.2.1

And now it executes the recursive select again, trying to retrieve 4th level items, but there are none, so the loop exits.

The outer SELECT sets the correct human-readable row order by sorting on the path column:

id | parent_id | level | name             | path
---+-----------+-------+------------------+----------------------------------------
 1 |           |     1 | item 1           | item 1
 2 |         1 |     2 | subitem 1.1      | item 1 / subitem 1.1
 3 |         1 |     2 | subitem 1.2      | item 1 / subitem 1.2
 5 |         3 |     3 | subsubitem 1.2.1 | item 1 / subitem 1.2 / subsubitem 1.2.1
 4 |         1 |     2 | subitem 1.3      | item 1 / subitem 1.3
 6 |           |     1 | item 2           | item 2
 7 |         6 |     2 | subitem 2.1      | item 2 / subitem 2.1
 8 |           |     1 | item 3           | item 3

NOTE: The resulting row order stays correct only while no item name contains a punctuation character that collates before /. If we rename item 2 to item 1 *, the row order breaks: the renamed item lands between item 1 and its descendants.

A more stable solution is to use the tab character (E'\t') as the path separator in the query (it can be replaced by a more readable separator later: in the outer select, before displaying to a human, and so on). Tab-separated paths keep the correct order as long as there are no tabs or control characters in the item names, which can easily be checked and ruled out without loss of usability.
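The effect of the separator can be checked with a small sketch. It assumes an in-memory SQLite database (through Python's sqlite3 module) with its default byte-wise text comparison, so other engines and collations may order rows differently. The forest rows are the ones from the table above, with item 2 already renamed to item 1 *, and the helper name tree_order is hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table forest (id integer primary key, parent_id integer, name text);
insert into forest values
    (1, null, 'item 1'), (2, 1, 'subitem 1.1'), (3, 1, 'subitem 1.2'),
    (4, 1, 'subitem 1.3'), (5, 3, 'subsubitem 1.2.1'),
    (6, null, 'item 1 *'),          -- item 2 renamed to break the ordering
    (7, 6, 'subitem 2.1'), (8, null, 'item 3');
""")


def tree_order(separator):
    """Return ids in the order produced by sorting on the path column."""
    rows = conn.execute(
        """
        with recursive fulltree(id, path) as (
            select id, name from forest where parent_id is null
            union all
            select t.id, ft.path || ? || t.name
            from forest t
            join fulltree ft on t.parent_id = ft.id
        )
        select id from fulltree order by path
        """,
        (separator,),
    ).fetchall()
    return [row[0] for row in rows]


print(tree_order(" / "))  # [1, 6, 7, 2, 3, 5, 4, 8]  renamed item splits the subtree
print(tree_order("\t"))   # [1, 2, 3, 5, 4, 6, 7, 8]  subtree of item 1 stays together
```

With ' / ' the renamed item (id 6) and its child sort between item 1 and its descendants. With a tab, every descendant path of item 1 sorts before 'item 1 *', because the tab compares lower than the space that follows 'item 1' in the renamed item.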

It is very simple to modify the last query to expand an arbitrary subtree: you only need to replace the condition parent_id is null with parent_id=1 (for example). Note that this query variant will return all levels and paths relative to item 1.

And now about typical mistakes. The most notable mistake specific to recursive queries is defining a faulty stop condition in the recursive select, which results in infinite looping.

For example, if we omit the where n>1 condition in the factorial sample above, the recursive select will never return an empty set (because no condition filters out the single row) and the looping will continue indefinitely.

That is the most probable reason why some of your queries hang (the other, non-specific but still possible, reason is a very inefficient select, which executes in finite but very long time).

There are not many RECURSIVE-specific querying guidelines to mention, as far as I know. But I would like to suggest a (rather obvious) step-by-step procedure for building a recursive query.

  • Separately build and debug your initial select.

  • Wrap it with the scaffolding WITH RECURSIVE construct and begin building and debugging your recursive select.

The recommended scaffolding construct looks like this:

WITH RECURSIVE rec( <Your column names> ) AS (
<Your ready and working initial SELECT>
UNION ALL
<Recursive SELECT that you are debugging now>
)
SELECT * from rec limit 1000

This simplest outer select outputs the whole outer recordset, which, as we know, contains all output rows from the initial select and from every execution of the recursive select in the loop, in their original output order - just like in the samples above! The limit 1000 part prevents hanging, replacing it with oversized output in which you will be able to see the missed stop point.

  • After debugging the initial and recursive selects, build and debug your outer select.
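As an illustration of debugging with a row limit, here is a sketch that runs the factorial query with its stop condition deliberately left out. It assumes SQLite through Python's sqlite3 module, and it places the limit inside the CTE, which is the guard SQLite documents for recursive queries. Whether a limit on the outer select alone stops the recursion depends on the engine (PostgreSQL documents that it does), so treat the placement here as an assumption of this sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
rows = conn.execute(
    """
    with recursive factorial(F, n) as (
        select 1, 3
        union all
        select F * n, n - 1 from factorial   -- "where n > 1" is missing
        limit 6                              -- guard: at most 6 rows in total
    )
    select F, n from factorial
    """
).fetchall()

for row in rows:
    print(row)
# (1, 3)
# (3, 2)
# (6, 1)   <- the recursion should have stopped here
# (6, 0)
# (0, -1)
# (0, -2)
```

Instead of hanging, the query returns a short, oversized result in which the missed stop point (the row with n = 1) is easy to spot.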

And now the last thing to mention: the difference between using union and union all in the with recursive clause. Union introduces a row uniqueness constraint, which results in two extra lines in our execution pseudocode:

working-recordset = result of Initial-SELECT

discard duplicate rows from working-recordset /*union-specific*/

append working-recordset to empty outer-recordset

while( working-recordset is not empty ) begin

    new working-recordset = result of Recursive-SELECT
        taking previous working-recordset as pseudo-entity-name

    discard duplicate rows and rows that have duplicates in outer-recordset
        from working-recordset /*union-specific*/

    append working-recordset to outer-recordset

end

overall-result = result of Outer-SELECT
    taking outer-recordset as pseudo-entity-name
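One practical consequence of the uniqueness rule is that union can terminate on cyclic data where union all would loop forever. A minimal sketch, assuming an in-memory SQLite database via Python's sqlite3 module and a hypothetical edges(src, dst) table that contains a cycle:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table edges (src text, dst text);
-- hypothetical graph with a cycle: a -> b -> c -> a
insert into edges values ('a', 'b'), ('b', 'c'), ('c', 'a');
""")

reachable = [
    row[0]
    for row in conn.execute(
        """
        with recursive reach(node) as (
            select 'a'
            union                 -- rows already in the result are discarded
            select e.dst
            from edges e
            join reach r on e.src = r.node
        )
        select node from reach order by node
        """
    )
]
print(reachable)  # ['a', 'b', 'c']
```

When the walk comes back to 'a', that row is already in the outer recordset, so it is discarded, the working set becomes empty and the loop ends. With union all the same query would keep producing 'a', 'b', 'c' without end.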

Recursive CTE confusion in looping of the recursive member

Every iteration of the recursive member selects the subordinates of the rows found by the previous iteration:

  • The anchor query retrieves all top level manager(s) who don't have any bosses above them.

  • Then, on iteration #1 the recursive query retrieves the immediate subordinates of the above, using the predicate Employees.ManagerID = EmployeesCTE.EmployeeID.

  • Then, on iteration #2 the recursive query only considers the extra rows retrieved in iteration #1 and applies the same predicate above to retrieve the subordinates of these.

  • Iteration #3 and so on, keep on considering only "newly added rows" every time, until no more rows are found in an iteration. When this happens the recursive CTE is complete and its execution ends.
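The iterations listed above can be made visible by carrying an iteration counter through the CTE. The sketch below assumes an in-memory SQLite database via Python's sqlite3 module. The Employees rows and the extra Iteration column are hypothetical, and only the join predicate Employees.ManagerID = EmployeesCTE.EmployeeID comes from the description:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Employees (EmployeeID integer primary key, Name text, ManagerID integer);
-- hypothetical sample data: Ann is the only top-level manager
insert into Employees values
    (1, 'Ann', null), (2, 'Bob', 1), (3, 'Cid', 1), (4, 'Dee', 2), (5, 'Eve', 4);
""")

rows = conn.execute(
    """
    with recursive EmployeesCTE (EmployeeID, Name, Iteration) as (
        -- anchor: managers with nobody above them
        select EmployeeID, Name, 0 from Employees where ManagerID is null
        union all
        -- recursive member: subordinates of the rows found one step earlier
        select e.EmployeeID, e.Name, c.Iteration + 1
        from Employees e
        join EmployeesCTE c on e.ManagerID = c.EmployeeID
    )
    select Iteration, EmployeeID, Name
    from EmployeesCTE
    order by Iteration, EmployeeID
    """
).fetchall()

for row in rows:
    print(row)
# (0, 1, 'Ann')
# (1, 2, 'Bob')
# (1, 3, 'Cid')
# (2, 4, 'Dee')
# (3, 5, 'Eve')
```

Each employee appears exactly once, tagged with the iteration that found them. The query stops after iteration 3 because nobody reports to Eve.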


