Help with Writing a SQL Query for Nested Sets

Writing Nested SQL queries

Something like this? Lines #1 - 26 represent sample data (you don't type that); query you need begins at line #28.

SQL> with
2 -- sample data
3 studentdata (studentid, name, course) as
4 (select 1, 'olivier', 'it' from dual union all
5 select 2, 'noah', 'business' from dual union all
6 select 3, 'jack', 'business' from dual union all
7 select 4, 'mason', 'it' from dual union all
8 select 5, 'julion', 'it' from dual),
9 subjects (subjectid, name) as
10 (select 1, 'java' from dual union all
11 select 2, 'business stg' from dual union all
12 select 3, 'python' from dual union all
13 select 4, 'statistics' from dual union all
14 select 5, 'mgt accounting' from dual union all
15 select 7, 'social studies' from dual union all
16 select 8, 'ess english' from dual),
17 marks (id, studentid, subjectid, marks) as
18 (select 1, 1, 1, 56 from dual union all
19 select 2, 1, 2, 78 from dual union all
20 select 3, 1, 7, 83 from dual union all
21 select 4, 1, 3, 45 from dual union all
22 select 5, 1, 5, 63 from dual union all
23 --
24 select 6, 2, 1, 99 from dual union all
25 select 7, 3, 2, 10 from dual union all
26 select 8, 4, 7, 83 from dual)
27 --
28 select b.name subject, s.name student, m.marks
29 from marks m join subjects b on b.subjectid = m.subjectid
30 join studentdata s on s.studentid = m.studentid
31 where m.marks = (select max(m1.marks)
32 from marks m1
33 where m1.subjectid = m.subjectid
34 )
35 order by b.name, s.name;

SUBJECT STUDENT MARKS
-------------- ------- ----------
business stg olivier 78
java noah 99
mgt accounting olivier 63
python olivier 45
social studies mason 83
social studies olivier 83

6 rows selected.

SQL>

SQL - nested set model - how do I find the deepest parents (not leaf nodes) of each branch

You're right, the first step is to identify the leaf nodes through their property right equals left + 1. Next step is to find all parent nodes for those leaf nodes: these have a left smaller than that of the leaf and right larger than that of the leaf. And the final step is to exclude all parents but the one that has the largest left value (i.e. is closest to the leaf node).

An example in T-SQL, where l is the leaf node, p1 is the direct parent we're looking for and p2 is used to weed out all the non-direct parents.

First set up a test table with your example data in it:

if object_id('tempdb..#nsm') is not null
drop table #nsm;

select v.node, v.parent, v.lft, v.rgt
into #nsm
from (
values
('A' , NULL, 1, 16),
('B' , 'A' , 2, 15),
('E' , 'B' , 3, 8),
('F' , 'E' , 4, 5),
('H' , 'E' , 6, 7),
('C' , 'B' , 9, 14),
('D' , 'C' , 10, 11),
('G' , 'C' , 12, 13)
) v (node, parent, lft, rgt)

And here's the query that retrieves the both nodes C and E you requested:

select p1.node, p1.parent, p1.lft, p1.rgt
from #nsm p1
where exists (
select *
from #nsm l
where l.rgt = l.lft + 1
and p1.lft < l.lft
and p1.rgt > l.rgt
and not exists (
select *
from #nsm p2
where p2.lft < l.lft
and p2.rgt > l.rgt
and p2.lft > p1.lft
)
)

Nested Set Query to retrieve all ancestors of each node

First it's important to understand that you have an implicit GROUP BY

If you use a group function in a statement containing no GROUP BY clause, it is equivalent to grouping on all rows.

To make the point more understandable I'll leave out subqueries and reduce the problem to the banana. Banana is the set [10, 11]. The correct sorted ancestors are those:

SELECT "banana" as node, GROUP_CONCAT(title ORDER by `left`)
FROM Tree WHERE `left` < 10 AND `right` > 11
GROUP BY node;

The ORDER BY must be in GROUP_CONCAT() as you want the aggregation function to sort. ORDER BY outside sorts by the aggregated results (i.e. the result of GROUP_CONCAT()). The fact that it worked until level 4 is just luck. ORDER BY has no effect on an aggregate function. You would get the same results with or without the ORDER BY:

SELECT GROUP_CONCAT(title)
FROM Tree WHERE `left` < 10 AND `right` > 11
/* ORDER BY `left` */

It might help to understand what
SELECT GROUP_CONCAT(title ORDER BY left) FROM Tree WHERE … ORDER BY left does:

  1. Get a selection (WHERE) which results in three rows in an undefined order:


    ("Food")
    ("Yellow")
    ("Fruit")
  2. Aggregate the result into one row (implicit GROUP BY) in order to be able to use an aggregate function:


    (("Food","Yellow", "Fruit"))
  3. Fire the aggregate function (GROUP_CONCAT(title, ORDER BY link)) on it. I.e. order by link and then concatenate:


    ("Food,Fruit,Yellow")
  4. And now finally it sorts that result (ORDER BY). As it's only one row, sorting changes nothing.


    ("Food,Fruit,Yellow")

Nested Set SQL Query to retrieve the first ancestor of each node

As I understand your question, you want to display the level 1 ancestor only. Starting from your existing query, you could just use conditional aggregation:

MAX(CASE WHEN p.level = 1 THEN p.sid END) as l1_ancestor

This is a bit closer to what you ask for, and would produce the result that you want:

CASE WHEN n.level = 1 THEN n.sid ELSE MAX(CASE WHEN p.level = 1 THEN p.sid END) END res l1_ancestor

Demo on DB Fiddle:


sid | name | ancestors | l1_ancestor
--: | :--- | :-------- | ----------:
1 | Root | null | null
2 | A | 1 | 2
3 | B | 1,2 | 2
4 | C | 1,2 | 2
5 | D | 1,2,4 | 2
6 | E | 1 | 6
7 | F | 1,6 | 6
8 | G | 1,6,7 | 6
9 | H | 1,6,7 | 6
10 | I | 1,6,7,9 | 6
11 | J | 1 | 11
12 | K | 1,11 | 11

Materializing the path of Nested Set hierarchy in T-SQL

After your comments, I realized no need for the CTE... you already have the range keys.

Example

Select A.*
,Path = Replace(Path,'>','>')
From YourTable A
Cross Apply (
Select Path = Stuff((Select ' > ' +Name
From (
Select LeftAnchor,Name
From YourTable
Where A.LeftAnchor between LeftAnchor and RightAnchor
) B1
Order By LeftAnchor
For XML Path (''))
,1,6,'')
) B
Order By LeftAnchor

Returns

Sample Image

Searching a nested set

I don't believe there's a perfectly clean and efficient way to do this with nested sets. Storing a list of the node's ancestors in a denormalized column would provide this efficiently, but I don't suggest implementing it.

There is an ok'ish method though, which is 1 query and will conveniently hit the index you already have. You're looking at one join for each level of depth of the target node.

For your example foo-bar-baz

select c3.*

from categories c1

inner join categories c2 on c2.parent_id = c1.id AND c2.name = 'bar'

inner join categories c3 on c3.parent_id = c2.id AND c2.name = 'baz'

where c1.name = 'foo'

It's not the greatest, but it's probably your best bet unless you want to go to the effort of storing a bunch of denormalized information. It's fairly straight forward to generate the SQL in code as well.



Related Topics



Leave a reply



Submit