How to Create a Closure Table Using Data from an Adjacency List

How can I create a closure table using data from an adjacency list?

I think I've been able to work out the solution myself.

If anyone has a better way of doing this, please comment.

IF OBJECT_ID('dbo.ClosureTable', 'U') IS NOT NULL
DROP TABLE dbo.ClosureTable
GO

CREATE TABLE dbo.ClosureTable (
ancestor int NOT NULL,
descendant int NOT NULL,
distance int NULL
)
GO

DECLARE @depth INT
SET @depth = 1

INSERT INTO dbo.ClosureTable (ancestor, descendant, distance)
SELECT catid, catid, 0 FROM dbo.Category -- insert all the self-referencing nodes

WHILE (@depth < 4) -- my tree is only 4 levels deep, i.e 0 - 3
BEGIN
INSERT INTO dbo.ClosureTable (ancestor, descendant, distance)
SELECT ct.ancestor, h.catid, @depth
FROM dbo.ClosureTable ct INNER JOIN dbo.CategoryHierarchy h ON ct.descendant = h.parentid
WHERE ct.distance = @depth - 1

SET @depth = @depth + 1
END

Cheers :)

SQL query to find final overridden id in adjacency list / closure table

Solution I ended up going with, was to build a second closure table which represented the hierarchy of the overrides (so similar to building a closure based on parent_id, but instead using overrides_id). This allows me to then query to find the entire hierarchy of overrides for any given attribute.

I had hoped to find a query-to-end-them-all, but going with a second closure table was simpler, better on performance, etc.

MySql sorting hierarchical data in a closure table that has repeated nodes

I would suggest splitting the node id into two concepts. One would be a unique id that is used for the graph properties (i.e. ancestor_descendant list). The second is what you show on output.

  • 1
    • 2
      • 5
    • 3
      • 50
    • 4
      • 6
      • 20
        • 51

Then create a mapping table:

Id      Value
1 1
2 2
20 2
3 3
4 4
5 5
50 5
51 5
6 6

You can then get what you want by joining back to the mapping table and using the value column instead of the id column.

codeIgniter closure table model

I would recommend for you to pick up the SQL Antipatterns book. The second chapter feature closure tables as one of recommended ways for implementing category trees.

That said. Looks like your closure table is a bit strange. There is no point in id column there. Instead you should have a composite primary key, made from unique pairs of ancestor and descendant value.

And you have not inserted the nodes themselves .. only the connection between two different nodes. Maybe reading "Rendering Trees with Closure Tables" could shine some light on the subject.

At a guess the INSERT statement should look like this (at least that my conclusion):

INSERT INTO closures(ancestor, descendant, lvl) 
VALUES (1, 1, null),
(20, 20, null),
(26, 26, null),
(28, 28, null),
(1, 20, 1),
(20, 26, 2),
(26, 25, 3);

What you have to understand is that closure tables are not storing a tree. Instead, the data structure you are working with, is directional graph. Something like this:

Sample Image

As you can see, this graph has three root nodes: 3, 5 and 7. Also, it is very important to note, that node 10 is a different levels of depth, depending on from which root node you begin.

It would be defined with two closures: [3,10,1] and [11,10,2]. Meaning, that the connection from 11th node would put it a level two, while starting from 3rd node, it's a first level item.

Thing is, when you are using closure tables each category can have multiple parent categories each at different level of depth.


Addition (by @ypercube):

My understanding of the "level" or "depth" column is that it stores the "distance" (steps needed to go) from ancestor to descedant. It's not an absolute level of a node and thus closure tables can be used to store more complex than trees graphs. You may even have multiple paths from an ancestor to a descendant, each one (path) with different steps.

Additionally, the Nulls should be 0 and a few more rows are needed.

So, the data would be:

INSERT INTO closures(ancestor, descendant, lvl) 
VALUES ( 1, 1, 0), (20, 20, 0), (26, 26, 0), (25, 25, 0),
( 1, 20, 1), (20, 26, 1), (26, 25, 1),
( 1, 26, 2), (20, 25, 2),
( 1, 25, 3) ;

Managing hierarchies in SQL: MPTT/nested sets vs adjacency lists vs storing paths

You might also consider the Closure Table design I describe in my answer to What is the most efficient/elegant way to parse a flat table into a tree?

Calls required to create/delete/move a node:

  • Closure = 1

Calls required to get a tree:

  • Closure = 1

Calls required to get path to a node / ancestry:

  • Closure = 1

Calls required to get number of subnodes:

  • Closure = 1

Calls required to get depth of node:

  • Closure = 1

DB fields required:

  • Adjancency = 1 more field / row
  • Path = 1 more field / row
  • MPTT = 2 or 3 more fields / row
  • Closure = 2 or 3 fields in extra table. This table has O(n^2) rows worst case but far fewer than that in most practical cases.

There are a couple of other considerations:

Supports unlimited depth:

  • Adjacency = yes
  • MPTT = yes
  • Path = no
  • Closure = yes

Supports referential integrity:

  • Adjacency = yes
  • MPTT = no
  • Path = no
  • Closure = yes

I also cover Closure Table in my presentation Models for Hierarchical Data with SQL and PHP, and my book, SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.



Related Topics



Leave a reply



Submit