﻿ Recursive Query Used for Transitive Closure - ITCodar

# Recursive Query Used for Transitive Closure

## Recursive query used for transitive closure

You can simplify in several places (assuming `acct_id` and `parent_id` are `NOT NULL`):

``WITH RECURSIVE search_graph AS (   SELECT parent_id, ARRAY[acct_id] AS path   FROM   account   UNION  ALL   SELECT g.parent_id, sg.path || g.acct_id   FROM   search_graph sg   JOIN   account g ON g.acct_id = sg.parent_id    WHERE  g.acct_id <> ALL(sg.path)   )SELECT path[1] AS child     , path[array_upper(path,1)] AS parent     , pathFROM   search_graphORDER  BY path;``
• The columns `acct_id`, `depth`, `cycle` are just noise in your query.
• The `WHERE` condition has to exit the recursion one step earlier, before the duplicate entry from the top node is in the result. That was an "off-by-one" in your original.

The rest is formatting.

If you know the only possible circle in your graph is a self-reference, we can have that cheaper:

``WITH RECURSIVE search_graph AS (   SELECT parent_id, ARRAY[acct_id] AS path, acct_id <> parent_id AS keep_going   FROM   account   UNION  ALL   SELECT g.parent_id, sg.path || g.acct_id, g.acct_id <> g.parent_id   FROM   search_graph sg   JOIN   account g ON g.acct_id = sg.parent_id    WHERE  sg.keep_going)SELECT path[1] AS child     , path[array_upper(path,1)] AS parent     , pathFROM   search_graphORDER  BY path;``

SQL Fiddle.

Note there would be problems (at least up to pg v9.4) for data types with a modifier (like `varchar(5)`) because array concatenation loses the modifier but the rCTE insists on types matching exactly:

• Surprising results for data types with type modifier

## Recursive query challenge - simple parent/child example

With help from RhodiumToad on #postgresql, I've arrived at this solution:

``WITH RECURSIVE node_graph AS (    SELECT ancestor_node_id as path_start, descendant_node_id as path_end,           array[ancestor_node_id, descendant_node_id] as path     FROM node_relations    UNION ALL     SELECT ng.path_start, nr.descendant_node_id as path_end,           ng.path || nr.descendant_node_id as path    FROM node_graph ng    JOIN node_relations nr ON ng.path_end = nr.ancestor_node_id) SELECT * from node_graph order by path_start, array_length(path,1);``

The result is exactly as expected.

## Transitive-Closure Table Restructure

Yes, if you want to store the transitive closure, you need to store all paths.

For some operations, it's even helpful to store the path of length 0:

``| component | component_of | depth ||-----------|--------------|-------||  D        | D            | 0     ||  D        | A            | 1     ||  D        | B            | 2     ||  C        | C            | 0     ||  B        | B            | 0     ||  B        | A            | 1     ||  A        | A            | 0     |``

In MySQL 8.0, none of this will be needed. We'll finally be able to use recursive queries.

## How to CONSTRUCT a property's transitive closure in SPARQL?

So I tried to make subquery like this: …

I'm not sure what subquery syntax you're trying to use, but it's not legal. Subqueries are just wrapped in braces. E.g.,

``select ?s where {  { select ?x where { ... } }}``

The query you've written isn't legal.

But I obtained an error.

If you don't show us the error, we certainly can't help in fixing it. I expect that you got a syntax error. You can validate your query using sparql.org's query validator. That probably won't help too much in this case, though, because you can only use select queries in subqueries. (It would be very helpful, though, if construct queries were supported for subqueries.)

#### Transitive closures for construct queries

In general, you can't do arbitrary recursion in SPARQL. However, in the specific case that you've got, you can use property paths in the pattern to construct the transitive closure of a pattern. E.g.,

``construct { ?a :partOf ?b }where { ?a :partOf+ ?b }``

This says that if there's a path of non-zero length from ?a to ?c using just :partOf links, then include a triple ?a :partOf ?b in the constructed output.

## hierarchical data in a database: recursive query vs. closure tables vs. graph database

The whole closure table is redundant if you can use recursive queries :)

I think it's much better to have a complicated recursive query that you have to figure out once than deal with the extra IO (and disk space) of a separate table and associated triggers.

I have done some simple tests with recursive queries in postgres. With a few million rows in the table queries were still < 10ms for returning all parents of a particular child. Returning all children was fast too, depending on the level of the parent. It seemed to depend more on disk IO fetching the rows rather than the query speed itself. This was done single user, so not sure how it would perform under load. I suspect it would be very fast still if you can also hold most of the table in memory (and setup postgres correctly). Clustering the table by parent id also seemed to help.

## detect cycles in a graph in SQL using recursive common table expression

``CREATE TABLE #myEdge (    ID INT IDENTITY(1,1) PRIMARY KEY,    NodeIDFrom INT,    NodeIDTo INT)INSERT INTO #myEdge(NodeIDFrom, NodeIDTo) VALUES (4, 5),(5, 6),(6, 4);DECLARE @rootNode AS integer = 4;--compute the transitive closure from this root. column Niveau holds the recursion nesting level.WITH cte_transitive_closure(rootNode, NodeFrom, NodeTo, Niveau)AS (    SELECT @rootNode, NULL, @rootNode, 0    UNION ALL    SELECT rootNode, e.NodeIDFrom, e.NodeIDTo, Niveau+1    FROM cte_transitive_closure AS cte    JOIN #myEdge AS e ON cte.NodeTo=e.NodeIdFrom    WHERE Niveau < 99)SELECT *INTO #transitive_closureFROM cte_transitive_closure;SELECT * FROM #transitive_closure;--starting from root as a reached target, move back in the trace until the root is hit again.WITH cte_cycle(NodeFrom, NodeTo, Cycle)AS (    SELECT @rootNode,NULL,0    UNION ALL    SELECT t.NodeFrom, t.NodeTo, 0    FROM cte_cycle AS cte    JOIN #transitive_closure AS t ON cte.NodeFrom = t.NodeTo    WHERE t.NodeFrom != @rootNode AND Cycle=0    UNION ALL    SELECT t.NodeFrom, t.NodeTo, 1    FROM cte_cycle AS cte    JOIN #transitive_closure AS t ON cte.NodeFrom = t.NodeTo    WHERE t.NodeFrom = @rootNode AND Cycle=0)SELECT DISTINCT * FROM cte_cycle;result set:NodeFrom    -> NodeTo   (Cycle)4   -> NULL (0)4   -> 5    (1)5   -> 6    (0)6   -> 4    (0)``