Building a Table Dependency Graph with a Recursive Query

Building a Table Dependency Graph With A Recursive Query

    select parent, child, level from (
select parent_table.table_name parent, child_table.table_name child
 from user_tables      parent_table,
      user_constraints parent_constraint,
      user_constraints child_constraint,
      user_tables      child_table
where parent_table.table_name = parent_constraint.table_name
  and parent_constraint.constraint_type IN( 'P', 'U' )
  and child_constraint.r_constraint_name = parent_constraint.constraint_name
  and child_constraint.constraint_type   = 'R'
  and child_table.table_name = child_constraint.table_name
  and child_table.table_name != parent_table.table_name
)
start with parent = 'DEPT'
connect by prior child = parent

should work (replace the table name, of course) assuming that everything is in the same schema. Use the DBA_ versions of the data dictionary tables and conditions for the OWNER and R_OWNER columns if you need to handle cross-schema dependencies. On further reflection, this does not account for self-referential constraints (i.e. a constraint on the EMP table that the MGR column references the EMPNO column) either, so you'd have to modify the code to handle that case if you need to deal with self-referential constraints.

For testing purposes, I added a few new tables to the SCOTT schema that also reference the DEPT table (including a grandchild dependency)

SQL> create table dept_child2 (
  2  deptno number references dept( deptno )
  3  );

Table created.

SQL> create table dept_child3 (
  2    dept_child3_no number primary key,
  3    deptno number references dept( deptno )
  4  );

Table created.

SQL> create table dept_grandchild (
  2    dept_child3_no number references dept_child3( dept_child3_no )
  3  );

Table created.

and verified that the query returned the expected output

SQL> ed
Wrote file afiedt.buf

  1  select parent, child, level from (
  2  select parent_table.table_name parent, child_table.table_name child
  3   from user_tables      parent_table,
  4        user_constraints parent_constraint,
  5        user_constraints child_constraint,
  6        user_tables      child_table
  7  where parent_table.table_name = parent_constraint.table_name
  8    and parent_constraint.constraint_type IN( 'P', 'U' )
  9    and child_constraint.r_constraint_name = parent_constraint.constraint_name
 10    and child_constraint.constraint_type   = 'R'
 11    and child_table.table_name = child_constraint.table_name
 12    and child_table.table_name != parent_table.table_name
 13  )
 14  start with parent = 'DEPT'
 15* connect by prior child = parent
SQL> /

PARENT                         CHILD                               LEVEL
------------------------------ ------------------------------ ----------
DEPT                           DEPT_CHILD3                             1
DEPT_CHILD3                    DEPT_GRANDCHILD                         2
DEPT                           DEPT_CHILD2                             1
DEPT                           EMP                                     1

Recursive query for table dependencies is not recursing not as much as I'd like

You want something like this:

select t.table_name, level,lpad(' ', 2 * (level - 1))||t.table_name 
from user_tables t
join user_constraints c1 
    on (t.table_name = c1.table_name 
    and c1.constraint_type in ('U', 'P'))
left join user_constraints c2 
    on (t.table_name = c2.table_name 
    and c2.constraint_type='R')
start with t.table_name = 'ROOT_TAB'
connect by prior c1.constraint_name = c2.r_constraint_name

The problem with the original query is that uc.constraint_name for the child table is the name of the foreign key. That is fine for connecting the first child to the root table, but it is not what you need to connect the children on the second level to the first. That is why you need to join against the constraints twice -- once to get the table's primary key, once to get the foreign keys.

As an aside, if you are going to be querying the all_* views rather than the user_* views, you generally want to join them on table_name AND owner, not just table_name. If multiple schemas have tables with the same name, joining on just table_name will give incorrect results.

Directed graph in Oracle SQL using recursive query visiting each node only once

For keeping the traversing algorithm out of returning to already visited edges, one can indeed keep the visited edges somewhere. As you already found out, you won't get much of a success with a string concatenation. However, there are other usable "value concatenation" techniques available...

You have to have one handy schema-level collection of scalars at your disposal:

create or replace type arr_strings is table of varchar2(64);

And then you can collect the visited edges to that collection in each iteration:

with nondirected$ as (
    select from_id, to_id, from_id||'-'||to_id as edge_desc
    from edge
    where from_id != to_id
    union all
    select to_id, from_id, from_id||'-'||to_id as edge_desc
    from edge
    where (to_id, from_id) not in (
            select from_id, to_id
            from edge
        )
),
graph$(lvl, from_id, to_id, edge_desc, visited_edges) as (
    select 1, from_id, to_id, edge_desc,
        arr_strings(edge_desc)
    from nondirected$ R
    where from_id in (&nodes)
    --
    union all
    --
    select
        lvl+1,
        Y.from_id, Y.to_id, Y.edge_desc,
        X.visited_edges multiset union arr_strings(Y.edge_desc)
    from graph$ X
        join nondirected$ Y
            on Y.from_id = X.to_id
    where not exists (
            select 1
            from table(X.visited_edges) Z
            where Y.edge_desc = Z.column_value
        )
)
search breadth first by edge_desc set order_id
    cycle edge_desc set is_cycle to 1 default 0,
ranked_graph$ as (
    select C.*,
        row_number() over (partition by edge_desc order by lvl, order_id) as rank$
    from graph$ C
--    where is_cycle = 0
)
select *
from ranked_graph$
--where rank$ <= 1
order by lvl, order_id
;

Notes

I pre-processed the directed graph to a nondirected one by union-ing a set of reverse edges to the input. That should make the recursive traversal predicates easier to read. Solely for my purposes of easier reading+writing of the SQL. You don't have to do that, of course.
I remember trying something like this a few years ago on Oracle 11.2. And I remember that it was failing, though I don't remember why. On 12.2, it ran OK. Just give it a try on 11g, too; I don't have one available.
Since each iteration does, in addition to the traversal inner join, also an anti-join, I sincerely doubt that this is going to be more performant. However, it for sure solves the problem of lowering the number of recursive nestings.
You'll have to resolve the desired ordering on your own, as you probably understood from my comments. :-)

Limiting the revisited edges to zero

In SQL, you can't. The PostgreSQL solution you mentioned does do it. In Oracle, however, you can't. You would have to, for each traversal join, test rows for all other traversal joins. And that would mean some kind of aggregation or analytics... which Oracle forbids and throws out an ORA exception.

PLSQL to the rescue?

You can do it in PL/SQL, though. How much performant it is supposed to be, depends on how much memory you want to spend on prefetching edges from DB vs. how many SQL roundtrips you are willing to take to traverse the graph from "current" nodes or if you are willing to use even more memory to keep the visited nodes in a fancy indexed-by-edge collection vs. if you rather anti-join against a regular arr_output collection l_visited_nodes. You have multiple choices, choose wisely.

Anyway, for the simplest scenario with heavier use of SQL engine, this might be the code you're looking for...

create or replace
package pkg_so_recursive_traversal
is

type rec_output                     is record (
    from_id                             edge.from_id%type,
    to_id                               edge.to_id%type,
    lvl                                 integer
);
type arr_output                     is table of rec_output;

function traverse_a_graph
    ( i_from                        in arr_strings
    , i_is_directed                 in varchar2 default 'NO' )
    return arr_output
    pipelined;

end pkg_so_recursive_traversal;
/
create or replace
package body pkg_so_recursive_traversal
is

function traverse_a_graph
    ( i_from                        in arr_strings
    , i_is_directed                 in varchar2 )
    return arr_output
    pipelined
is
    l_next_edges                    arr_output;
    l_current_edges                 arr_output;
    l_visited_edges                 arr_output := arr_output();
    l_out                           rec_output;
    i                               pls_integer;
    l_is_directed                   varchar2(32) := case when i_is_directed = 'YES' then 'YES' else 'NO' end;
begin
    select E.from_id, E.to_id, 0
    bulk collect into l_next_edges
    from table(i_from) F
        join edge E
            on F.column_value in (E.from_id, case when l_is_directed = 'YES' then null else E.to_id end)
    where E.from_id != E.to_id;

    l_out.lvl := 0;

    loop
        dbms_output.put_line(l_next_edges.count());
        exit when l_next_edges.count() <= 0;
        l_out.lvl := l_out.lvl + 1;

        -- spool the edges to output
        i := l_next_edges.first();
        while i is not null loop
            l_out.from_id := l_next_edges(i).from_id;
            l_out.to_id := l_next_edges(i).to_id;
            pipe row(l_out);
            i := l_next_edges.next(i);
        end loop;

        l_current_edges := l_next_edges;
        l_visited_edges := l_visited_edges multiset union l_current_edges;

        -- find next edges
        select unique E.from_id, E.to_id, 0
        bulk collect into l_next_edges
        from table(l_current_edges) CE
            join edge E
                on CE.to_id in (E.from_id, case when l_is_directed = 'YES' then null else E.to_id end)
                or l_is_directed = 'NO' and CE.from_id in (E.from_id, E.to_id)
        where E.from_id != E.to_id
            and not exists (
                select 1
                from table(l_visited_edges) VE
                where VE.from_id = E.from_id
                    and VE.to_id = E.to_id
            );
    end loop;

    return;
end;

end pkg_so_recursive_traversal;

/

When called for the starting node of A and considering the graph to be undirected...

select *
from table(pkg_so_recursive_traversal.traverse_a_graph(
        i_from => arr_strings('A'),
        i_is_directed => 'NO'
    ));

... it yields...

FROM_ID    TO_ID             LVL
---------- ---------- ----------
A          B                   1
C          A                   1
C          E                   2
B          D                   2
C          F                   2
E          B                   2
G          D                   3
H          F                   3
G          I                   4

Notes

Again, I did not put any effort to keep the ordering you requested, as you said it's not that important.
This is doing multiple (exactly 5 for your example inputs) SQL roundtrips to the edge table. That may or may not be a bigger performance hit when compared to the pure-SQL solution with redundant edge visiting. Test more solutions properly, see which one works for you the best.
This particular piece of code will work on 12c and higher. For 11g and lower you'll have to declare the rec_output and arr_output types on the schema level.

Query to Recursively Identify Object Dependencies

I saw this post to identify all the objects that reference a particular synonym and used the base logic in the answer in a recursive CTE to identify all the objects related to a comma-delimited list of the objects within the top level query being executed.

Declare @baseObjects Nvarchar(1000) = '[Schema].[Table],[Schema].[View],[Schema].[Function],[Schema].[StoredProc]',
        @SQL Nvarchar(Max);

Declare @objects Table (SchemaName Varchar(512), TableName Varchar(512), ID Int, xtype Varchar(10));

Set     @SQL = 'Select  ss.name As SchemaName,
                        so.name As TableName,
                        so.id,
                        so.xtype
                From    sysobjects so
                Join    sys.schemas ss
                        On  so.uid = ss.schema_id
                Where   so.id In (Object_ID(''' + Replace(@baseObjects,',','''),Object_ID(''') + '''))';

Insert  @objects
Exec    sp_executeSQL @SQL;

With    test As
(
        Select  ss.name As SchemaName,
                so.name As TableName,
                so.id,
                so.xtype
        From    sys.sql_expression_dependencies sed
        Join    @objects vo
                On  sed.referencing_id = vo.ID
        Join    sysobjects so
                On  sed.referenced_id = so.id
        Join    sys.schemas ss
                On  so.uid = ss.schema_id
        Union   All
        Select  ss.name As SchemaName,
                so.name As TableName,
                so.id,
                so.xtype
        From    test
        Join    sys.sql_expression_dependencies sed
                On  sed.referencing_id = test.id
                And sed.referencing_id <> sed.referenced_id
        Join    sysobjects so
                On  sed. referenced_id = so.id
        Join    sys.schemas ss
                On  so.uid = ss.schema_id
)
Select  Distinct *
From    test
Union
Select  *
From    @objects;

Recursive relationship on a many to many table

I'm not familiar with Firebird, but this works in SQL Server, so is hopefully similar / enough to get you on track:

WITH TEST (IDRoot, IDPARENT, IDCHILD) AS
(

  SELECT P0.IDPROD, C0.IDParent, C0.IDCHILD
  FROM PROD AS P0
  left outer join COMP C0 on C0.IDParent = P0.IDPROD
  WHERE P0.IDProd not in (select IDChild from COMP)

  UNION ALL

  SELECT T.IDRoot, C1.IDPARENT, C1.IDCHILD
  FROM COMP AS C1
  inner join TEST AS T on T.IDCHILD = C1.IDPARENT

)
SELECT * FROM TEST

Hope that helps.

SQL Fiddle Version: http://sqlfiddle.com/#!6/22f84/7

Notes

Includ a column to denote the root of the tree as well as parent/child - since there may be multiple trees if we're not specifying a particular root:

WITH TEST (IDRoot, IDPARENT, IDCHILD) AS

Treat any product which is not a child as a ROOT (i.e. first item in a tree).

WHERE P0.IDProd not in (select IDChild from COMP)

EDIT: Answer to comments

Query on any node to see all of its relatives:

The simple way to filter on any node would be to amend the above statement's WHERE P0.IDProd not in (select IDChild from COMP) with WHERE P0.IDProd = IdImInterestedIn. However if you want to use the CTE for a view, then run queries over this static query you could use the code below - you can then filter on IDProd (select * from test where IDProd = IdImInterestedIn) to see that item's ancestors and descendants.

WITH TEST (IDProd, IDRelation, Generation) AS
(

  SELECT IDPROD
  , IDPROD
  , 0
  FROM PROD 

  UNION ALL

  SELECT T.IDPROD
  , C.IdParent 
  , T.Generation - 1
  FROM TEST AS T
  inner join Comp as C 
  on C.IdChild = T.IDRelation
  where t.Generation <= 0

  UNION ALL

  SELECT T.IDPROD
  , C.IdChild 
  , T.Generation + 1
  FROM TEST AS T
  inner join Comp as C 
  on C.IdParent = T.IDRelation
  where t.Generation >= 0

)
SELECT * 
FROM TEST
order by IDProd, Generation

SQL Fiddle: http://sqlfiddle.com/#!6/22f84/15

See a root node's full tree in a single column

WITH TEST (IDRoot, IDPARENT, IDCHILD, TREE) AS
(

  SELECT P0.IDPROD, C0.IDParent, C0.IDCHILD, cast(P0.IDPROD as nvarchar(max)) + coalesce(', ' + cast(C0.IDCHILD as nvarchar(max)),'')
  FROM PROD AS P0
  left outer join COMP C0 on C0.IDParent = P0.IDPROD
  WHERE P0.IDProd not in (select IDChild from COMP)

  UNION ALL

  SELECT T.IDRoot, C1.IDPARENT, C1.IDCHILD, TREE + coalesce(', ' + cast(C1.IDCHILD as nvarchar(max)),'')
  FROM COMP AS C1
  inner join TEST AS T on T.IDCHILD = C1.IDPARENT

)
SELECT * 
FROM TEST 
order by IDRoot

SQL Fiddle: http://sqlfiddle.com/#!6/22f84/19

EDIT: Answer to Additional Comments

with cte (tree_root_no, tree_row_no, relation_sort, relation_chart, Name, id, avoid_circular_ref) as
(
       select row_number() over (order by p.idprod) 
       , 1 
       , cast(row_number() over (order by p.idprod) as nvarchar(max)) 
       , cast('-' as nvarchar(max)) 
       , p.NAME 
       , p.IDPROD 
       , ',' + cast(p.IDPROD as nvarchar(max)) + ',' 
       from PROD p
       where p.IDPROD not in (select IDCHILD from COMP) --if it's nothing's child, it's a tree's root

       union all

       select cte.tree_root_no
       , cte.tree_row_no + 1
       , cte.relation_sort + cast(row_number() over (order by p.idprod) as nvarchar(max)) 
       , replace(relation_chart,'-','|') + ' -' 
       , p.NAME
       , p.IDPROD
       , cte.avoid_circular_ref + cast(p.IDPROD as nvarchar(max)) + ','
       from cte
       inner join COMP c on c.IDPARENT = cte.id
       inner join PROD p on p.IDPROD = c.IDCHILD
       where charindex(',' + cast(p.IDPROD as nvarchar(max)) + ',', cte.avoid_circular_ref) = 0
)
select tree_root_no, tree_row_no, relation_sort, relation_chart, id, name 
from cte 
order by tree_root_no, relation_sort

SQL Fiddle: http://sqlfiddle.com/#!6/4397f/9

Update to show each path

This one's a nasty hack, but the only way I could think of to solve your puzzle; this gives each path through the trees its own number:

;with inner_cte (parent, child, sequence, treePath) as (

    select null
    , p.IDPROD
    , 1
    , ',' + CAST(p.idprod as nvarchar(max)) + ','
    from @prod p
    where IDPROD not in 
    (
        select IDCHILD from @comp
    )

    union all

    select cte.child
    , c.IDCHILD
    , cte.sequence + 1
    , cte.treePath + CAST(c.IDCHILD as nvarchar(max)) + ','
    from inner_cte cte
    inner join @comp c on c.IDPARENT = cte.child

)
, outer_cte (id, value, pathNo, sequence, parent, treePath) as
(
    select icte.child, p.NAME, ROW_NUMBER() over (order by icte.child), icte.sequence, icte.parent, icte.treePath
    from inner_cte icte
    inner join @prod p on p.IDPROD = icte.child
    where icte.child not in (select coalesce(icte2.parent,-1) from inner_cte icte2) 

    union all

    select icte.child, p.NAME, octe.pathNo,icte.sequence, icte.parent, icte.treePath
    from outer_cte octe
    inner join inner_cte icte on icte.child = octe.parent and CHARINDEX(icte.treePath, octe.treePath) > 0
    inner join @prod p on p.IDPROD = icte.child

)
select id, value, pathNo
from outer_cte
order by pathNo, sequence

SQL Fiddle here: http://sqlfiddle.com/#!6/5a16e/1

Recursively list concents of Oracle's DBA_DEPENDENCIES view

You want to specify the NOCYCLE keyword after your CONNECT BY:

i.e.

SELECT NAME, 
       TYPE, 
       REFERENCED_NAME, 
       REFERENCED_TYPE 
  FROM DBA_DEPENDENCIES 
 WHERE OWNER='FOO' 
   AND NAME='VIEW_01' 
CONNECT BY NOCYCLE
  PRIOR REFERENCED_NAME = NAME;

There is more info on NOCYCLE and the "CONNECT_BY_ISCYCLE" keywords here:
http://www.dba-oracle.com/t_advanced_sql_connect_by_loop.htm

and here:
http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/pseudocolumns001.htm

Hope it helps...

EDIT: After comments, you have missed the START WITH clause.

SELECT NAME, 
       TYPE, 
       REFERENCED_NAME, 
       REFERENCED_TYPE 
  FROM DBA_DEPENDENCIES 
 WHERE OWNER='FOO' 
 START WITH NAME='VIEW_01' 
CONNECT BY NOCYCLE
  PRIOR REFERENCED_NAME = NAME;

BTW, keeping the OWNER='FOO' where clause limits any dependencies returned to just FOO's object so you may possibly miss dependencies from other schemas.

Edit 2:
The primary key of a table of view is owner, name thus the select should start with both and connect by both. You can use where to filter out desired results.

SELECT OWNER, NAME, TYPE,  
   REFERENCED_OWNER,
   REFERENCED_NAME, 
   REFERENCED_TYPE 
FROM DBA_DEPENDENCIES 
-- where referenced_type='TABLE'
START WITH owner = 'FOO' AND NAME='VIEW_01' 
CONNECT BY NOCYCLE
   PRIOR REFERENCED_NAME = NAME
   AND PRIOR REFERENCED_OWNER = OWNER;

Build foreign key graph query dynamically from given table

As suggested by "Nathan Skerl" I used this link http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=97454 and it did the job. The only thing is that I modified fnCascadingDelete function to include RecordID parameter so you can pass id of the record you want to delete then I searched for %1 and replaced it with @RecordID. It works perfectly, a bit slow for very deep nesting but I guess it is expected.

Building a Table Dependency Graph with a Recursive Query