Connect by or Hierarchical Queries in Rdbms Other Than Oracle

CONNECT BY or hierarchical queries in RDBMS other than Oracle

There is an article on the developerworks site Port CONNECT BY to DB2 that does a nice conversion.
Also an interesting article on Explain Extended (Quassnoi's blog) that shows some difference between CONNECT BY and recursive CTE: Adjacency list vs. nested sets: Oracle, being row-based and set-based. He has also a nice article about "SQL Server: are the recursive CTE’s really set-based?". It seems that the "recursive CTE in Oracle is also not set based". I hope this helps with the conversion, recursion in JOOQ and understanding the difference of both implementations of recursion in SQL.

Regards,
JJ.

oracle ancestors in hierarchical queries

create table ancestor (name varchar2(200), parent varchar2(200));

insert into ancestor values ('john','peter');
insert into ancestor values ('peter','ronald');
insert into ancestor values ('ronald','george');
insert into ancestor values ('george',null);

 select  name,parent,relation_level from 
(

 SELECT CONNECT_BY_ROOT name name, parent,level relation_level
 FROM ancestor
 CONNECT BY name = PRIOR parent
  START WITH name IN (SELECT name FROM ancestor)
 )
where parent is not null;

SQLFIDDLE

Hierarchical query with reference to previous row, not parent row

You are probably looking for something like I show below. Note that there is no analytic (and aggregate) product function, I had to mock it up by taking logs, analytic sum, and exponential function.

I don't see why you must combine the productid and the level (I called it "step") in a single string; of course you can do that if you wish, but I show the output the way I believe it should be. Also, in the output it is not clear what weight you must show (or, rather, WHY) - the weight before the product arrives at the machine, or after it is processed by that machine? I show both (as well as the original weight before any processing began) on every row; decide what you actually need for your report.

I mocked up your inputs in the WITH clause, but of course you should use your actual tables (with their table and column names). I hope you have a table like the MACHINES view in my query. I used a left outer join just in case a machine is not actually shown in the MACHINES table, although that shouldn't be allowed. (Not sure how you can enforce that though, given your data model which is in direct violation of First Normal Form.) If a machine is not present in the MACHINES table, its yield factor is treated as 1.00 in the query. That happens with machine 'M9' in this example.

with
  sample_inputs (productid, currentweight, remainingroute) as (
    select '001', 50, 'M1-M7-M5' from dual union all
    select '002', 48, 'M3-M2-M9' from dual
  )
, machines (machine, yield_factor) as (
    select 'M1', 0.95 from dual union all
    select 'M7', 0.90 from dual union all
    select 'M3', 0.80 from dual union all
    select 'M4', 1.00 from dual union all
    select 'M6', 0.92 from dual union all
    select 'M2', 0.90 from dual union all
    select 'M5', 0.86 from dual
  )
, routes (productid, step, currentweight, machine) as (
    select productid, level, currentweight,
           regexp_substr(remainingroute, '[^-]+', 1, level)
    from   sample_inputs
    connect by  level <= regexp_count(remainingroute, '[^-]+')
            and prior productid = productid
            and prior sys_guid() is not null
  )
, weights (productid, step, original_weight, machine, weight_out) as (
    select r.productid, r.step, r.currentweight, r.machine,
           round(r.currentweight *
                   exp(sum(ln(m.yield_factor)) 
                     over (partition by r.productid order by r.step)), 2)
    from   routes r left outer join machines m on r.machine = m.machine
  )
select productid, step, original_weight, machine,
       lag(weight_out, 1, original_weight)
         over (partition by productid order by step) as weight_in, weight_out
from   weights
order  by productid, step;

Output:

PRODUCTID  STEP ORIGINAL_WEIGHT MACHINE       WEIGHT_IN      WEIGHT_OUT
---------- ---- --------------- ------- --------------- ---------------
001           1           50.00 M1                50.00           47.50
001           2           50.00 M7                47.50           42.75
001           3           50.00 M5                42.75           36.76
002           1           48.00 M3                48.00           38.40
002           2           48.00 M2                38.40           34.56
002           3           48.00 M9                34.56           34.56

hierarchical query without side rows

This is because of OR predicate in CONNECT BY: your query traverses in both directions, so on the second step of CONNECT BY you'll have all the childs of the parent and finally all the tree.

You need to split the query into union of two directions.

with
  temp_cte(id, parent_id) as
    (select 1, null from dual
     union all
     select 2, 1 from dual
     union all
     select 3, 2 from dual
     union all
     select 4, 1 from dual
     union all
     select 5, null from dual
)

select id, parent_id
from temp_cte t
where level > 1
connect by nocycle (prior id = parent_id)
start with t.id = 2

union all

select id, parent_id
from temp_cte t
connect by (prior parent_id = id)
start with t.id = 2
order by id


ID | PARENT_ID
-: | --------:
 1 |      null
 2 |         1
 3 |         2

db<>fiddle here

Is there a `connect by` alternative in MySQL?

As said in comments, there isn't a short way with mysql.

BUT!

If you have the possibility to change the database structure, you can deploy a better design to handle tree-like hierarchies.

If you follow THIS TUTORIAL from Bill Karwin (HERE is the original answer which references that slideshow tutorial), you can find 4 methods used to model an hierarchical structure:

Adiacency List
Path Enumeration
Nested sets
Closure table

Now, the best model possible is the 4th one (I leave descriptions of the other 3 models to the reader), which basically needs 2 tables: one for the elements and one for the paths. In the paths table (the closure table itself), you'll store every path from each node to every descendant (not just the direct childs!).

It's suggested to save also the path length for each row, because it makes easier to query for immediate childrens in the tree.

Even if this solution requires more space, it has the best overall performance and it's really easy to use: it doesn't rely on recursive queries at all AND it will grants referential integrity for the whole dataset!

For example, to get every child of the node #4:

select a.*
from nodes a
join paths b
on a.node_id = b.descendant
where b.ancestor = 4

Another example: get all the ancestors of the node #11

select a.*
from nodes a
join paths b
on a.node_id = b.ancestor
where b.descendant = 11

need to delete the subtree of the node #6

delete from paths where descendant in
(select descendant from paths where ancestor = 6)

How to get depth of recursive SQL using WITH rather than CONNECT BY in Oracle?

Here is how you can clone the LEVEL pseudo-column (to use for indentation) as well as the DEPTH FIRST ordering of hierarchical CONNECT BY queries when you use the recursive WITH clause. You may want to see what happens when you remove the SEARCH... clause after the recursive subquery and the ORDER BY clause at the end - see in what way the output is affected.

with r ( lvl, key, name, treehier ) as (
         select  1, key, name, treehier
           from  tree
           where treehier is null
         union all
         select  r.lvl + 1, t.key, t.name, r.key
         from    r join tree t on r.key = t.treehier
       )
       search depth first by name set ord
select   rpad(' ', 2 * (lvl - 1), ' ') || name as name, key, treehier
from     r
order by ord
;

NAME                                  KEY   TREEHIER
------------------------------ ---------- ----------
Software                               11           
  DB                                   55         11
    MS-SQL                             77         55
    Oracle                             66         55
  OS                                   22         11
    Linux                              33         22
    Windows                            44         22

Then you can add the keyword DESC (for descending sort, instead of the default ascending sort) - in the ORDER BY clause, after ord (almost certainly NOT what you want!) or, after deleting DESC from the ORDER BY clause, add it after name in the SEARCH clause. You will see what difference WHERE you add the DESC keyword makes in the output. These are all options that will be available to you in the future, if needed. You may also want to experiment by using KEY instead of NAME in the SEARCH clause - see what changes that will cause.

Oracle : Hierarchical Query Connect By

Find all the rows in the hierarchy and then order those rows by appliedOn and keep only the latest row:

Query:

WITH emp_manager_mapping ( empid, managerid, appliedon, status ) AS (
  select 'A', 'B', sysdate-100,'Success' from dual union all
  select 'C', 'D', sysdate-70, 'Success' from dual union all
  select 'B', 'C', sysdate-50, 'Success' from dual
)
SELECT managerid AS top_managerid,
       LEVEL AS depth,
       CONNECT_BY_ROOT ( empid ) AS empid,
       CONNECT_BY_ROOT ( managerid ) AS managerid
FROM   emp_manager_mapping e
WHERE  status = 'Success'
START WITH empid = 'A'
CONNECT BY NOCYCLE
       PRIOR managerid = empid
ORDER BY AppliedOn DESC
FETCH FIRST ROW ONLY;

Output:


TOP_MANAGERID | DEPTH | EMPID | MANAGERID
:------------ | ----: | :---- | :--------
C             |     2 | A     | B

db<>fiddle here

Connect by or Hierarchical Queries in Rdbms Other Than Oracle