Why does CONNECT BY LEVEL on a table return extra rows?
In the first query, you connect by just the level.
So if level <= 1, you get each of the records 1 time. If level <= 2, then you get each record 1 time (for level 1) + N times (for level 2, where N is the number of records in the table). It is like a cross join, because you're just picking all records from the table until the level is reached, without any other condition to limit the result. For level <= 3, this is done again for each of those results.
So for 3 records:
- Lvl 1: 3 records (all having level 1)
- Lvl 2: 3 records having level 1 + 3*3 records having level 2 = 12
- Lvl 3: 3 + 3*3 + 3*3*3 = 39 (indeed, 13 records each).
- Lvl 4: starting to see a pattern? :)
It's not really a cross join. A cross join would only return those records that have level 2 in this query result, while with this connect by you get the records having level 1 as well as the records having level 2, resulting in 3 + 3*3 records instead of just 3*3.
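The counts above follow a simple pattern: level k contributes N^k rows, and the query returns the sum over all levels. Here is a minimal Python sketch of that arithmetic (the function name is made up for illustration; it only models the row count, not Oracle itself):

```python
def connect_by_level_rows(n_rows: int, max_level: int) -> int:
    """Total rows returned by an unconstrained CONNECT BY LEVEL <= max_level."""
    # Level k contributes n_rows ** k rows, because every row produced
    # at level k-1 is joined to all n_rows rows of the table again.
    return sum(n_rows ** level for level in range(1, max_level + 1))


for max_level in range(1, 5):
    print(max_level, connect_by_level_rows(3, max_level))
```

For a 3-row table this prints 3, 12, 39 and 120 for levels 1 through 4.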
Oracle SQL: CONNECT BY LEVEL returning many rows
The connect by is only based on the time, so you're connecting every time for route A with route B and vice versa.
The simple fix seems to be to make it:
CONNECT BY (LEVEL - 1) <= (END_TIME - START_TIME) / TIME_PERIOD
AND ROUTE_NAME = PRIOR ROUTE_NAME
to restrict it to a single route at a time; but that then forms a loop, so you also need to add a non-deterministic function call to prevent that; for example:
CONNECT BY (LEVEL - 1) <= (END_TIME - START_TIME) / TIME_PERIOD
AND ROUTE_NAME = PRIOR ROUTE_NAME
AND PRIOR DBMS_RANDOM.VALUE() IS NOT NULL
which gets:
ROUTE_NAME OUTPUT_MOMENT
---------- -------------------
ROUTE A 2018-03-09 05:00:08
ROUTE A 2018-03-09 06:00:08
ROUTE A 2018-03-09 07:00:08
ROUTE A 2018-03-09 08:00:08
ROUTE A 2018-03-09 09:00:08
ROUTE A 2018-03-09 10:00:08
ROUTE A 2018-03-09 11:00:08
ROUTE A 2018-03-09 12:00:08
ROUTE A 2018-03-09 13:00:08
ROUTE A 2018-03-09 14:00:08
ROUTE A 2018-03-09 15:00:08
ROUTE B 2018-03-09 05:00:08
ROUTE B 2018-03-09 05:30:08
ROUTE B 2018-03-09 06:00:08
ROUTE B 2018-03-09 06:30:08
ROUTE B 2018-03-09 07:00:08
ROUTE B 2018-03-09 07:30:08
ROUTE B 2018-03-09 08:00:08
ROUTE B 2018-03-09 08:30:08
ROUTE B 2018-03-09 09:00:08
ROUTE B 2018-03-09 09:30:08
ROUTE B 2018-03-09 10:00:08
ROUTE B 2018-03-09 10:30:08
ROUTE B 2018-03-09 11:00:08
ROUTE B 2018-03-09 11:30:08
ROUTE B 2018-03-09 12:00:08
ROUTE B 2018-03-09 12:30:08
ROUTE B 2018-03-09 13:00:08
ROUTE B 2018-03-09 13:30:08
ROUTE B 2018-03-09 14:00:08
ROUTE B 2018-03-09 14:30:08
ROUTE B 2018-03-09 15:00:08
ROUTE B 2018-03-09 15:30:08
ROUTE B 2018-03-09 16:00:08
34 rows selected.
You could also do two connect by queries and union the results together, possibly pulling the time range into a CTE to avoid duplicating that:
WITH START_END AS (
SELECT SYSDATE - 8 / 24 AS START_TIME,
SYSDATE + 3 / 24 AS END_TIME
FROM DUAL
)
SELECT 'ROUTE A' ROUTE_NAME,
START_TIME + (LEVEL - 1) / 24 AS OUTPUT_MOMENT
FROM START_END
CONNECT BY (LEVEL - 1) <= (END_TIME - START_TIME) / (1 / 24)
UNION ALL
SELECT 'ROUTE B' ROUTE_NAME,
START_TIME + (LEVEL - 1) / 48 AS OUTPUT_MOMENT
FROM START_END
CONNECT BY (LEVEL - 1) <= (END_TIME - START_TIME) / (1 / 48)
Using / (1 / 24) looks odd when you could instead do * 24, but you actually get a slightly different result because of rounding errors; with the latter you get an extra row for route A. You could rearrange the logic further to avoid that confusion though.
ORACLE CONNECT BY LEVEL Producing Duplicate rows
Currently, your CONNECT BY only limits the hierarchical level, and doesn't provide any condition for matching child rows to parent rows. This means that in a table with multiple rows, every row is a child of every other row. This is going to produce a massive result set.
If I understand correctly, you are trying to use the hierarchical functionality to pull multiple values from each individual row. So you really want each row to be parent and child to itself. I suggest trying:
CONNECT BY id = PRIOR id
AND prior sys_guid() is not null
AND level <= regexp_count(VALUE,CHR(10)||CHR(13))
Thanks to @kfinity for pointing out the need for the sys_guid() to prevent a CONNECT BY LOOP.
Oracle Connect By seems to produce too many rows
With no condition other than "level <= 4", every row from the original table, view etc. (from the join, in this case) will produce two rows at level 2, then four more rows at level 3, and 8 more at level 4. "Connect by" is essentially a succession of joins, and you are doing cross joins if you have no condition with the PRIOR operator.
You probably want to add "and prior a.id = a.id". This will lead to Oracle complaining about cycles (because Oracle decides a cycle is reached when it sees the same values in the columns subject to PRIOR). That, in turn, is solved by adding a third condition, usually "and prior sys_guid() is not null".
(Edited; the original answer made reference to NOCYCLE, which is not needed when using the "prior sys_guid() is not null" approach.)
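To see the effect of the extra PRIOR condition on the row count, here is a rough Python sketch that treats CONNECT BY as a succession of joins, as described above. The helper `connect_by` and the two-row sample data are made up for illustration, and the sketch does not model Oracle's cycle detection (which is what the sys_guid() condition works around):

```python
def connect_by(rows, is_child, max_level):
    """Level-by-level expansion where every row is a root (no START WITH)."""
    current = [(1, row) for row in rows]
    result = list(current)
    for level in range(2, max_level + 1):
        # Join the previous level's rows back to the table.
        current = [
            (level, child)
            for _, parent in current
            for child in rows
            if is_child(parent, child)
        ]
        result.extend(current)
    return result


rows = [{"id": 1}, {"id": 2}]

# Only "level <= 4": every row is a child of every row.
unconstrained = connect_by(rows, lambda prior, cur: True, 4)

# With "prior a.id = a.id": each row is only a child of itself.
same_id = connect_by(rows, lambda prior, cur: prior["id"] == cur["id"], 4)

print(len(unconstrained))  # 2 + 4 + 8 + 16 = 30
print(len(same_id))        # 2 rows per level * 4 levels = 8
```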
This has been discussed recently on OTN: https://community.oracle.com/thread/3999985
Same question discussed here: https://community.oracle.com/thread/2526535
Duplicate rows using Connect by level
Oracle Setup:
CREATE TABLE My_SQL_table ( Site_NUM, start_week, end_week ) AS
SELECT 'France', 50, 52 FROM DUAL UNION ALL
SELECT 'Germany', 41, 43 FROM DUAL UNION ALL
SELECT 'USA', 12, 13 FROM DUAL;
Query: Using CONNECT BY
SELECT site_num,
COLUMN_VALUE wks_inbtwn
FROM My_SQL_table tbl1
CROSS JOIN
TABLE(
CAST(
MULTISET(
SELECT tbl1.START_WEEK + LEVEL
FROM DUAL
CONNECT BY tbl1.START_WEEK + LEVEL <= tbl1.END_WEEK
)
AS SYS.ODCINUMBERLIST
)
)
Output:
SITE_NUM | WKS_INBTWN
:------- | ---------:
France | 51
France | 52
Germany | 42
Germany | 43
USA | 13
Query 2: Using a recursive sub-query factoring clause
WITH rsqfc ( site_num, start_week, end_week ) AS (
SELECT site_num, start_week + 1, end_week
FROM my_sql_table
UNION ALL
SELECT site_num, start_week + 1, end_week
FROM rsqfc
WHERE start_week < end_week
)
SELECT site_num, start_week AS wks_inbtwn
FROM rsqfc
ORDER BY site_num, wks_inbtwn
Output:
SITE_NUM | WKS_INBTWN
:------- | ---------:
France | 51
France | 52
Germany | 42
Germany | 43
USA | 13
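If you want to try the recursive approach without an Oracle instance, here is a sketch that runs an equivalent query in SQLite through Python's built-in sqlite3 module. This is an assumption-laden port, not the Oracle query itself: the explicit column types and the RECURSIVE keyword are added for SQLite; the table contents and recursion logic are the same as above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_sql_table (site_num TEXT, start_week INTEGER, end_week INTEGER);
    INSERT INTO my_sql_table VALUES ('France', 50, 52);
    INSERT INTO my_sql_table VALUES ('Germany', 41, 43);
    INSERT INTO my_sql_table VALUES ('USA', 12, 13);
""")

query = """
    WITH RECURSIVE rsqfc (site_num, start_week, end_week) AS (
        -- Anchor: first week after start_week for every site
        SELECT site_num, start_week + 1, end_week
        FROM my_sql_table
        UNION ALL
        -- Recursive step: keep adding one week until end_week is reached
        SELECT site_num, start_week + 1, end_week
        FROM rsqfc
        WHERE start_week < end_week
    )
    SELECT site_num, start_week AS wks_inbtwn
    FROM rsqfc
    ORDER BY site_num, wks_inbtwn
"""

weeks = conn.execute(query).fetchall()
for site_num, wks_inbtwn in weeks:
    print(site_num, wks_inbtwn)
```

Each site only ever recurses on its own rows, which is why no duplicates appear.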
Why my hierarchy query is showing duplicate records?
You are not understanding how CONNECT BY works. Here is a walkthrough of how Oracle is evaluating your 2nd query.
Without a START WITH clause, every row in your table will be used as a starting point, or "root", in your hierarchy.
Since you have no CONNECT BY conditions (like "columnA = PRIOR columnB"), every row in your table will be considered a child of every other row. This would go on forever, except that it stops once your LEVEL <= 4 limit is reached.
So,
LEVEL 1
--------
SNO 1
SNO 2
Explanation: Each row in your table is a starting point of its own hierarchy (because you have no START WITH conditions).
LEVEL 2
--------
SNO 1 -> SNO 1
SNO 1 -> SNO 2
SNO 2 -> SNO 1
SNO 2 -> SNO 2
Explanation of those 4 rows -- both SNO 1 and SNO 2 are roots, and for each root, SNO 1 and SNO 2 are children. So, 2x2 rows = 4 rows.
LEVEL 3
-------
SNO 1 -> SNO 1 -> SNO 1
SNO 1 -> SNO 1 -> SNO 2
SNO 1 -> SNO 2 -> SNO 1
SNO 1 -> SNO 2 -> SNO 2
SNO 2 -> SNO 1 -> SNO 1
SNO 2 -> SNO 1 -> SNO 2
SNO 2 -> SNO 2 -> SNO 1
SNO 2 -> SNO 2 -> SNO 2
Explanation of those 8 rows. Starting with the 4 rows from level 2, both SNO 1 and SNO 2 are children of each, giving 4x2 = 8 rows at level 3.
Level 4, which I won't draw out, will similarly give 8x2 = 16 rows.
So, in total, you have 2 + 4 + 8 + 16 = 30 rows. (That's level 1 + level 2 + level 3 + level 4).
Then, after your CONNECT BY processing (shown above), the WHERE clause is applied, limiting your final results to rows where the value (at the lowest level of the hierarchy) is SNO = 1. That is exactly half of the 30 rows, or 15 rows, which is what you are getting.
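The same counts can be checked by enumerating the paths directly. This is just a Python sketch of the walkthrough (the variable names are made up); each path is one row of the hierarchy, and its last element is the SNO the WHERE clause sees:

```python
from itertools import product

snos = [1, 2]

# One path per result row: all sequences of length 1..4 over the two rows.
paths = [
    path
    for level in range(1, 5)
    for path in product(snos, repeat=level)
]
print(len(paths))  # 2 + 4 + 8 + 16 = 30

# The WHERE clause is applied afterwards, on the last row of each path.
filtered = [path for path in paths if path[-1] == 1]
print(len(filtered))  # 15
```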
Confusion with Oracle CONNECT BY
How a CONNECT BY query is executed and evaluated - step by step (by example).
Say we have the following table and a connect by query:
select * from mytable;
X
----------
1
2
3
4
SELECT level, m.*
FROM mytable m
START with x = 1
CONNECT BY PRIOR x +1 = x OR PRIOR x + 2 = x
ORDER BY level;
Step 1:
Select rows from table mytable that meet the START WITH condition, and assign LEVEL = 1 to the returned result set:
CREATE TABLE step1 AS
SELECT 1 "LEVEL", X from mytable
WHERE x = 1;
SELECT * FROM step1;
LEVEL X
---------- ----------
1 1
Step 2
Increase level by 1:
LEVEL = LEVEL + 1
Join the result set returned in the previous step with mytable, using the CONNECT BY conditions as the join conditions.
In this clause, PRIOR column-name refers to the result set returned by the previous step, and a plain column-name refers to the mytable table:
CREATE TABLE step2 AS
SELECT 2 "LEVEL", mytable.X from mytable
JOIN step1 "PRIOR"
ON "PRIOR".x +1 = mytable.x or "PRIOR".x + 2 = mytable.x;
select * from step2;
LEVEL X
---------- ----------
2 2
2 3
Step x+1
Repeat step 2 until the last operation returns an empty result set.
Step 3
CREATE TABLE step3 AS
SELECT 3 "LEVEL", mytable.X from mytable
JOIN step2 "PRIOR"
ON "PRIOR".x +1 = mytable.x or "PRIOR".x + 2 = mytable.x;
select * from step3;
LEVEL X
---------- ----------
3 3
3 4
3 4
Step 4
CREATE TABLE step4 AS
SELECT 4 "LEVEL", mytable.X from mytable
JOIN step3 "PRIOR"
ON "PRIOR".x +1 = mytable.x or "PRIOR".x + 2 = mytable.x;
select * from step4;
LEVEL X
---------- ----------
4 4
Step 5
CREATE TABLE step5 AS
SELECT 5 "LEVEL", mytable.X from mytable
JOIN step4 "PRIOR"
ON "PRIOR".x +1 = mytable.x or "PRIOR".x + 2 = mytable.x;
select * from step5;
no rows selected
Step 5 returned no rows, so now we finalize the query
Last step
UNION ALL the results of all steps and return that as the final result:
SELECT * FROM step1
UNION ALL
SELECT * FROM step2
UNION ALL
SELECT * FROM step3
UNION ALL
SELECT * FROM step4
UNION ALL
SELECT * FROM step5;
LEVEL X
---------- ----------
1 1
2 2
2 3
3 3
3 4
3 4
4 4
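The whole procedure can also be sketched as a short loop. This is a Python model of the steps above, not how Oracle actually implements it; the list `mytable` and the function `is_child` stand in for the table and the CONNECT BY condition:

```python
mytable = [1, 2, 3, 4]


def is_child(prior_x, x):
    # CONNECT BY PRIOR x + 1 = x OR PRIOR x + 2 = x
    return prior_x + 1 == x or prior_x + 2 == x


level = 1
step = [x for x in mytable if x == 1]  # START WITH x = 1
result = []

# Keep joining the previous step to mytable until a step is empty.
while step:
    result.extend((level, x) for x in step)
    level += 1
    step = [x for x in mytable for prior_x in step if is_child(prior_x, x)]

for row in result:
    print(row)
```

The printed (level, x) pairs are the same seven rows as in the final result above, including the duplicate (3, 4), which is reached once via 2 + 2 and once via 3 + 1.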
Now let's apply the above procedure to your query:
SELECT * FROM dual;
DUMMY
-----
X
SELECT LEVEL FROM DUAL CONNECT BY rownum>5;
Step 1
Since the query does not contain a START WITH clause, Oracle selects all records from the source table:
CREATE TABLE step1 AS
SELECT 1 "LEVEL" FROM dual;
select * from step1;
LEVEL
----------
1
Step 2
CREATE TABLE step2 AS
SELECT 2 "LEVEL" from dual
JOIN step1 "PRIOR"
ON rownum > 5;
select * from step2;
no rows selected
Since the last step returned no rows, we are going to finalize our query.
Last step
SELECT * FROM step1
UNION ALL
SELECT * FROM step2;
LEVEL
----------
1
The analysis of the last query:
select level from dual connect by rownum<10;
I leave to you as a homework assignment.
Inner join returning more rows than regular select
You have duplicate rows in feed_id_types. Run this to find which IDs are duplicated:
select
types.feed_type_id
from feed_id_types types
group by types.feed_type_id
having count(*) > 1
The IN() clause ignores the duplicates, matching on the first one it finds. The inner join matches each row from daily_run to every matching row in feed_id_types, creating extra results.
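Here is a small demonstration of the difference, run in SQLite through Python's sqlite3 module rather than in your database. The table names come from the question, but the columns of daily_run and the sample data are assumptions made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily_run (run_id INTEGER, feed_type_id INTEGER);
    CREATE TABLE feed_id_types (feed_type_id INTEGER);
    INSERT INTO daily_run VALUES (1, 10);
    INSERT INTO daily_run VALUES (2, 20);
    -- feed_type_id 10 is duplicated on purpose
    INSERT INTO feed_id_types VALUES (10);
    INSERT INTO feed_id_types VALUES (10);
    INSERT INTO feed_id_types VALUES (20);
""")

# IN() only asks whether a match exists, so each daily_run row appears once.
in_count = conn.execute("""
    SELECT COUNT(*) FROM daily_run
    WHERE feed_type_id IN (SELECT feed_type_id FROM feed_id_types)
""").fetchone()[0]

# The join returns one row per matching pair, so run 1 appears twice.
join_count = conn.execute("""
    SELECT COUNT(*) FROM daily_run d
    INNER JOIN feed_id_types t ON t.feed_type_id = d.feed_type_id
""").fetchone()[0]

print(in_count, join_count)
```

With this data the IN() query counts 2 rows and the inner join counts 3.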