Repeating rows based on column value in each row
Assuming no row needs to be repeated more than 1000 times:
with num as (select level as rnk from dual connect by level<=1000)
select Job, Quantity, Status, Repeat, rnk
from t join num on ( num.rnk <= repeat )
order by job, rnk;
Here is a test:
http://sqlfiddle.com/#!4/4519f/12
UPDATE: As Jeffrey Kemp said, you can "detect" the maximum with a subquery:
with num as (select level as rnk
from dual
connect by level<=(select max(repeat) from t)
)
select job, quantity, status, repeat, rnk
from t join num on ( num.rnk <= repeat )
order by job, rnk;
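If you want to try the number-table join without an Oracle instance, here is a rough sketch using Python's built-in sqlite3 module. It is a stand-in, not the Oracle query: SQLite has no CONNECT BY, so a recursive CTE generates 1..max(repeat) instead, and the sample rows are made up for illustration.

```python
import sqlite3

def repeat_rows(rows):
    """Repeat each (job, quantity, status, repeat) row `repeat` times.

    SQLite stand-in for the Oracle query above: a recursive CTE builds
    the number table because SQLite has no CONNECT BY.
    """
    con = sqlite3.connect(":memory:")
    con.execute(
        'CREATE TABLE t (job TEXT, quantity INTEGER, status TEXT, "repeat" INTEGER)'
    )
    con.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    result = con.execute(
        '''
        WITH RECURSIVE num(rnk) AS (
            SELECT 1
            UNION ALL
            SELECT rnk + 1 FROM num
            WHERE rnk < (SELECT max("repeat") FROM t)
        )
        SELECT job, quantity, status, "repeat", rnk
        FROM t JOIN num ON num.rnk <= t."repeat"
        ORDER BY job, rnk
        '''
    ).fetchall()
    con.close()
    return result

# Hypothetical sample data; the original table contents are not shown.
sample_rows = [("A", 10, "OPEN", 2), ("B", 5, "DONE", 3)]
expanded = repeat_rows(sample_rows)
for row in expanded:
    print(row)
```

The join condition is the same as in the Oracle version: a row with repeat = 3 matches rnk 1, 2 and 3, so it comes out three times.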
Repeat rows in a Polars DataFrame based on column value
You were close. What you were looking for was the repeat_by expression.
First, some data. I'm going to add an ID column, just to show how to apply the repeat_by expression to multiple columns (but exclude Quantity).
import polars as pl
df = (
pl.DataFrame({
'ID' : [100, 200],
'Fruit': ["Apple", "Banana"],
'Quantity': [2, 3],
})
)
df
shape: (2, 3)
┌─────┬────────┬──────────┐
│ ID ┆ Fruit ┆ Quantity │
│ --- ┆ --- ┆ --- │
│ i64 ┆ str ┆ i64 │
╞═════╪════════╪══════════╡
│ 100 ┆ Apple ┆ 2 │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 200 ┆ Banana ┆ 3 │
└─────┴────────┴──────────┘
The Algorithm
(
df
.select(
pl.exclude('Quantity').repeat_by('Quantity').explode()
)
.with_columns(
pl.lit(1).alias('Quantity')
)
)
shape: (5, 3)
┌─────┬────────┬──────────┐
│ ID ┆ Fruit ┆ Quantity │
│ --- ┆ --- ┆ --- │
│ i64 ┆ str ┆ i32 │
╞═════╪════════╪══════════╡
│ 100 ┆ Apple ┆ 1 │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 100 ┆ Apple ┆ 1 │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 200 ┆ Banana ┆ 1 │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 200 ┆ Banana ┆ 1 │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 200 ┆ Banana ┆ 1 │
└─────┴────────┴──────────┘
How it works
The repeat_by expression will repeat a value in a Series by the value in another column/expression. In this case, we want to repeat by the value in Quantity.
We'll also use the exclude expression to apply repeat_by to all columns except Quantity (which we'll replace later).
Note that the result of repeat_by is a list.
(
df
.select(
pl.exclude('Quantity').repeat_by('Quantity')
)
)
shape: (2, 2)
┌─────────────────┬────────────────────────────────┐
│ ID ┆ Fruit │
│ --- ┆ --- │
│ list[i64] ┆ list[str] │
╞═════════════════╪════════════════════════════════╡
│ [100, 100] ┆ ["Apple", "Apple"] │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [200, 200, 200] ┆ ["Banana", "Banana", "Banana"] │
└─────────────────┴────────────────────────────────┘
Next, we use explode, which will take each element of each list and place it on its own row.
(
df
.select(
pl.exclude('Quantity').repeat_by('Quantity').explode()
)
)
shape: (5, 2)
┌─────┬────────┐
│ ID ┆ Fruit │
│ --- ┆ --- │
│ i64 ┆ str │
╞═════╪════════╡
│ 100 ┆ Apple │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 100 ┆ Apple │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 200 ┆ Banana │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 200 ┆ Banana │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 200 ┆ Banana │
└─────┴────────┘
From there, we use the lit expression to add Quantity back to the DataFrame.
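To make the three steps concrete without Polars, here is a minimal plain-Python sketch. The helper names repeat_by and explode are my own and only mimic what the Polars expressions do to each column; this is not how Polars implements them.

```python
def repeat_by(values, counts):
    """Turn each value into a list holding `count` copies of it."""
    return [[value] * count for value, count in zip(values, counts)]

def explode(lists):
    """Flatten a column of lists so each element gets its own row."""
    return [item for sub in lists for item in sub]

columns = {"ID": [100, 200], "Fruit": ["Apple", "Banana"], "Quantity": [2, 3]}
counts = columns["Quantity"]

# Step 1: repeat every column except Quantity (mirrors pl.exclude('Quantity')).
repeated = {
    name: repeat_by(values, counts)
    for name, values in columns.items()
    if name != "Quantity"
}

# Step 2: explode each list column into individual rows.
exploded = {name: explode(lists) for name, lists in repeated.items()}

# Step 3: add Quantity back as a constant 1 (mirrors pl.lit(1)).
exploded["Quantity"] = [1] * len(exploded["ID"])

print(repeated)
print(exploded)
```

Every column is repeated by the same counts, so the exploded columns stay aligned row by row.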
Repeat rows in a pandas DataFrame based on column value
reindex + repeat
df.reindex(df.index.repeat(df.persons))
Out[951]:
code . role ..1 persons
0 123 . Janitor . 3
0 123 . Janitor . 3
0 123 . Janitor . 3
1 123 . Analyst . 2
1 123 . Analyst . 2
2 321 . Vallet . 2
2 321 . Vallet . 2
3 321 . Auditor . 5
3 321 . Auditor . 5
3 321 . Auditor . 5
3 321 . Auditor . 5
3 321 . Auditor . 5
PS: you can add .reset_index(drop=True) to get a fresh 0..n-1 index.
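If it helps to see what the one-liner is doing, here is a plain-Python model of it. It is only a sketch: the dict below is a simplified copy of the frame (the "." columns are left out), and the comments map each step to the pandas call it imitates.

```python
from itertools import chain, repeat

# Simplified copy of the frame above: index label -> (code, role, persons).
frame = {
    0: (123, "Janitor", 3),
    1: (123, "Analyst", 2),
    2: (321, "Vallet", 2),
    3: (321, "Auditor", 5),
}

# df.index.repeat(df.persons): each label repeated `persons` times.
repeated_index = list(
    chain.from_iterable(repeat(label, row[2]) for label, row in frame.items())
)

# df.reindex(...): look each repeated label up again, duplicates included.
reindexed = [(label, *frame[label]) for label in repeated_index]

# .reset_index(drop=True): discard the old labels and number rows 0..n-1.
renumbered = [(new_label, *row[1:]) for new_label, row in enumerate(reindexed)]

print(repeated_index)
print(renumbered[:4])
```

This is why the output above shows duplicated index labels (0, 0, 0, 1, 1, ...) until the index is reset.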
Repeat Rows N Times According to Column Value
You could do that with a recursive CTE using UNION ALL:
;WITH cte AS
(
SELECT * FROM Table1
UNION ALL
SELECT cte.[ID], cte.ProductFK, (cte.[Order] - 1) [Order], cte.Price
FROM cte INNER JOIN Table1 t
ON cte.[ID] = t.[ID]
WHERE cte.[Order] > 1
)
SELECT [ID], ProductFK, 1 [Order], Price
FROM cte
ORDER BY 1
Here's a working SQLFiddle.
Here's a longer explanation of this technique.
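For a quick local check of the countdown idea, here is a sketch that runs a similar recursive CTE through Python's sqlite3 module. Assumptions: SQLite syntax instead of T-SQL (WITH RECURSIVE, double-quoted "Order"), the self-join back to Table1 is dropped because the CTE already carries every column, and the two sample rows are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    'CREATE TABLE Table1 (ID INTEGER, ProductFK INTEGER, "Order" INTEGER, Price REAL)'
)
# Hypothetical rows: ID 1 should come out three times, ID 2 once.
con.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?, ?)",
    [(1, 10, 3, 9.5), (2, 20, 1, 4.0)],
)

rows = con.execute(
    '''
    WITH RECURSIVE cte(ID, ProductFK, "Order", Price) AS (
        -- Seed: every input row with its original Order count.
        SELECT ID, ProductFK, "Order", Price FROM Table1
        UNION ALL
        -- Recursive member: count down until Order reaches 1.
        SELECT ID, ProductFK, "Order" - 1, Price FROM cte WHERE "Order" > 1
    )
    SELECT ID, ProductFK, 1 AS "Order", Price FROM cte ORDER BY ID
    '''
).fetchall()

for row in rows:
    print(row)
```

Each input row costs one recursion level per copy. SQL Server stops a recursive CTE after 100 levels by default unless OPTION (MAXRECURSION n) is set, which is presumably the limit that large Order values run into.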
Since your input is too large for this recursion, you could use an auxiliary table to supply "many" dummy rows and then use SELECT TOP([Order]) for each input row (CROSS APPLY):
;WITH E00(N) AS (SELECT 1 UNION ALL SELECT 1),
E02(N) AS (SELECT 1 FROM E00 a, E00 b),
E04(N) AS (SELECT 1 FROM E02 a, E02 b),
E08(N) AS (SELECT 1 FROM E04 a, E04 b),
E16(N) AS (SELECT 1 FROM E08 a, E08 b)
SELECT t.[ID], t.ProductFK, 1 [Order], t.Price
FROM Table1 t CROSS APPLY (
SELECT TOP(t.[Order]) N
FROM E16) ca
ORDER BY 1
(The auxiliary table is borrowed from here; it allows up to 65536 rows per input row and can be extended if required.)
Here's a working SQLFiddle.
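The 65536 figure comes from each CTE cross-joining the previous one with itself, which squares the row count. Here is a small sketch that checks this by running the same CTE chain in SQLite via Python's sqlite3 module. SQLite has no TOP or CROSS APPLY, so LIMIT stands in for TOP here and the per-row apply is not reproduced.

```python
import sqlite3

# Same CTE chain as above; the comma join is a cross join in SQLite too.
TALLY_CTE = """
WITH E00(N) AS (SELECT 1 UNION ALL SELECT 1),
     E02(N) AS (SELECT 1 FROM E00 a, E00 b),
     E04(N) AS (SELECT 1 FROM E02 a, E02 b),
     E08(N) AS (SELECT 1 FROM E04 a, E04 b),
     E16(N) AS (SELECT 1 FROM E08 a, E08 b)
"""

con = sqlite3.connect(":memory:")

# Count the rows each level produces.
sizes = {
    name: con.execute(f"{TALLY_CTE} SELECT count(*) FROM {name}").fetchone()[0]
    for name in ("E00", "E02", "E04", "E08", "E16")
}
print(sizes)

# LIMIT plays the role of TOP(n): take only as many dummy rows as needed.
taken = len(con.execute(f"{TALLY_CTE} SELECT N FROM E16 LIMIT ?", (5,)).fetchall())
print(taken)
```

Adding one more level in the same pattern would square the count again, which is how the table "can be extended if required".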
Repeating rows but changing column value each time
I think your best bet here is some recursive CTE:
WITH RECURSIVE quantitySpreader AS
(
/*Recursive Seed (starting point)*/
SELECT
ID,
CASE WHEN Quantity >= 200 then 200 ELSE Quantity END as Quantity,
Status,
1 as Depth,
CASE WHEN test.Quantity >= 200 THEN test.Quantity - 200 ELSE 0 END as remainder
FROM test
UNION ALL
/*Recursive member (sql that iterates until join fails)*/
SELECT
quantitySpreader.ID,
CASE WHEN remainder >= 200 THEN 200 ELSE remainder END,
quantitySpreader.Status,
depth + 1,
Case when remainder >= 200 THEN remainder - 200 else 0 END
FROM
quantitySpreader
INNER JOIN test
ON quantitySpreader.ID = test.ID
AND quantitySpreader.Quantity >= 200
WHERE depth <= 10
)
SELECT id, quantity, status
FROM quantitySpreader
ORDER BY id, quantity DESC;
This can get a little heady, but recursive SQL like this is split into two chunks inside the CTE.
- The recursive starting point/seed. This defines the starting point for iterating. Here we want every record (so no WHERE clause is present) and we establish the first iteration: "200" unless the quantity is less than 200, in which case just the quantity. We also track the recursion depth (to keep us from cycling endlessly) as well as the remainder after we subtract that 200.
- After the UNION ALL is the recursive member. This SELECT statement repeats over and over, referring to its own result set (quantitySpreader), until the JOIN fails and returns nothing. Each iteration applies the same logic as above: check whether the remainder is at least 200 and, if so, set the output to 200 and recalculate the remainder for the next iteration; otherwise output the remainder itself.
SQLFiddle of this in action. It's running on Postgres; for SQL Server, drop the RECURSIVE keyword (a plain WITH is enough there) and the rest should be close to a copy/paste job.
Input:
CREATE TABLE test (id int, Quantity int, Status varchar(10));
INSERT INTO test VALUES (1, 250, 'OK');
INSERT INTO test VALUES (2, 440, 'HOLD');
Output:
id | quantity | status |
---|---|---|
1 | 200 | OK |
1 | 50 | OK |
2 | 200 | HOLD |
2 | 200 | HOLD |
2 | 40 | HOLD |
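If the recursion is hard to follow, the same splitting logic can be written as a plain loop. This is a minimal Python sketch of the idea, not a translation of the CTE: the function name split_quantity and the chunk parameter are made up for illustration.

```python
def split_quantity(quantity, chunk=200):
    """Yield pieces of at most `chunk` that add up to `quantity`."""
    remainder = quantity
    while remainder > 0:
        piece = min(chunk, remainder)  # 200, or whatever is left
        yield piece
        remainder -= piece

# Same rows as the INSERT statements above.
test_rows = [(1, 250, "OK"), (2, 440, "HOLD")]

output = [
    (row_id, piece, status)
    for row_id, quantity, status in test_rows
    for piece in split_quantity(quantity)
]

for row in output:
    print(row)
```

One difference worth knowing: this loop stops when the remainder reaches zero, whereas the CTE keeps going while the previous piece was 200. As far as I can tell, that means the CTE as written emits a trailing zero-quantity row when Quantity is an exact multiple of 200, and the loop does not. The loop also has no depth cap.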
Duplicate rows based on other columns containing values, then return row with split column value
Try this:
df.assign(Group=df['Group'].str.split('-')).explode('Group')
Output:
Date End Time Group Assignment
0 2/2/2021 1130 A quiz
0 2/2/2021 1130 B quiz
0 2/2/2021 1130 C quiz
1 2/2/2021 1230 XYZ test
2 1/22/2021 1330 B paper
2 1/22/2021 1330 D paper
3 1/22/2021 1130 A homework
3 1/22/2021 1130 E homework
3 1/22/2021 1130 C homework
Using assign, we can reassign Group as a list of strings, split on '-' with the str accessor and split. Then, using pd.DataFrame.explode, we can explode that list to create a row in the dataframe for each element in the list.
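Here is the same split-then-explode idea in plain Python, in case the two steps are easier to see without pandas. The input rows are reconstructed from the first two groups of the output above (so treat them as an assumption), and explode_on is a made-up helper name.

```python
def explode_on(rows, column, sep="-"):
    """Split row[column] on `sep` and emit one copy of the row per piece.

    Returns (original_index, row) pairs, so the repeated index labels seen
    in the pandas output are visible here too.
    """
    out = []
    for index, row in enumerate(rows):
        for piece in row[column].split(sep):
            out.append((index, {**row, column: piece}))
    return out

# Reconstructed input: the Group values before splitting.
rows = [
    {"Date": "2/2/2021", "End Time": 1130, "Group": "A-B-C", "Assignment": "quiz"},
    {"Date": "2/2/2021", "End Time": 1230, "Group": "XYZ", "Assignment": "test"},
]

exploded = explode_on(rows, "Group")
for index, row in exploded:
    print(index, row)
```

A Group value without a '-' splits into a single piece, so that row passes through unchanged, as the XYZ row does in the output above.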
How to create duplicate rows based on a column values
You can use a hierarchical query:
select T1.ID, T1.TEXT
from TestTable1 t1
join TestTable2 t2
on T1.ID = T2.ID
connect by level <= T2.repeat
and prior T1.ID = T1.ID
and prior sys_guid() is not null;
Demo