Pivoting Variable Number of Rows to Columns

If you will not know the values ahead of time, you will need to use dynamic SQL. This builds a SQL string that is then executed; it is required because the list of columns must be known when the query is run.

The code will be similar to:

DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX),
@groupid as int

set @groupid = 3

select @cols = STUFF((SELECT distinct ',' + QUOTENAME(GroupName)
from Columns_Table
where groupid = @groupid
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')

set @query = 'SELECT ' + @cols + '
from
(
SELECT B.GroupName, A.Value
, row_number() over(partition by a.ColumnsTableID
order by a.Value) seq
FROM Values_Table AS A
INNER JOIN Columns_Table AS B
ON A.ColumnsTableID = B.ID
where b.groupid = '+cast(@groupid as varchar(10))+'
) p
pivot
(
min(P.Value)
for P.GroupName in (' + @cols + ')
) p '

execute sp_executesql @query;

See SQL Fiddle with Demo. For a groupid of 3, the result will be:

|   KENTROSAURUS |     RAPTOR |     TREX |    TRICERATOPS |
-----------------------------------------------------------
| whatisthiseven | Itsaraptor | Jurassic | landbeforetime |
|         (null) |       zomg |     Park |         (null) |
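
The QUOTENAME call matters because the column names come from data rather than from the query text. As a rough Python analogue (an illustrative sketch, not SQL Server's implementation), bracket-quoting wraps the name in [ ] and doubles any ] inside it. The helper names `quote_name` and `build_column_list` are hypothetical; the sample names are the column headers from the result above.

```python
def quote_name(name):
    """Bracket-quote an identifier, doubling any closing bracket inside it."""
    return "[" + name.replace("]", "]]") + "]"

def build_column_list(names):
    """Deduplicate, sort, and join names as a comma-separated column list."""
    return ",".join(quote_name(n) for n in sorted(set(names)))

# Duplicates collapse, mirroring the SELECT DISTINCT in the T-SQL above.
print(build_column_list(["TREX", "RAPTOR", "TREX", "KENTROSAURUS", "TRICERATOPS"]))
# prints [KENTROSAURUS],[RAPTOR],[TREX],[TRICERATOPS]
```

The resulting string plays the role of @cols: it is spliced into the statement text twice, once in the select-list and once in the IN (...) clause.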

MySQL Pivoting Rows to Dynamic Columns

The easiest solution is not to do this in SQL, but to just fetch all the data with a simple query:

SELECT person, day, time FROM WeHaveATable ORDER BY day, person, time;

Then write application code to present it in a grid however you want.
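
As a rough illustration of the application-side approach, here is a minimal Python sketch. It assumes the rows arrive as (person, day, time) tuples in the order the query above returns them; the function name `rows_to_grid` and the sample values are hypothetical.

```python
from collections import defaultdict

def rows_to_grid(rows):
    """Group (person, day, time) rows into one list of times per (day, person)."""
    grid = defaultdict(list)
    for person, day, time in rows:
        grid[(day, person)].append(time)
    # Pad shorter rows with None so every row has the same number of columns.
    width = max((len(times) for times in grid.values()), default=0)
    return {key: times + [None] * (width - len(times)) for key, times in grid.items()}

# Hypothetical sample data, in the order the query would return it.
rows = [
    ("alice", "2024-01-01", "09:00"),
    ("alice", "2024-01-01", "13:00"),
    ("bob", "2024-01-01", "10:30"),
]
for (day, person), times in rows_to_grid(rows).items():
    print(day, person, *times)
```

The number of columns is discovered while reading the data, which is exactly the step a single SQL statement cannot do.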

The reason that this is tricky in SQL is that SQL requires you spell out all the columns in the select-list before you prepare the query. That's before it gets a chance to read the data to know how many columns there would be for the person with the greatest number of times.

There is no way in SQL to generate "dynamic columns" by reading the data and appending more columns to the select-list based on what it discovers while reading data.

So to do a pivot in SQL, you must first know how many columns there will be.

SELECT MAX(c) FROM (SELECT COUNT(*) AS c FROM WeHaveATable GROUP BY person) AS t;

Then form a query that numbers the rows per person/day using a window function, and use that in a pivot-table query, with one column for each time, up to the max number of times you got in the previous query.

WITH cte AS (
SELECT person, day, time, ROW_NUMBER() OVER (PARTITION BY day, person ORDER BY time) AS colno
FROM WeHaveATable
)
SELECT day, person,
MAX(CASE colno WHEN 1 THEN time END) AS Time1,
MAX(CASE colno WHEN 2 THEN time END) AS Time2,
MAX(CASE colno WHEN 3 THEN time END) AS Time3,
MAX(CASE colno WHEN 4 THEN time END) AS Time4,
MAX(CASE colno WHEN 5 THEN time END) AS Time5,
MAX(CASE colno WHEN 6 THEN time END) AS Time6
FROM cte
GROUP BY day, person;
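
Because the number of TimeN columns depends on the result of the first query, the second query usually has to be generated by application code. A minimal Python sketch, assuming the table and column names used above; `build_pivot_query` is a hypothetical helper, and `max_times` stands for the integer returned by the MAX query.

```python
def build_pivot_query(max_times):
    """Return the pivot query text with one TimeN column per position."""
    if max_times < 1:
        raise ValueError("max_times must be at least 1")
    # max_times is an integer computed by the previous query, not user text,
    # so interpolating it into the statement is safe here.
    columns = ",\n".join(
        f"  MAX(CASE colno WHEN {n} THEN time END) AS Time{n}"
        for n in range(1, max_times + 1)
    )
    return (
        "WITH cte AS (\n"
        "  SELECT person, day, time,\n"
        "         ROW_NUMBER() OVER (PARTITION BY day, person ORDER BY time) AS colno\n"
        "  FROM WeHaveATable\n"
        ")\n"
        "SELECT day, person,\n"
        f"{columns}\n"
        "FROM cte\n"
        "GROUP BY day, person;"
    )

print(build_pivot_query(3))
```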

If this seems like a lot of confusing, meticulous work, you're right. That's why it's recommended to skip solving this in SQL. Do the simple query I showed at the top, then write application code to process the results into the grid like you want.

Convert Rows to columns using 'Pivot' in SQL Server

If you are using SQL Server 2005+, then you can use the PIVOT operator to transform the data from rows into columns.

It sounds like you will need to use dynamic SQL if the weeks are unknown, but it is easier to see the correct code using a hard-coded version first.

First up, here are some quick table definitions and data for use:

CREATE TABLE yt 
(
[Store] int,
[Week] int,
[xCount] int
);

INSERT INTO yt
(
[Store],
[Week], [xCount]
)
VALUES
(102, 1, 96),
(101, 1, 138),
(105, 1, 37),
(109, 1, 59),
(101, 2, 282),
(102, 2, 212),
(105, 2, 78),
(109, 2, 97),
(105, 3, 60),
(102, 3, 123),
(101, 3, 220),
(109, 3, 87);

If your values are known, then you will hard-code the query:

select *
from
(
select store, week, xCount
from yt
) src
pivot
(
sum(xcount)
for week in ([1], [2], [3])
) piv;

See SQL Demo

Then if you need to generate the week number dynamically, your code will be:

DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)

select @cols = STUFF((SELECT ',' + QUOTENAME(Week)
from yt
group by Week
order by Week
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')

set @query = 'SELECT store,' + @cols + ' from
(
select store, week, xCount
from yt
) x
pivot
(
sum(xCount)
for week in (' + @cols + ')
) p '

execute(@query);

See SQL Demo.

The dynamic version generates the list of week numbers that should be converted to columns. Both give the same result:

| STORE |   1 |   2 |   3 |
---------------------------
|   101 | 138 | 282 | 220 |
|   102 |  96 | 212 | 123 |
|   105 |  37 |  78 |  60 |
|   109 |  59 |  97 |  87 |
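
The same two-step idea (read the distinct values, then build and run the statement) can be tried outside SQL Server. The sketch below uses Python's built-in sqlite3 module with the yt data from above. SQLite has no PIVOT operator, so this swaps in conditional aggregation (SUM(CASE ...)) as a stand-in; it is not the T-SQL shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yt (Store INTEGER, Week INTEGER, xCount INTEGER)")
conn.executemany(
    "INSERT INTO yt VALUES (?, ?, ?)",
    [(102, 1, 96), (101, 1, 138), (105, 1, 37), (109, 1, 59),
     (101, 2, 282), (102, 2, 212), (105, 2, 78), (109, 2, 97),
     (105, 3, 60), (102, 3, 123), (101, 3, 220), (109, 3, 87)],
)

# Step 1: read the distinct weeks, the counterpart of building @cols.
weeks = [row[0] for row in conn.execute("SELECT DISTINCT Week FROM yt ORDER BY Week")]

# Step 2: build one aggregate column per week; int() guards the interpolation.
select_list = ", ".join(
    f'SUM(CASE WHEN Week = {int(w)} THEN xCount END) AS "{int(w)}"' for w in weeks
)
query = f"SELECT Store, {select_list} FROM yt GROUP BY Store ORDER BY Store"

# Step 3: run the generated statement.
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Adding rows for a fourth week to yt would produce a fourth column without any change to the code, which is the point of generating the statement at run time.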

SQL Pivot with varying number of rows and compound key to form columns

You were very close to the final answer. You can use a PIVOT similar to the following (See SQL Fiddle with Demo):

DECLARE @Columns AS NVARCHAR(MAX)
DECLARE @StrSQL AS NVARCHAR(MAX)

SET @Columns = STUFF((SELECT DISTINCT
',' + QUOTENAME(CONVERT(VARCHAR(4), c.YEAR) + RIGHT('00' + CONVERT(VARCHAR(2), c.MONTH), 2))
FROM tbl_BranchTargets c
FOR XML PATH('') ,
TYPE
).value('.', 'NVARCHAR(MAX)'), 1, 1, '')

set @StrSQL = 'SELECT branchid, ' + @Columns + ' from
(
select branchid
, target
, CONVERT(VARCHAR(4), [YEAR]) + RIGHT(''00'' + CONVERT(VARCHAR(2), [MONTH]), 2) dt
from tbl_BranchTargets
) x
pivot
(
sum(target)
for dt in (' + @Columns + ')
) p '

execute(@StrSQL)

This will create the list of columns that you want at execution time.
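
The CONVERT/RIGHT expression builds a fixed-width YYYYMM label from the year and month columns. A minimal Python sketch of that labelling step, assuming four-digit years; `period_label` and `column_list` are hypothetical names and the sample periods are made up for illustration.

```python
def period_label(year, month):
    """Build the YYYYMM label used as the pivot column name."""
    return f"{year:04d}{month:02d}"

def column_list(periods):
    """Return the distinct labels as a bracketed, comma-separated list."""
    labels = sorted({period_label(y, m) for y, m in periods})
    return ",".join(f"[{label}]" for label in labels)

# Hypothetical (year, month) pairs; the duplicate collapses into one column.
print(column_list([(2013, 7), (2013, 12), (2013, 7), (2014, 1)]))
# prints [201307],[201312],[201401]
```

Zero-padding the month keeps the labels the same width, so sorting them as text also sorts them chronologically.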

Pivot longer: Multiple rows to columns in R

This is not a case where pivot_longer is suitable, because you have variables mapped to both rows and columns, and they are not the names of the columns/rows. Instead, you have to extract these attributes from the variables and then build the data.frame "manually". Here's an example; I suggest checking the variable values at each step to better understand the process:

library(dplyr)

df <- DI_SMALL %>%
mutate_all(as.character)

row_attr <- paste0(df$V1, "/", df$V2)
row_attr <- row_attr[row_attr!= "NA/NA"]

col_attr <- df[1:4, -(1:2)] %>%
apply(MARGIN = 2, function(x) paste0(x, collapse = "/"))

values <- df[-(1:4), -(1:2)] %>%
mutate_all(as.numeric) %>%
as.matrix() %>%
c()

out <- expand.grid(row_attr, col_attr)
out <- cbind(out, values)

out <- out %>%
tidyr::separate(col = "Var1", into = c("NA.", "NA..1"), sep = "/") %>%
tidyr::separate(col = "Var2",
into = c("Country", "ISO", "Industry", "Sector"),
sep = "/")

out[1:4, ]

I think the results in Output and the values in DI_SMALL are on different scales, but other than that, this seems like the desired output.

                NA.               NA..1     Country ISO   Industry      Sector       values
1 Energy Usage (TJ)         Natural Gas Afghanistan AFG Industries Agriculture 1.595045e-05
2 Energy Usage (TJ)                Coal Afghanistan AFG Industries Agriculture 1.293271e-05
3 Energy Usage (TJ)           Petroleum Afghanistan AFG Industries Agriculture 0.000000e+00
4 Energy Usage (TJ) Nuclear Electricity Afghanistan AFG Industries Agriculture 0.000000e+00

Pivoting wider one column to multiple columns

We may need to separate the column 'variable' before pivoting to wide format:

library(dplyr)
library(tidyr)
library(stringr)
data %>%
mutate(variable = str_replace(variable, "^(\\w+)_(\\w+)_(\\w+)",
"\\1_\\3,\\2")) %>%
separate(variable, into = c("variable", "newcol"), sep = ",") %>%
pivot_wider(names_from = newcol, values_from = c(`Mean (SD)`,
`Median (IQR)`), names_glue = "{newcol}({.value})")%>%
rename_with(~ str_remove(str_replace(.x, "\\s+\\(", "/"), "\\)"), -variable)

-output

# A tibble: 4 × 11
variable `VVV(Mean/SD)` `WWW(Mean/SD)` `XXX(Mean/SD)` `YYY(Mean/SD)` `ZZZ(Mean/SD)` `VVV(Median/IQR)` `WWW(Median/IQR)` `XXX(Median/IQR)` `YYY(Median/IQR)`
<chr> <glue> <glue> <glue> <glue> <glue> <glue> <glue> <glue> <glue>
1 VarA_Cond1 268.59 (80.6) 149.07 (39.79) 147.71 (39.65) 18.85 (10.76) 20.98 (11.34) 276 (86) 155 (40.5) 155 (41) 18 (15.5)
2 VarA_Cond2 228.49 (83.77) 113.66 (35.91) 112.64 (35.75) 24.07 (15.79) 26.36 (16.51) 241 (116) 116 (51) 116 (48) 23 (21.5)
3 VarB_Cond1 250.72 (61.53) 140.71 (30.52) 138.93 (30.37) 21.02 (10.46) 22.72 (11.05) 259 (60) 142 (36) 142 (34) 21 (15)
4 VarB_Cond2 225.98 (81.32) 112.43 (36.09) 111.1 (36.41) 24.71 (16.77) 26.59 (17.49) 244.5 (93.5) 107.5 (51.5) 107 (50.75) 24 (20.75)
# … with 1 more variable: `ZZZ(Median/IQR)` <glue>
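
To see what the str_replace/separate step does to a single value, here is a small Python sketch using the same regular expression. The input "VarA_XXX_Cond1" is an assumed example of the original 'variable' format, inferred from the output above; `split_variable` is a hypothetical helper name.

```python
import re

def split_variable(variable):
    """Move the middle token to the end, then split it off as the new column key."""
    reordered = re.sub(r"^(\w+)_(\w+)_(\w+)", r"\1_\3,\2", variable)
    name, newcol = reordered.split(",")
    return name, newcol

print(split_variable("VarA_XXX_Cond1"))  # ('VarA_Cond1', 'XXX')
```

The first element stays in 'variable' and the second becomes 'newcol', which pivot_wider then spreads into columns.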

