Split Multiple Columns into Multiple Rows

Split Multiple Columns into Multiple Rows

SELECT
Y.UserID,
Y.UserName,
QuestionName = 'AnswerToQuestion' + X.Which,
Response =
CASE X.Which
WHEN '1' THEN AnswerToQuestion1
WHEN '2' THEN AnswerToQuestion2
WHEN '3' THEN AnswerToQuestion3
END
FROM
YourTable Y
CROSS JOIN (SELECT '1' UNION ALL SELECT '2' UNION ALL SELECT '3') X (Which)

This performs equally well to UNPIVOT (sometimes better) and works in SQL 2000 as well.

I took advantage of the questions' similarity to create the QuestionName column, but of course this will work with varying question names.

Note that if your list of questions is long or the question names are long, you might experiment with 2 columns in the X table, one for the question number and one for the question name. Or if you already have a table with the list of questions, then CROSS JOIN to that. If some questions are NULL then easiest is to put the above query in a CTE or derived table and then add WHERE Response IS NOT NULL.

Split data grouped within cells from multiple columns into rows using Power Query Editor

I would suggest first combining the columns; then doing the split.

But when you combine the columns, you need to do this on a row-by-row basis to keep things together on the same line.

  • A list of each cell contents can be created with the Text.Split function.
  • Then the two lists can be combined using the List.Zip function.
  • Finally, we just split them up.

I use a Custom Column to create the joined lists. You can see the formula by clicking on the Added Custom applied step.

M Code

let
Source = Excel.CurrentWorkbook(){[Name="Table6"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Material", type text}, {"Sub", type text}, {"CAS", type text}}),

//combine the two columns
#"Added Custom" = Table.AddColumn(#"Changed Type", "list", each List.Zip({
Text.Split([Sub],"#(lf)"),
Text.Split([CAS],"#(lf)")
})),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Sub", "CAS"}),

//Expand the list and split into rows
#"Expanded list" = Table.ExpandListColumn(#"Removed Columns", "list"),
#"Extracted Values" = Table.TransformColumns(#"Expanded list", {"list", each Text.Combine(List.Transform(_, Text.From), ";"), type text}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Extracted Values", "list", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), {"list.1", "list.2"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"list.1", type text}, {"list.2", type text}}),

//Rename the splitted columns
renamed = Table.RenameColumns(#"Changed Type1",List.Zip({Table.ColumnNames(#"Changed Type1"),Table.ColumnNames(Source)}))

in
renamed

Sample Image

Split delimited strings in multiple columns and separate them into rows

We may do this in an easier way if we make the delimiter same

library(dplyr)
library(tidyr)
library(stringr)
to_expand %>%
mutate(first = str_replace(first, "~", "|")) %>%
separate_rows(first, second, sep = "\\|")
# A tibble: 2 x 2
first second
<chr> <chr>
1 a 1~2~3
2 b 4~5~6

Splitting a column into multiple rows

You can first split Code column on comma , then explode it to get the desired output.

df['Code']=df['Code'].str.split(',')
df=df.explode('Code')

OUTPUT:

  ID  A  B  C  D Code
0 1 a z s m AB
0 1 a z s m BC
0 1 a z s m A
1 2 b x d j AD
1 2 b x d j KL
2 3 c y w j AD
2 3 c y w j KL
3 4 a x h AB
3 4 a x h BC
4 5 b y s m A
5 6 b z s h A
6 7 c x s h B

If needed, you can replace empty string by NaN

Excel - How to split semi-colon separated values in multiple columns into rows

In powerquery

let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes (Source,List.Transform(Table.ColumnNames(Source), each {_, type text})),
TableTransform = Table.Combine(List.Transform(List.Transform(Table.ToRecords(#"Changed Type"), (x) => List.Transform(Record.ToList(x), each Text.Split(_,";"))), each Table.FromColumns(_, Table.ColumnNames(#"Changed Type"))))
in TableTransform

Sample Image

Update for changed requirements: right click and fill down a static column

Sample Image

Split multiple columns into rows by delimiter comma

One option is to retaining the offset of each element in each list, so we can use that information to join the relevant rows once they are unnested:

select t.id, elt_id, elt_name
from mytable t
cross join unnest( split(t.tag_id, ', ') ) elt_id with offset as rn_id
cross join unnest( split(t.tag_name, ', ') ) elt_name with offset as rn_name
where rn_id = rn_name

Excel: Split rows of table in multiple rows based on different cell values

There are multiple ways of doing this: Excel or Power Query.

Excel

This is one way of doing it under Excel. You can try the following in cell E2 (Formula 1):

=LET(input, A2:C3, fAndF, SUBSTITUTE(INDEX(input,,2) & "," & INDEX(input,,3), 
", ",","), names, INDEX(input,,1),
GETCOL, LAMBDA(x, TEXTSPLIT(TEXTJOIN(",",, x),,",",TRUE)),
mapresult, MAP(names, fAndF, LAMBDA(name,f,
LET(num, LEN(f) - LEN(SUBSTITUTE(f, ",","")) + 1, REPT(name&",",num)))),
HSTACK(GETCOL(mapresult), GETCOL(fAndF))
)

and here is the output:

sample excel file

Note: For large dataset the previous solution has a limitation, due to TEXTJOIN function. The maximum number of text you can concatenate is 253. In order to circumvent that you can use the following approach based on DROP/REDUCE/VSTACK functions to incrementally add new elements as we iterate (Formula 2):

=LET(input, A2:C3, fAndF, SUBSTITUTE(INDEX(input,,2) & "," & 
INDEX(input,,3), ", ",","),names, INDEX(input,,1),
DROP(REDUCE("", names, LAMBDA(ac, name, VSTACK(ac, LET(
ff, TEXTSPLIT(XLOOKUP(name, names, fAndF), ","),
DROP(REDUCE("", ff, LAMBDA(acc, f, VSTACK(acc, HSTACK(name,f)))),1)
)))),1)
)

Power Query

With Power Query, there is no need to use M-code all the functionalities required are provided by the UI. Here is the corresponding M-code (Advanced Editor):

let
Source = Excel.CurrentWorkbook(){[Name="TB_INPUT"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Name", type text}, {"Friends", type text},
{"Family", type text}}),
RemoveExtraSpaceAfterComma = Table.ReplaceValue(#"Changed Type",", ",",",Replacer.ReplaceText,
{"Friends", "Family"}),
#"Merge Friend and Family" = Table.CombineColumns(RemoveExtraSpaceAfterComma,{"Friends", "Family"},
Combiner.CombineTextByDelimiter(",", QuoteStyle.None),"F&F"),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Merge Friend and Family",
{{"F&F", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta
[Serialized.Text = true] in type {itemType}}}), "F&F")
in
#"Split Column by Delimiter"

it will produce the following output:
sample output in Power Query

Explanation

Excel

For Formula 1, we use use LET for easy reading and composition. The name fAndF represents the concatenation by a row of Friend and Family columns. We remove also the space of the comma (,) delimiter.

GETCOL is LAMBDA user-defined function. To convert an array of comma-delimited values on each row into a single column.

We use a MAP function to determine how many times we need to repeat the Name column value. The num:

LEN(f) - LEN(SUBSTITUTE(f, ",","")) + 1

is a trick to determine how many rows we need by counting the total number of commas (,).

The name mapresult generates the following output:

John,John,John,John,John,
Jane,Jane,Jane,Jane,

Now, all we need to do is to join both rows via TEXTJOIN, then split again by comma via TEXTSPLIT and this is what the user function GETCOL does.

Note: The fourth input argument in TEXTSPLIT (in GETCOL), is set to TRUE to avoid generating an empty row with the last delimiter at the end.

Finally, we combine the result via HSTACK.

For Formula 2, we take a different approach. We use a pattern for avoiding nested array error that TEXTSPLIT produces and other functions. Check the answer of this question: How to split texts from dynamic range? provided by: @JvdV. The main idea consists of combining DROP, REDUCE and VSTACK functions to produce a recursion to add rows on each new iteration. We use this idea twice. One for each names and the other one for fAndF items for a given name on first REDUCE call.

For each name on first REDUCE call we find via XLOOKUP the corresponding fAndF values, then we invoke TEXTSPLIT to get the corresponding array (ff) and for each element of ff (f) we invoke the second REDUCE call adding a 1x2 row with the name and the corresponding f value via HSTACK.

Under this pattern, we need to remove the first row, that contains the initialization value of the accumulator (ac, acc). The pattern is always the same:

DROP(REDUCE("", array, LAMBDA(acc, arr, VSTACK(acc, func))),1)

and func, is where we do the calculation to build the content of the row we want to add. Usually we need to create additional variables and we encapsulate the calculation inside a LET function call.

Power Query

Once we have the data defined in an Excel Table, then in Power Query view.

Remove extra space in the comma delimiter. We select both columns Friend and Family columns and then: Home->Replace Values searching for , and replacing it with ,.

Select again Friend and Family columns then Transform->Merge Column. We name the merged column: F&F and we indicate the Separator. Here is the output:

PQ: merge two columns

Now we need to split the column: Home -> Split Column -> By Delimiter. In the Advanced options select: Rows, indicating we want to do the spit by rows. Here are the configuration options:

PQ: Split columns by delimiter

As @Ron Rosenfeld pointed out in the comments section. Another way of doing it is as follows:

After removing extra space after the comma (#RemovedExtraSpaceAfterComma) Select Family and Friend columns, then right-click and select: Unpivot Columns. Select the resulting Value column and Home->Split Column->By Delimiter and in Advanced options select Rows. Finally, remove the Attribute column.

Here is the M-code:

let
Source = Excel.CurrentWorkbook(){[Name="TB_INPUT"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Name", type text}, {"Friends", type text}, {"Family", type text}}),
RemoveExtraSpaceAfterComma = Table.ReplaceValue(#"Changed Type",", ",",",Replacer.ReplaceText,{"Friends", "Family"}),
#"Unpivoted Columns" = Table.UnpivotOtherColumns(RemoveExtraSpaceAfterComma, {"Name"}, "Attribute", "Value"),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Unpivoted Columns",
{{"Value", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta
[Serialized.Text = true] in type {itemType}}}), "Value"),
#"Removed Columns" = Table.RemoveColumns(#"Split Column by Delimiter",{"Attribute"})
in
#"Removed Columns"

Split multiple columns into rows

Here's a simple base R approach in two steps:

1) split the columns:

x <- lapply(d[-1], strsplit, "|", fixed = TRUE)

2) expand and combine:

d2 <- setNames(do.call(rbind, Map(expand.grid, d$id, x$name, x$score)), names(d)) 

The result is then:

#    id name score
#1 22 c e
#2 22 e e
#3 565 m k
#4 565 q k
#5 565 m e
#6 565 q e
#7 893 w e
#8 893 w k
#9 893 w e
#10 415 w e
#11 415 s e
#12 415 e e
#13 415 w o
#14 415 s o
#15 415 e o

Split multiple rows into multiple columns

We can use a pivot query with the help of ROW_NUMBER:

WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Project) rn
FROM projects
)

SELECT
ID,
MAX(CASE WHEN rn = 1 THEN Project END) AS Col1,
MAX(CASE WHEN rn = 2 THEN Project END) AS Col2,
MAX(CASE WHEN rn = 3 THEN Project END) AS Col3,
MAX(CASE WHEN rn = 4 THEN Project END) AS Col4,
MAX(CASE WHEN rn = 5 THEN Project END) AS Col5
FROM cte
GROUP BY ID
ORDER BY ID;

If there be some other column which should determine the ordering of projects for each ID, then update the call to ROW_NUMBER.



Related Topics



Leave a reply



Submit