Collapse / concatenate / aggregate multiple columns to a single comma separated string within each group
We can group by 'A' and 'B', then use summarise_at to paste all the non-NA elements:
library(dplyr)
data %>%
group_by(A, B) %>%
summarise_at(vars(-group_cols()), ~ toString(.[!is.na(.)]))
# A tibble: 2 x 5
# Groups: A [2]
# A B C D E
# <dbl> <dbl> <chr> <chr> <chr>
#1 111 100 1, 2 15, 16, 17 1
#2 222 200 1, 2 18, 19, 20 1
If we need a custom delimiter, use paste (with collapse) or str_c:
library(stringr)
data %>%
group_by(A, B) %>%
summarise_at(vars(-group_cols()), ~ str_c(.[!is.na(.)], collapse="_"))
Or, in base R, with aggregate:
aggregate(. ~ A + B, data, FUN = function(x)
toString(x[!is.na(x)]), na.action = NULL)
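For comparison, the same group-and-collapse logic can be sketched in plain Python. This is only a rough illustration, not part of the answer above: the sample rows and the helper name collapse_by_group are hypothetical, and None stands in for NA.

```python
from collections import defaultdict

def collapse_by_group(rows, keys, sep=", "):
    """Join the non-missing values of every non-key column within each group."""
    groups = defaultdict(lambda: defaultdict(list))
    for row in rows:
        group_key = tuple(row[k] for k in keys)
        for col, value in row.items():
            if col in keys:
                continue
            # Touch the bucket first so the column exists even if all values are missing.
            bucket = groups[group_key][col]
            if value is not None:
                bucket.append(str(value))
    result = []
    for group_key, cols in groups.items():
        out = dict(zip(keys, group_key))
        out.update({col: sep.join(vals) for col, vals in cols.items()})
        result.append(out)
    return result

# Hypothetical rows (not the original data); None plays the role of NA.
rows = [
    {"A": 111, "B": 100, "C": 1, "D": 15},
    {"A": 111, "B": 100, "C": 2, "D": None},
    {"A": 222, "B": 200, "C": None, "D": 18},
]
collapsed = collapse_by_group(rows, keys=("A", "B"), sep="_")
```

Passing sep=", " gives the toString-style output; any other delimiter corresponds to the paste or str_c variant.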
Concatenate several columns to comma separated strings by group
You can use aggregate with paste for each column and merge the results at the end:
x <- structure(list(SNP = structure(c(1L, 1L, 2L, 3L, 4L, 4L, 5L,
5L), .Label = c("chr1.111642529", "chr1.111801684", "chr1.111925084",
"chr1.11801605", "chr1.151220354"), class = "factor"), hu_mRNA = structure(c(3L,
4L, 2L, 7L, 1L, 8L, 5L, 6L), .Label = c("AK027740", "BC098118",
"NM_002107", "NM_005324", "NM_018913", "NM_018918", "NM_020435",
"NM_032849"), class = "factor"), gene = structure(c(4L, 5L, 1L,
3L, 1L, 2L, 6L, 7L), .Label = c("<NA>", "C13orf33", "GJC2", "H3F3A",
"H3F3B", "PCDHGA10", "PCDHGA5"), class = "factor")), .Names = c("SNP",
"hu_mRNA", "gene"), class = "data.frame", row.names = c(NA, -8L
))
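# collapse (not sep) is what joins the values of each group into one string
a1 <- aggregate(hu_mRNA ~ SNP, data = x, FUN = paste, collapse = ", ")
a2 <- aggregate(gene ~ SNP, data = x, FUN = paste, collapse = ", ")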
merge(a1,a2)
SNP hu_mRNA gene
1 chr1.111642529 NM_002107, NM_005324 H3F3A, H3F3B
2 chr1.111801684 BC098118 <NA>
3 chr1.111925084 NM_020435 GJC2
4 chr1.11801605 AK027740, NM_032849 <NA>, C13orf33
5 chr1.151220354 NM_018913, NM_018918 PCDHGA10, PCDHGA5
Concatenate several columns as comma-separated string
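You can achieve this with NULLIF. It turns empty strings into NULL, so the COALESCE around each term drops the separator for a missing name, and STUFF then removes the leading comma.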
SELECT Id, STUFF(COALESCE(N',' + NULLIF(Name1, ''), N'') + COALESCE(N',' + NULLIF(Name2, ''), N'')
+ COALESCE(N',' + NULLIF(Name3, ''), N''), 1, 1, '') AS ConcateStuff
FROM #Temp;
Result
Id ConcateStuff
-----------------
1 Name1,Name3
2 Name1,Name2,Name3
3 Name3
4 Name3
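The skip-empty-then-join idea can also be sketched in Python. This is an illustration under assumptions: the rows below are hypothetical stand-ins for #Temp (the original table contents are not shown), and join_non_empty is a made-up helper name.

```python
def join_non_empty(*values, sep=","):
    """Join values, skipping None and empty strings (the role of NULLIF + COALESCE)."""
    return sep.join(v for v in values if v)

# Hypothetical rows shaped like #Temp: (Id, Name1, Name2, Name3)
temp_rows = [
    (1, "Name1", "", "Name3"),
    (2, "Name1", "Name2", "Name3"),
    (3, None, None, "Name3"),
    (4, "", "", "Name3"),
]
result = [(row_id, join_non_empty(*names)) for row_id, *names in temp_rows]
```

Because missing values are filtered out before joining, there is no leading separator to strip afterwards, which is the job STUFF does in the SQL version.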
Concatenating multiple column values inside a variable by comma separated
You can use group by to group the records by, for example, UserName, and then aggregate the client names with string.Join(",", ClientName) to concatenate them.
Here is some sample code:
var userClients = from c in (dbContext joined tables)  // placeholder for your joined query
                  group c by c.UserName into u
                  select new {
                      UserName = u.Key,  // the grouping key is the user name
                      ClientName = string.Join(",", u.Select(n => n.ClientName))
                  };
Joining multiple rows into comma separated strings by group in Python
Try this:
- Create a dictionary that has all the required columns except ID as keys and lambda x: list(x) as the function.
- Use groupby with agg to apply the functions to each column independently.
- If you want to convert the list to a concatenated string, just change the lambda function to lambda x: ', '.join(list(x)).
g = {i:lambda x: ', '.join(list(x)) for i in df.columns[1:]}
output = df.groupby(['ID']).agg(g).reset_index()
print(output)
ID Award Type Date
0 01 PELL, SCH FED, LOC 2021-06-01, 2021-06-01
1 02 SCH LOC 2021-06-04
2 03 GRANT, PELL, SCH STA, FED, LOC 2021-06-02, 2021-06-15, 2021-07-01
EDIT:
If the goal is only a comma-separated string, a shorter way, as suggested by @Henry Ecker, is to pass the join method itself to the aggregate:
output = df.groupby(['ID'], as_index=False).agg(', '.join)
Concatenate SQL columns with comma separated
You can concat the separators conditionally. This will output an empty string when all of the columns are null or empty.
-- len(...) > 0 so that single-character values also get a separator
select concat(col1,
case when len(col2)>0 then ',' else '' end,
col2,
case when len(col3)>0 then ',' else '' end,
col3)
from your_table;
To return NULL instead of an empty string when the result is empty, wrap the concat inside a nullif like this:
-- len(...) > 0 so that single-character values also get a separator
select nullif(concat(col1,
case when len(col2)>0 then ',' else '' end,
col2,
case when len(col3)>0 then ',' else '' end,
col3),'')
from your_table;
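To make the intent of the conditional separators easier to see, here is a rough Python sketch of the same logic. The function name concat_with_separators and its null_if_empty flag are hypothetical; None stands in for NULL, and a separator is emitted before every later column that is non-empty.

```python
def concat_with_separators(*cols, sep=",", null_if_empty=False):
    """Concatenate columns, adding sep only before a non-empty later column."""
    parts = ["" if c is None else str(c) for c in cols]
    out = ""
    for i, part in enumerate(parts):
        if i > 0 and part:  # the first column never gets a leading separator
            out += sep
        out += part
    if null_if_empty and out == "":
        return None  # mirrors wrapping the concat in nullif(..., '')
    return out
```

As in the query, an empty first column followed by a non-empty one still produces a leading separator, so this variant suits data where the first column is always populated.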
SQL Server Concatenate three different columns into a Comma-Separated without repeated values
This works without window functions. The union also removes the repeated values. It might slow things down, but give it a try and see whether you can tolerate the performance.
with
cte1 (id, col, indicator) as
(select id, column_a, 'col1' from t union
select id, column_b, 'col2' from t union
select id, column_c, 'col3' from t),
cte2 (id, indicator, agg) as
(select id, indicator, string_agg(col,',')
from cte1
group by id, indicator)
select id,
max(case when indicator='col1' then agg end) as column_a,
max(case when indicator='col2' then agg end) as column_b,
max(case when indicator='col3' then agg end) as column_c
from cte2
group by id;
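The deduplicate-then-aggregate idea, sketched in Python for illustration. The sample rows and the helper name distinct_join_per_column are assumptions; a column with no values becomes an empty string here, and values keep their first-seen order.

```python
from collections import defaultdict

def distinct_join_per_column(rows, key, columns, sep=","):
    """For each key, join the distinct non-missing values of every column."""
    seen = defaultdict(lambda: {col: {} for col in columns})
    for row in rows:
        for col in columns:
            value = row[col]
            if value is not None:
                # Dict keys act as an ordered set: duplicates collapse, order is kept.
                seen[row[key]][col][str(value)] = None
    return [
        {key: k, **{col: sep.join(vals) for col, vals in cols.items()}}
        for k, cols in seen.items()
    ]

# Hypothetical rows shaped like table t.
rows = [
    {"id": 1, "column_a": "x", "column_b": "p", "column_c": "m"},
    {"id": 1, "column_a": "x", "column_b": "q", "column_c": "m"},
    {"id": 2, "column_a": "y", "column_b": "p", "column_c": None},
]
deduped = distinct_join_per_column(rows, "id", ["column_a", "column_b", "column_c"])
```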
Pandas groupby concat ungrouped column into comma separated string
Try groupby and agg like so:
(df.groupby(['col1', 'col2', 'col3'])['doc_no']
.agg(['count', ('doc_no', lambda x: ','.join(map(str, x)))])
.sort_values('count', ascending=False)
.reset_index())
col1 col2 col3 count doc_no
0 a x f 3 0,1,5
1 d x t 2 5,6
2 b x g 1 2
3 b y g 1 3
4 c x t 1 3
5 c y t 1 4
agg is simple to use because you can specify a list of reducers to run on a single column.
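To show what the two reducers compute, here is a plain-Python sketch of the same count-and-join, without pandas. The records are hypothetical (the original input frame is not shown), laid out as (col1, col2, col3, doc_no).

```python
from collections import defaultdict

# Hypothetical input rows: (col1, col2, col3, doc_no)
records = [
    ("a", "x", "f", 0),
    ("a", "x", "f", 1),
    ("b", "x", "g", 2),
    ("a", "x", "f", 5),
    ("d", "x", "t", 6),
    ("d", "x", "t", 7),
]

# Collect the doc_no values of each (col1, col2, col3) group.
grouped = defaultdict(list)
for col1, col2, col3, doc_no in records:
    grouped[(col1, col2, col3)].append(doc_no)

# One row per group: the key columns, the count, and the joined doc_no string,
# sorted by count in descending order.
summary = sorted(
    ((*key, len(docs), ",".join(map(str, docs))) for key, docs in grouped.items()),
    key=lambda r: r[3],
    reverse=True,
)
```

Each group is reduced twice, once with len for the count and once with join for the string, which is what passing two reducers to agg does.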
Return grouped multiple concatenated columns to a comma delimited string column
Select A.*
,DistplayString = (Select Stuff((Select Distinct concat(', ',CarName,' (',Milage,'km)')
From YourTable
Where Year=A.Year and Category=A.Category
For XML Path ('')),1,2,'') )
From (Select Distinct Year, Category From YourTable) A
Returns (Thanks to Alan's Table Variable +1)
Year Category DistplayString
2012 GroupA Mercedes (200km), Porsche (100km)
2013 GroupA Ferrari (300km)
2013 GroupB Beetle (200km), Uno (200km)
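For readers who want to check the logic outside SQL Server, a rough Python sketch of the same distinct-label aggregation follows. The input rows are hypothetical (reconstructed to resemble the result above, with one duplicate added), and labels are sorted only to make the output deterministic.

```python
from collections import defaultdict

# Hypothetical rows: (Year, Category, CarName, Milage)
cars = [
    (2012, "GroupA", "Porsche", 100),
    (2012, "GroupA", "Mercedes", 200),
    (2012, "GroupA", "Porsche", 100),  # duplicate row, removed by the set below
    (2013, "GroupA", "Ferrari", 300),
    (2013, "GroupB", "Uno", 200),
    (2013, "GroupB", "Beetle", 200),
]

# Build the distinct "CarName (Milagekm)" labels for each (Year, Category).
labels = defaultdict(set)
for year, category, car_name, mileage in cars:
    labels[(year, category)].add(f"{car_name} ({mileage}km)")

# Join the labels of each group into one comma-delimited string.
display = {key: ", ".join(sorted(vals)) for key, vals in sorted(labels.items())}
```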