data table string concatenation of SD columns for by group values
You can concatenate all columns in .SD using lapply.
dt[, lapply(.SD, paste0, collapse=" "), by = ID]
##    ID       a       b
## 1:  1   a b c   A B C
## 2:  2 d e f g D E F G
## 3:  3   h i j   H I J
Using a newline character as the collapse argument instead of " "
does work, but the result does not print the way your desired output suggests you expect.
dt[, lapply(.SD, paste0, collapse="\n"), by = ID]
##    ID          a          b
## 1:  1    a\nb\nc    A\nB\nC
## 2:  2 d\ne\nf\ng D\nE\nF\nG
## 3:  3    h\ni\nj    H\nI\nJ
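The newline really is stored in each string; print() only displays it in escaped form. A minimal sketch, assuming a small made-up dt (the original data is not shown in this answer), that uses cat() to render one cell:

```r
library(data.table)

# Hypothetical input; the original dt is not shown in this answer
dt <- data.table(ID = c(1, 1, 2), a = c("a", "b", "d"))

res <- dt[, lapply(.SD, paste0, collapse = "\n"), by = ID]

# print() shows the newline escaped as \n; cat() writes the actual line break
cat(res$a[1], "\n")
```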
As pointed out in the comments by @Frank, the question has been changed to use , as a separator instead of \n. Of course, you can just change the collapse argument to ",". If you want a space as well (", "), then the solution by @DavidArenburg, which uses toString, is preferable.
dt[, lapply(.SD, paste0, collapse=","), by = ID]
dt[, lapply(.SD, toString), by = ID]
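To make the difference between the two calls concrete, here is a small sketch. The input dt is an assumption, made up to resemble the output shown earlier; toString(x) is equivalent to paste(x, collapse = ", ").

```r
library(data.table)

# Hypothetical input made up to resemble the output shown earlier
dt <- data.table(
  ID = c(1, 1, 1, 2, 2),
  a  = c("a", "b", "c", "d", "e"),
  b  = c("A", "B", "C", "D", "E")
)

comma  <- dt[, lapply(.SD, paste0, collapse = ","), by = ID]
spaced <- dt[, lapply(.SD, toString), by = ID]

comma$a   # "a,b,c"   "d,e"
spaced$a  # "a, b, c" "d, e"
```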
R data table group values into vector by group
We need a list column. The .( is a concise syntax for list( in data.table.
dat[, .(j = .(j)), by = .(gr, day)]
-output
   gr day     j
1:  9   1   3,8
2:  9   2  9,11
3: 10   1 10,28
4: 10   2     5
i.e. it can be otherwise written as
dat[, list(j = list(j)), .(gr, day)]
   gr day     j
1:  9   1   3,8
2:  9   2  9,11
3: 10   1 10,28
4: 10   2     5
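The printed 3,8 is a single cell holding a whole vector, not a string. A short sketch, assuming input values chosen to match the output above (the original dat is not shown), that inspects the list column:

```r
library(data.table)

# Hypothetical input chosen to match the output shown above
dat <- data.table(
  gr  = c(9, 9, 9, 9, 10, 10, 10),
  day = c(1, 1, 2, 2, 1, 1, 2),
  j   = c(3, 8, 9, 11, 10, 28, 5)
)

res <- dat[, .(j = .(j)), by = .(gr, day)]

# Each cell of the list column holds the full numeric vector for its group
res$j[[1]]      # 3 8
lengths(res$j)  # 2 2 2 1
```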
concatenate names and values across columns in data.table, row by row
We could paste the corresponding names with Map
d[,
  .(x, y, labs = do.call(paste, c(Map(function(u, v)
    paste0(v, ": ", u), .SD, labs_to_get), sep = ", "))),
  .SDcols = labs_to_get
]
-output
       x     y       labs
   <num> <num>     <char>
1:     1     3 x: 1, y: 3
2:     2     4 x: 2, y: 4
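The Map approach works in two stages, which can be sketched separately. The table d and the vector labs_to_get below are assumptions inferred from the output above, not taken from the question itself.

```r
library(data.table)

# Assumed inputs, inferred from the output shown above
d <- data.table(x = c(1, 2), y = c(3, 4))
labs_to_get <- c("x", "y")

# Stage 1: Map() pairs each column with its name and builds "name: value" vectors
pieces <- Map(function(u, v) paste0(v, ": ", u), d[, ..labs_to_get], labs_to_get)
pieces$x  # "x: 1" "x: 2"

# Stage 2: do.call(paste, ...) joins those vectors element-wise, i.e. row by row
labs <- do.call(paste, c(pieces, sep = ", "))
labs      # "x: 1, y: 3" "x: 2, y: 4"
```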
Another option is write.dcf
d[, labs := do.call(paste,
      c(as.list(setdiff(capture.output(write.dcf(.SD)), "")),
        sep = ", ")), 1:nrow(d)]
> d
       x     y       labs
   <num> <num>     <char>
1:     1     3 x: 1, y: 3
2:     2     4 x: 2, y: 4
Or use apply to loop over the rows
d[, labs := apply(.SD, 1, \(x) paste(names(x), x, sep = ": ",
      collapse = ", ")), .SDcols = labs_to_get]
Or using tidyverse
library(dplyr)
library(stringr)
# do.call() is used here because purrr::invoke() is deprecated
d %>%
  mutate(labs = do.call(str_c, c(across(all_of(labs_to_get),
    ~ str_c(cur_column(), ": ", .x)), sep = ", ")))
       x     y       labs
   <num> <num>     <char>
1:     1     3 x: 1, y: 3
2:     2     4 x: 2, y: 4
R data.table join with concatenation of rows for a column
Another solution based on data.table:
dt2[dt1[, paste(Name, collapse="; "), by=Id], Name := i.V1, on="Id"]
dt2
Id Flavor Name
1: 1 Sweet Apple; Orange
2: 2 Bland Banana
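The one-liner combines a grouped aggregation with an update join. A sketch that splits the two steps, assuming dt1 and dt2 look like the tables implied by the output above (the originals are not shown):

```r
library(data.table)

# Hypothetical tables reconstructed from the output shown above
dt1 <- data.table(Id = c(1, 1, 2), Name = c("Apple", "Orange", "Banana"))
dt2 <- data.table(Id = c(1, 2), Flavor = c("Sweet", "Bland"))

# Step 1: collapse names per Id; the unnamed result column is called V1
collapsed <- dt1[, paste(Name, collapse = "; "), by = Id]

# Step 2: update join; the i. prefix refers to columns of the table passed in i
dt2[collapsed, Name := i.V1, on = "Id"]

dt2$Name  # "Apple; Orange" "Banana"
```

Because := modifies dt2 by reference, no reassignment is needed.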
concatenate values across columns in data.table, row by row
You can use do.call(), using .SDcols to supply the columns.
x[, key_ := do.call(paste, c(.SD, sep = "_")), .SDcols = names(x)]
.SDcols = names(x) supplies all the columns of x. You can supply any vector of names or column numbers there.
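As an illustration of restricting the key to a subset of columns, here is a minimal sketch. The table and its column names are made up for the example.

```r
library(data.table)

# Hypothetical table; column names are made up for illustration
x <- data.table(a = c("p", "q"), b = c(1, 2), z = c("u", "v"))

# Key built from a subset of columns selected by name
x[, key_ab := do.call(paste, c(.SD, sep = "_")), .SDcols = c("a", "b")]

# The same subset selected by column position
x[, key_12 := do.call(paste, c(.SD, sep = "_")), .SDcols = 1:2]

x$key_ab  # "p_1" "q_2"
```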
Concatenate varying number of columns by partial match (R)
You can use the Reduce function to paste the selected columns together, picking the columns with grep inside the .SD syntax. Here is an example that gets the result using the data.table package:
library(stringi); library(data.table)
myTable2[, paste(stri_trans_totitle(whatToMatch), "final", sep = "_") :=
  lapply(whatToMatch, function(wtm) Reduce(function(x, y) paste(x, y, sep = ""),
    .SD[, grep(wtm, names(myTable2)), with = FALSE]))]
myTable2
# herenow before1 before2 before3 after1 after2 after3 Before_final After_final
# 1: 0.3399679 if and where not here blank ifandwhere nothereblank
# 2: 0.8181909 for in by through blank blank forinby throughblankblank
# 3: 0.2237681 and where mine yours ours andwhere mineyoursours
# 4: 0.6161998 and where ha hey hon andwhere haheyhon
# 5: 0.7606252 fifth eighth and where not beet fiftheighthand wherenotbeet
# 6: 0.5525105 and where not fill are andwherenot filler
A benchmark of do.call and Reduce:
dim(myTable2)
# [1] 1572864 9
reduce <- function() myTable2[, paste(stri_trans_totitle(whatToMatch[1:2]), "final", sep = "_") := lapply(whatToMatch[1:2], function(wtm) Reduce(function(x,y) paste(x, y, sep = ""), .SD[, grep(wtm, names(myTable2)), with = F]))]
docall <- function() myTable2[, paste(stri_trans_totitle(whatToMatch[1:2]), "final", sep = "_") := lapply(whatToMatch[1:2], function(wtm) do.call(paste, c(sep = "", .SD[, grep(wtm, names(myTable2)), with = F])))]
microbenchmark::microbenchmark(docall(), reduce(), times = 10)
# Unit: milliseconds
# expr min lq mean median uq max neval
# docall() 707.7818 722.6037 767.8923 737.6272 852.4909 868.8202 10
# reduce() 999.4925 1009.5146 1026.6200 1020.4637 1046.7073 1067.7479 10
Concatenate strings by group with dplyr for multiple columns
For these purposes, there are the summarise_all, summarise_at, and summarise_if functions. Using summarise_all:
df %>%
group_by(Sample) %>%
  summarise_all(~ paste(na.omit(.), collapse = ","))
# A tibble: 3 × 5
Sample group Gene1 Gene2 Gene3
<chr> <chr> <chr> <chr> <chr>
1 A 1,2 a,b
2 B 1 c
3 C 1,2,3 a,b,c d,e
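In dplyr 1.0.0 and later the scoped verbs such as summarise_all are superseded by across(). A sketch of the equivalent call, assuming a small made-up df (the original data is not shown):

```r
library(dplyr)

# Hypothetical input; the original df is not shown in this answer
df <- tibble(
  Sample = c("A", "A", "B"),
  Gene1  = c("a", "b", NA),
  Gene2  = c(NA, NA, "c")
)

res <- df %>%
  group_by(Sample) %>%
  summarise(across(everything(), ~ paste(na.omit(.x), collapse = ",")))

# A group whose values are all NA collapses to an empty string
res$Gene1  # "a,b" ""
res$Gene2  # ""    "c"
```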
How can apply a function using data-table?
You could try this:
as.data.table(mpg)[,paste(unique(manufacturer),collapse="_"),by=fl]
Or, if your function is more elaborate you could write it separately:
myfun <- function(x){
u_x <- unique(x)
return(paste(u_x,collapse="_"))
}
res <- as.data.table(mpg)[,myfun(manufacturer),by=fl]
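The same pattern can be tried without loading ggplot2 for mpg. The sketch below uses a small made-up stand-in table and names the result column, since an unnamed j expression ends up as V1.

```r
library(data.table)

# Hypothetical stand-in for the mpg data, to avoid depending on ggplot2
cars <- data.table(
  manufacturer = c("audi", "audi", "ford", "ford", "honda"),
  fl           = c("p", "p", "r", "p", "r")
)

myfun <- function(x) paste(unique(x), collapse = "_")

# Wrapping the call in .() lets us name the result column
res <- cars[, .(manufacturers = myfun(manufacturer)), by = fl]

res$manufacturers  # "audi_ford" "ford_honda"
```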