paste two data.table columns
Arun's comment answered this question:
dt[,new:=paste0(A,B)]
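As a minimal, self-contained sketch of that one-liner: the table below is hypothetical example data (only the column names A and B come from the answer), and it shows that paste0() works element-wise across the two columns.

```r
library(data.table)

# Hypothetical example data; only the column names A and B come from the answer
dt <- data.table(A = c("g", "h", "i"), B = c("l", "m", "n"))

# paste0() is vectorised, so each row's A and B are joined with no separator;
# := adds the result as a new column by reference
dt[, new := paste0(A, B)]

print(dt$new)
```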
Using := in data.table with paste()
## Start with 1st three columns of example data
dt <- exampleTable[,1:3]
## Run for 1st five years
nYears <- 5
for(ii in seq_len(nYears)-1) {
  y0 <- as.symbol(paste0("popYears", ii))
  y1 <- paste0("popYears", ii+1)
  dt[, (y1) := eval(y0)*growthRate]
}
## Check that it worked
dt
#     Site growthRate popYears0 popYears1 popYears2 popYears3 popYears4 popYears5
#1: Site 1        1.1        10      11.0     12.10    13.310   14.6410  16.10510
#2: Site 2        1.2        12      14.4     17.28    20.736   24.8832  29.85984
#3: Site 3        1.3        13      16.9     21.97    28.561   37.1293  48.26809
Edit:
Because the possibility of speeding this up using set() keeps coming up in the comments, I'll throw this additional option out there.
nYears <- 5
## Things that only need to be calculated once can be taken out of the loop
r <- dt[["growthRate"]]
yy <- paste0("popYears", seq_len(nYears+1)-1)
## A loop using set() and data.table's nice compact syntax
for(ii in seq_len(nYears)) {
  set(dt, , yy[ii+1], r*dt[[yy[ii]]])
}
## Check results
dt
#     Site growthRate popYears0 popYears1 popYears2 popYears3 popYears4 popYears5
#1: Site 1        1.1        10      11.0     12.10    13.310   14.6410  16.10510
#2: Site 2        1.2        12      14.4     17.28    20.736   24.8832  29.85984
#3: Site 3        1.3        13      16.9     21.97    28.561   37.1293  48.26809
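The answer never defines exampleTable, so the sketch below rebuilds a plausible version of it from the first three columns of the printed output (an assumption, not the asker's actual data) and reruns the set() loop so the whole thing can be tried end to end.

```r
library(data.table)

# Assumed reconstruction of exampleTable from the printed output above
exampleTable <- data.table(
  Site       = paste("Site", 1:3),
  growthRate = c(1.1, 1.2, 1.3),
  popYears0  = c(10, 12, 13)
)

dt <- copy(exampleTable)   # work on a copy so exampleTable stays untouched
nYears <- 5

r  <- dt[["growthRate"]]
yy <- paste0("popYears", 0:nYears)

# Each pass adds the next year's column by reference
for (ii in seq_len(nYears)) {
  set(dt, j = yy[ii + 1], value = r * dt[[yy[ii]]])
}

print(dt)
```

Naming the j and value arguments explicitly avoids the empty positional argument in set(dt, , ...), which some readers find easy to misread.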
Use paste inside datatables, with a string vector as an input
If you pass a vector to .SDcols, .SD is a data frame (and therefore a list) of those columns. You can't usefully paste a data frame directly, which is why the original code fails.
You can, however, use do.call to invoke a function like paste with the elements of a list passed as its arguments, e.g.
library(data.table)
# passing parameters directly to `paste` works...
paste(x = c('a', 'b'), y = c(1, 2))
#> [1] "a 1" "b 2"
# ...but passing it a data frame gets weird (working in series instead of parallel)...
paste(data.table(x = c('a', 'b'), y = c(1, 2)))
#> [1] "c(\"a\", \"b\")" "c(1, 2)"
# ...so `do.call` turns the call here into the first version
do.call(paste, data.table(x = c('a', 'b'), y = c(1, 2)))
#> [1] "a 1" "b 2"
In context, then,
data(iris)
setDT(iris)
cols <- c("Species", "Petal.Width")
iris[, pasted := do.call(paste, .SD), .SDcols = cols]
iris[, c(cols, "pasted"), with = FALSE]
#>        Species Petal.Width        pasted
#>   1:    setosa         0.2    setosa 0.2
#>   2:    setosa         0.2    setosa 0.2
#>   3:    setosa         0.2    setosa 0.2
#>   4:    setosa         0.2    setosa 0.2
#>   5:    setosa         0.2    setosa 0.2
#>  ---
#> 146: virginica         2.3 virginica 2.3
#> 147: virginica         1.9 virginica 1.9
#> 148: virginica         2.0   virginica 2
#> 149: virginica         2.3 virginica 2.3
#> 150: virginica         1.8 virginica 1.8
Alternatives to using .SDcols are the experimental .. notation:
iris[, pasted := do.call(paste, .SD[, ..cols])]
or Ananda's elegant mget, which returns a list of the variables whose names you pass it:
iris[, pasted := do.call(paste, mget(cols))]
All return the same thing.
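If a separator other than paste's default space is wanted, one common idiom is to append a named sep element to the list handed to do.call. The table and column names below are hypothetical, chosen only to illustrate the pattern.

```r
library(data.table)

# Hypothetical example table
dt <- data.table(x = c("a", "b"), y = c(1, 2), z = c(TRUE, FALSE))
cols <- c("x", "y")

# c(.SD, sep = "_") yields a list of the selected columns plus a named
# `sep` element, so do.call() effectively runs paste(x, y, sep = "_")
dt[, pasted := do.call(paste, c(.SD, sep = "_")), .SDcols = cols]

print(dt$pasted)
```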
Paste two character columns with `data.table`
Just use sep as the parameter to paste() instead of collapse:
dt[, new := paste(A, B, sep = ".")]
dt
#   L A B new
#1: 1 g l g.l
#2: 2 h m h.m
#3: 3 i n i.n
#4: 4 j o j.o
#5: 5 k p k.p
paste0() doesn't honor the sep parameter (see ?paste0).
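The difference between sep and collapse can be seen on two plain vectors. This is a small sketch using the same letters as the output above; the variable names are only for illustration.

```r
A <- c("g", "h", "i")
B <- c("l", "m", "n")

# sep goes between the arguments, element by element: one string per row
by_sep <- paste(A, B, sep = ".")

# collapse joins the resulting elements into a single string
# (the default sep = " " is still applied within each pair)
by_collapse <- paste(A, B, collapse = ".")

# paste0() has no sep argument, so sep = "." is treated as one more
# vector to paste and ends up appended to every element
with_paste0 <- paste0(A, B, sep = ".")

print(by_sep)
print(by_collapse)
print(with_paste0)
```

Inside dt[, new := ...], the collapse version returns a single string, which is then recycled down the whole column; that is usually the symptom that prompts this question.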
R Using paste in data.table to subset variable number of columns and calculate rowMeans
I believe this solves the problem you were having with paste0:
tmp <- paste0("TRAVELTIME", dt$minhr, "." , dt$minhr+1, "avg")
tmp1 <- paste0("TRAVELTIME", dt$maxhr, "." , dt$maxhr+1, "avg")
dt1 <- dt[,avg:=rowMeans(.SD[,get(tmp):get(tmp1), with=FALSE]),by=.(dt$id, dt$seqid)]
Someone will probably point out that you don't strictly need the $ in the last line, but given the nature of the problem you were having, I felt it was useful for identifying and solving it.
Using set() from data.table package to copy and paste values from a data frame to another, within a loop of data frame creation
I'm not quite sure why (I suspect assign), but it seems that your data frames (A_id, B_id, ...) are not distinct objects: they are just different names pointing to the same object in RAM.
A workaround is to use data.table::copy to make a copy of the object in RAM.
for (i in seq_along(letters)){
  assign(paste0(letters[i],"_id"), copy(basedata))
  set(get(paste0(letters[i],"_id")), NULL, j = 1L, value = Values_df[,i]) #PROBLEM
}
NB: This will solve your problem, but as @MichaelChirico said, cluttering your namespace with loads of tiny tables is probably the wrong way to do this.
References: As suggested by @Frank, here is a reference on copy versus reference of data.table objects.
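The reference-versus-copy behaviour can be sketched on a small table. The objects below are hypothetical stand-ins for basedata and its derived tables, not the asker's data.

```r
library(data.table)

# Hypothetical stand-in for basedata
basedata <- data.table(id = 1:3, value = c(0, 0, 0))

alias       <- basedata        # plain assignment: a second name for the same table
independent <- copy(basedata)  # copy(): a separate object in memory

# set() modifies its target by reference, so the change is visible
# through every name bound to that table
set(alias, j = "value", value = c(1, 2, 3))

print(basedata$value)     # changed along with alias
print(independent$value)  # untouched
```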
paste, by and data.table in r
For completeness' sake, an official answer:
If you use paste(y,collapse=",") instead, it should work.
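A minimal sketch of what that looks like with by, assuming a hypothetical table with a grouping column x and a character column y (the original question's data is not shown here):

```r
library(data.table)

# Hypothetical example data
dt <- data.table(x = c("a", "a", "b"), y = c("p", "q", "r"))

# Within each group, collapse = "," joins all of that group's y values
# into a single comma-separated string
res <- dt[, .(y = paste(y, collapse = ",")), by = x]

print(res)
```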
How to paste in data.table using a vector as column reference?
We could use get:
dt[, (var) := paste0('id_', get(var))]
Output:
> dt
      id amount
 1: id_a      1
 2: id_b      2
 3: id_c      3
 4: id_d      4
 5: id_e      5
 6: id_f      6
 7: id_g      7
 8: id_h      8
 9: id_i      9
10: id_j     10
Or use the standard approach with .SD or .SDcols:
dt[, (var) := paste0('id_', .SD[[var]])]
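When more than one column needs the same prefix, the .SDcols form extends naturally with lapply. In the sketch below the table, the extra code column, and the vars vector are all hypothetical.

```r
library(data.table)

# Hypothetical example data with two columns to prefix
dt <- data.table(id = c("a", "b"), code = c("x", "y"), amount = 1:2)
vars <- c("id", "code")

# .SD holds only the columns named in .SDcols; lapply() returns one
# pasted vector per column, and (vars) := assigns them back in place
dt[, (vars) := lapply(.SD, function(col) paste0("id_", col)), .SDcols = vars]

print(dt)
```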