Dynamic column names in data.table
From data.table 1.9.4 onward, you can just do this:
## A parenthesized symbol, `(cn)`, gets evaluated to "blah" before `:=` is carried out
test_dtb[, (cn) := mean(a), by = id]
head(test_dtb, 4)
# a b id blah
# 1: 41 19 1 54.2
# 2: 4 99 2 50.0
# 3: 49 85 3 46.7
# 4: 61 4 4 57.1
See Details in ?`:=`:
DT[i, (colvector) := val]
[...] NOW PREFERRED [...] syntax. The parens are enough to stop the LHS being a symbol; same as c(colvector)
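The same parenthesized form also works when the variable holds several names. A minimal sketch, assuming a small made-up table (dt, new_cols, and the values are illustrative and not from the question):

```r
library(data.table)

# Hypothetical example data; names and values are illustrative only
dt <- data.table(id = rep(1:2, each = 3),
                 a  = 1:6,
                 b  = c(10, 20, 30, 40, 50, 60))

# Names of the new columns, held in a character vector
new_cols <- c("mean_a", "mean_b")

# (new_cols) is evaluated to the character vector before := is carried out,
# so one column is added by reference for each element of the RHS list
dt[, (new_cols) := list(mean(a), mean(b)), by = id]

names(dt)
```

Without the parentheses, new_cols := ... would create a single column literally named "new_cols".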
Original answer:
You were on exactly the right track: constructing an expression to be evaluated within the call to [.data.table is the data.table way to do this sort of thing. Going just a bit further, why not construct an expression that evaluates to the entire j argument (rather than just its left-hand side)?
Something like this should do the trick:
## Your code so far
library(data.table)
test_dtb <- data.table(a=sample(1:100, 100),b=sample(1:100, 100),id=rep(1:10,10))
cn <- "blah"
## One solution
expr <- parse(text = paste0(cn, ":=mean(a)"))
test_dtb[,eval(expr), by=id]
## Checking the result
head(test_dtb, 4)
# a b id blah
# 1: 30 26 1 38.4
# 2: 83 82 2 47.4
# 3: 47 66 3 39.5
# 4: 87 23 4 65.2
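If pasting code together as a string feels fragile, the same j expression can probably be built from language objects instead. This is a sketch of an alternative, not part of the original answer; it assumes the same kind of table as above, and nm is just a placeholder symbol chosen for illustration:

```r
library(data.table)

set.seed(1)
dt <- data.table(a = sample(1:100, 100), id = rep(1:10, 10))
cn <- "blah"

# Build the call `blah := mean(a)` without pasting strings:
# substitute() swaps the placeholder symbol nm for the symbol named by cn
expr <- substitute(nm := mean(a), list(nm = as.name(cn)))

# As with the parse() version, eval(expr) is resolved inside [.data.table
dt[, eval(expr), by = id]

head(dt, 4)
```

For a plain assignment like this, the (cn) := ... form shown at the top is simpler; building the call is mainly useful when the right-hand side is dynamic too.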
How to assign dynamic column names in data.table under `:=`?
We can place the values in a list, or use .(...), and then assign them to new columns with :=
carsDT[speed < 15, paste0("col", 1:2) := list(1, 2)]
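The line above assumes carsDT already exists. A self-contained sketch, assuming carsDT is simply the built-in cars data set converted to a data.table (the question's actual data may differ):

```r
library(data.table)

# Assumption: carsDT is the built-in `cars` data set as a data.table
carsDT <- as.data.table(cars)

# The LHS is a call, so it is evaluated to c("col1", "col2");
# each element of the RHS list fills the matching new column
carsDT[speed < 15, paste0("col", 1:2) := list(1, 2)]

# Rows that do not match the filter in i are left as NA in the new columns
head(carsDT[speed >= 15])
```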
R data.table dynamic column name of group by returning new table
We can use setNames
library(data.table)
dt[, setNames(list(mean(a)), column_name), by = id]
# id mean
# 1: 1 56.8
# 2: 2 50.5
# 3: 3 50.5
# 4: 4 42.4
# 5: 5 49.9
# 6: 6 47.8
# 7: 7 60.6
# 8: 8 57.4
# 9: 9 54.6
#10: 10 34.5
data
set.seed(123)
dt <- data.table(a = sample(1:100, 100), b = sample(1:100, 100), id = rep(1:10,10))
column_name <- "mean"
Create a data.frame in R with dynamically assigned column names
Does this help?
goalsMenu <- paste("Name", 1:40, sep="")
output <- as.data.frame(matrix(rep(0, 5 + length(goalsMenu)), nrow=1))
names(output) <- c("analysis", "patient", "date", goalsMenu, "CR1", "CR2")
Basically, I create a data.frame output with the right number of columns first and name those columns in the next step. However, be aware of mdsumner's comment! This way, all columns are of class numeric. You can deal with that later though: change the class of columns in data.frame
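As a rough sketch of that later step, individual columns can be converted after the names are set. Which columns should hold text or dates is an assumption made here for illustration; the question does not say:

```r
goalsMenu <- paste0("Name", 1:40)
output <- as.data.frame(matrix(rep(0, 5 + length(goalsMenu)), nrow = 1))
names(output) <- c("analysis", "patient", "date", goalsMenu, "CR1", "CR2")

# Assumption: patient should hold text and date should hold dates
output$patient <- as.character(output$patient)
output$date    <- as.Date(output$date, origin = "1970-01-01")

# Inspect the classes of the first few columns
sapply(output[1:4], class)
```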
Pass column name in data.table using variable
Use the quote() and eval() functions to pass a variable to j. You don't need double quotes around the column names when you do it this way, because the quote()-ed expression is evaluated inside DT[]:
temp <- quote(x)
DT[ , eval(temp)]
# [1] "b" "b" "b" "a" "a"
With a single column name, the result is a vector. If you want a data.table result, or several columns, use list form
temp <- quote(list(x, v))
DT[ , eval(temp)]
# x v
# 1: b 1.52566586
# 2: b 0.66057253
# 3: b -1.29654641
# 4: a -1.71998260
# 5: a 0.03159933
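When the column names arrive as character strings rather than as typed code, as.name() can stand in for quote(). A minimal sketch, assuming a small made-up DT (the values, col, and cols are illustrative only):

```r
library(data.table)

# Hypothetical table with the same column names as the example above
DT <- data.table(x = c("b", "b", "a"), v = c(1.5, 0.7, -1.3))

# Single column: as.name("x") builds the same symbol as quote(x)
col  <- "x"
temp <- as.name(col)
res_vec <- DT[, eval(temp)]      # a plain vector

# Several columns: build the call list(x, v) from a character vector
cols  <- c("x", "v")
temp2 <- as.call(c(as.name("list"), lapply(cols, as.name)))
res_dt <- DT[, eval(temp2)]      # a data.table
```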
dynamic column names seem to work when := is used but not when = is used in data.table
One option is to use the base R function setNames
aggregate_mtcars <- mtcars_copy[, setNames(.(sum(carb)), new_col)]
Or you could use data.table::setnames
aggregate_mtcars <- setnames(mtcars_copy[, .(sum(carb))], new_col)
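Neither line above defines its inputs. A self-contained sketch, assuming mtcars_copy is the built-in mtcars converted to a data.table and new_col is a name picked for illustration; list() is used in place of .() only to keep the sketch independent of how .() is handled inside nested calls:

```r
library(data.table)

# Assumptions: a data.table copy of mtcars and an illustrative column name
mtcars_copy <- as.data.table(mtcars)
new_col <- "total_carb"

# Option 1: base R setNames() names the list returned from j
agg1 <- mtcars_copy[, setNames(list(sum(carb)), new_col)]

# Option 2: data.table::setnames() renames the result by reference
agg2 <- setnames(mtcars_copy[, list(sum(carb))], new_col)

names(agg1)
names(agg2)
```

Both give a one-row data.table whose single column carries the name stored in new_col.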
Dynamically add column names to data.table when aggregating
As mentioned in the comments by lukeA, setNames can be used:
m <- c("blah", "foo")
test_dtb[ , setNames(list(mean(b), median(b)), m), by = id]