Data.Table := Assignments When Variable Has Same Name as a Column

data.table := assignments when variable has same name as a column

You can always use get, which allows you to specify the environment:

dt1[1, a := get("a", envir = .GlobalEnv)]
#    a
#1: 18

Or just:

a <- 42
dt1[1, a := .GlobalEnv$a]
#    a
#1: 42
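The same clash often appears inside a function, where the competing value is an argument rather than a global. A minimal sketch, assuming a hypothetical helper `set_first` and toy data (neither comes from the original question): capture the function's own environment and pass it to get.

```r
library(data.table)

# Toy table; the column deliberately shares its name with the argument below.
dt1 <- data.table(a = c(1, 2, 3))

# Hypothetical helper: `a` is both a column of `dt` and a function argument.
set_first <- function(dt, a) {
  env <- environment()                # frame that holds the argument `a`
  dt[1, a := get("a", envir = env)]   # right-hand side reads the argument, not the column
  invisible(dt)
}

set_first(dt1, 42)
dt1$a   # first element now holds the argument's value
```

Passing the environment explicitly avoids relying on .GlobalEnv, which only helps when the variable really lives at top level.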

Subsetting a data.table with a variable (when varname identical to colname)

If you don't mind doing it in 2 steps, you can compute the condition outside the scope of your data.table, where the V1 on the right-hand side refers to the variable rather than the column (though it's usually not what you want to do when working with data.table...):

wh_v1 <- my_data_table[, V1]==V1
my_data_table[wh_v1]
#   V1 V2
#1:  A  1
#2:  A  4
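For reference, here is a self-contained version of the two-step approach. The table contents are an assumption, chosen only so that the result matches the two rows printed above.

```r
library(data.table)

# Assumed toy data; the original table is not shown in the question.
my_data_table <- data.table(V1 = c("A", "B", "C", "A"), V2 = 1:4)
V1 <- "A"   # variable with the same name as the column

# Step 1: inside [ ], V1 is the column; outside, V1 is the variable.
wh_v1 <- my_data_table[, V1] == V1

# Step 2: subset with the precomputed logical vector.
res <- my_data_table[wh_v1]
res
```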

Disambiguating a variable name in a function when a column with the same name as the variable exists (data.table)

With data.table development version (1.14.3), this can be done with the new env argument, see programming on data.table:

data.table::update.dev.pkg()
source = "idref"
corpus[source=="027021335",env=list(source=source)]

#        idref iddoc    nom   prenom order       role Annee_soutenance              source       time_variable
# 1: 027021335 97466 Méhaut Philippe     0 supervisor             2011 as.character(idref) as.character(idref)
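The env mechanism can also be wrapped in a function so that neither the column name nor the value can collide with anything in the table. This is a sketch under assumptions: the helper `filter_col`, the placeholder names `col_` and `value_`, and the toy data are all hypothetical, and the env argument is assumed to require a data.table release that ships it (1.15.0 or later on CRAN, as far as I know).

```r
library(data.table)

# Assumed toy data, loosely modelled on the output above.
corpus <- data.table(
  idref = c("027021335", "031415926"),
  role  = c("supervisor", "referee")
)

# Hypothetical helper: col_ and value_ are placeholders replaced via env.
filter_col <- function(dt, col, value) {
  # A character string in env becomes a column name (symbol);
  # wrapping in I() keeps `value` as a literal string instead.
  dt[col_ == value_, env = list(col_ = col, value_ = I(value))]
}

res <- filter_col(corpus, "idref", "027021335")
res
```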

R data.table use variable name for assignment in group by

Either wrap the list of results, .(mean(xa)), in setNames, or

dt[, setNames(.(mean(xa)), cn), by = g]
#  g        sa
#1: 1 0.2010599
#2: 2 0.4710056
#3: 3 0.4871248

or call setnames on the summarised output:

setnames(dt[, mean(xa), by = g], 'V1', cn)[]
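A self-contained sketch of both variants, using assumed toy data (the original dt is not shown). It uses list() rather than the .() alias purely to keep the example conservative.

```r
library(data.table)

# Assumed toy data: two groups with easily checked means.
dt <- data.table(g = c(1L, 1L, 2L, 2L), xa = c(1, 3, 10, 20))
cn <- "sa"   # desired name of the summary column

# Variant 1: name the list returned by j.
res1 <- dt[, setNames(list(mean(xa)), cn), by = g]

# Variant 2: summarise first (column is auto-named V1), then rename by reference.
res2 <- setnames(dt[, mean(xa), by = g], "V1", cn)

res1
res2
```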

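In data.table, the := operator creates or modifies a column in the original dataset by reference. In the tidyverse context the same symbol means something different: it lets the name on the left-hand side be supplied programmatically (here by unquoting cn with !!):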

library(dplyr)
dt %>%
    group_by(g) %>% 
    summarise(!! cn := mean(xa), .groups = 'drop')
# A tibble: 3 x 2
#      g    sa
#  <int> <dbl>
#1     1 0.201
#2     2 0.471
#3     3 0.487

How to assign dynamic column names in data.table under `:=`?

We can place the values in a list or use .(...) and then assign (:=) it to new columns

carsDT[speed < 15, paste0("col", 1:2) := list(1, 2)]
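A small runnable sketch of the same idea, assuming a toy carsDT with only three rows so the effect of the i filter is easy to see. Rows that do not match the filter get NA in the new columns.

```r
library(data.table)

# Assumed toy data standing in for the original carsDT.
carsDT <- data.table(speed = c(4, 10, 20), dist = c(2, 17, 32))

# Column names built at run time; parentheses make := evaluate the variable.
new_cols <- paste0("col", 1:2)
carsDT[speed < 15, (new_cols) := list(1, 2)]

carsDT
```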

Using a variable to specify a column name within `data.table`

Data:

library(data.table)
dt = data.table(col1=letters[1:2], x=c('1','2'))

One solution is to quote the column name and then eval it inside your data.table:

y = quote(x)
dt[,eval(y):=as.numeric(eval(y))]

#> is.numeric(dt$x)
#[1] TRUE

Select / assign to data.table when variable names are stored in a character vector

Two ways to programmatically select variable(s):

with = FALSE:

 DT = data.table(col1 = 1:3)
 colname = "col1"
 DT[, colname, with = FALSE] 
 #    col1
 # 1:    1
 # 2:    2
 # 3:    3

'dot dot' (..) prefix:

 DT[, ..colname]    
 #    col1
 # 1:    1
 # 2:    2
 # 3:    3
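Both forms also work when the character vector holds several names. The sketch below uses assumed toy columns and adds a third spelling, .SD with .SDcols, for comparison.

```r
library(data.table)

# Assumed toy data with three columns of different types.
DT <- data.table(col1 = 1:3, col2 = letters[1:3], col3 = c(0.5, 1.5, 2.5))
cols <- c("col1", "col3")   # names of the columns to select

sel_with   <- DT[, cols, with = FALSE]      # with = FALSE
sel_dotdot <- DT[, ..cols]                  # 'dot dot' prefix
sel_sd     <- DT[, .SD, .SDcols = cols]     # .SD restricted by .SDcols

names(sel_with)
```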

For further description of the 'dot dot' (..) notation, see New Features in 1.10.2 (it is currently not described in help text).

To assign to variable(s), wrap the LHS of := in parentheses:

DT[, (colname) := 4:6]    
#    col1
# 1:    4
# 2:    5
# 3:    6

The latter is known as a column plonk, because you replace the whole column vector by reference. If a subset i were present, it would subassign by reference. The parens around (colname) are a shorthand introduced in version v1.9.4 (on CRAN Oct 2014). Here is the news item:

Using with = FALSE with := is now deprecated in all cases, given that wrapping
the LHS of := with parentheses has been preferred for some time.

colVar = "col1"

DT[, (colVar) := 1]                             # please change to this
DT[, c("col1", "col2") := 1]                    # no change
DT[, 2:4 := 1]                                  # no change
DT[, c("col1","col2") := list(sum(a), mean(b))]  # no change
DT[, `:=`(...), by = ...]                       # no change

See also Details section in ?`:=`:

DT[i, (colnamevector) := value]
# [...] The parens are enough to stop the LHS being a symbol
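To illustrate the difference between a column plonk and a sub-assignment mentioned earlier, here is a brief sketch with assumed toy data: with a subset in i, only the matching rows of the named column are overwritten.

```r
library(data.table)

# Assumed toy data.
DT <- data.table(col1 = 1:3, col2 = c(10, 20, 30))
colname <- "col2"   # target column, held in a variable

# Sub-assign by reference: only rows where col1 > 1 are changed.
DT[col1 > 1, (colname) := 0]

DT
```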

To answer the follow-up question in the comments, here's one way (as usual, there are many). Since with = FALSE is deprecated together with :=, the LHS is wrapped in parentheses:

DT[, (colname) := cumsum(get(colname))]
#    col1
# 1:    4
# 2:    9
# 3:   15
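When several columns need the same treatment, a common alternative to get() is to combine a parenthesised LHS with lapply over .SD. This is a sketch with assumed toy data, not part of the original answer.

```r
library(data.table)

# Assumed toy data: one integer and one double column.
DT <- data.table(col1 = 4:6, col2 = c(1, 1, 1))
cols <- c("col1", "col2")   # columns to transform, held in a variable

# .SD is restricted to `cols`; each result is assigned back to the same name.
DT[, (cols) := lapply(.SD, cumsum), .SDcols = cols]

DT
```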

Or, you might find it easier to read, write and debug if you just eval a pasted string, similar to constructing a dynamic SQL statement to send to a server:

expr = paste0("DT[,",colname,":=cumsum(",colname,")]")
expr
# [1] "DT[,col1:=cumsum(col1)]"

eval(parse(text=expr))
#    col1
# 1:    4
# 2:   13
# 3:   28

If you do that a lot, you can define a helper function EVAL :

EVAL = function(...)eval(parse(text=paste0(...)),envir=parent.frame(2))

EVAL("DT[,",colname,":=cumsum(",colname,")]")
#    col1
# 1:    4
# 2:   17
# 3:   45

Now that data.table 1.8.2 automatically optimizes j for efficiency, it may be preferable to use the eval method. The get() in j prevents some optimizations, for example.

Or, there is set(), a low-overhead, functional form of :=, which would be fine here. See ?set.

set(DT, j = colname, value = cumsum(DT[[colname]]))
DT
#    col1
# 1:    4
# 2:   21
# 3:   66
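Because set() takes the column name as an ordinary argument, it fits naturally in a loop over names. A minimal sketch, assuming toy data and a fresh table rather than the DT modified above:

```r
library(data.table)

# Assumed toy data.
DT2 <- data.table(a = 1:3, b = 4:6)

# Update each named column in place; j accepts the name as a string.
for (col in c("a", "b")) {
  set(DT2, j = col, value = cumsum(DT2[[col]]))
}

DT2
```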
