Subsetting Data.Table Using Variables with Same Name as Column

Subsetting a data.table with a variable (when varname identical to colname)

If you don't mind doing it in two steps, you can do the comparison outside the scope of your data.table (though that is usually not what you want when working with data.table...):

wh_v1 <- my_data_table[, V1]==V1
my_data_table[wh_v1]
# V1 V2
#1: A 1
#2: A 4
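
To make the two-step version reproducible, here is a minimal sketch. The sample data and the value of V1 are assumptions (the snippet above does not show them), chosen only so that the result matches the printed rows.

```r
library(data.table)

# Assumed sample data; not taken from the original question
my_data_table <- data.table(V1 = c("A", "B", "C", "A", "B", "C"), V2 = 1:6)
V1 <- "A"  # variable with the same name as the column

# Step 1: extract the column as a plain vector, then compare it outside [ ],
# where V1 refers to the variable rather than the column
wh_v1 <- my_data_table[, V1] == V1

# Step 2: use the resulting logical vector to pick rows
result <- my_data_table[wh_v1]
print(result)
```

The comparison happens after the column has already been pulled out, so there is no ambiguity about which V1 is meant.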

data.table - subsetting based on variable whose name is a column, too

data.table evaluates the expression inside [ ] in the scope of the data table itself, so a name that matches a column resolves to the column. You may need to say explicitly where the other value should come from:

DT[cyl == get("cyl", envir = parent.frame())]

Subsetting data frame using variable with same name as column

You can also specify the environment you're working with:

x<-data.frame(
start=sample(3,20,replace=TRUE),
someValue=runif(20))

env<-environment()
start<-3
cat("\nDefault scope:")
print(subset(x,start==start)) # all entries, as start==start is evaluated to TRUE

cat("\nSpecific environment:")
print(subset(x,start==get('start',env))) # second start is replaced by its value in former environment. Equivalent to subset(x,start==3)

r data.table row subset with column name as a variable

I guess you are looking for get:

library(data.table)

DT <- data.table(x1=1:11, x2=11:21)
var <- "x1"
DT[get(var)==1,]

How can I use a variable to subset an R data.table when the variable is also a column in the data.table?

This is mostly informed by the answers to "Subsetting data.table using variables with same name as column", but some of them don't work cleanly, which can be rather frustrating.

This isn't much better, but it can be used universally if more than one variable is to be referenced:

subset_metadata = function(sex){
.env <- environment()
meta = data.table(sex=c("male","female","female","male","male"),
group=c("1w","2w","1w","2w","2w"),
var=rnorm(5))
meta = meta[(sex == get("sex", envir = .env)),]
return(meta)
}
subset_metadata("male")
# sex group var
# <char> <char> <num>
# 1: male 1w -0.9687163
# 2: male 2w -0.2033112
# 3: male 2w -0.4960741

Not yet, but once 1.14.3 is released this can make use of the new env= argument (see https://rdatatable.gitlab.io/data.table/news/index.html#data-table-v1-14-3-in-development), where I suspect (without having verified it yet) that one could use:

meta[ sex == .S, env = list(.S = I(sex)) ]

env is evaluated outside the data.table scope, so sex there picks up the function argument as expected.
I() keeps "male" as a character scalar rather than turning it into a variable name; without it the query would become meta[sex == male]. In this simple use case, env acts as a way of renaming a variable on the fly.

R data.table struggling with conditional subsetting when column name is predefined elsewhere

I can imagine this was very frustrating for you. I applaud the number of things you tried before posting. Here's one approach:

DT[get(column_name) == 1,]
x y
1: 1 0
2: 1 1

If you need to use column_name in j, you can use get(..column_name):

DT[,get(..column_name)]
[1] 1 1 0 0

The .. prefix tells data.table to look up column_name in the calling scope rather than among the columns.

Another approach for using a string in either i or j is eval(as.name(column_name)):

DT[eval(as.name(column_name)) == 1]
x y
1: 1 0
2: 1 1

DT[,eval(as.name(column_name))]
[1] 1 1 0 0
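
For anyone who wants to run these, here is a small self-contained sketch. The contents of DT and the value of column_name are assumptions, picked so that the results line up with the output printed above; the original question's data is not shown here.

```r
library(data.table)

# Assumed data, chosen to be consistent with the printed output above
DT <- data.table(x = c(1, 1, 0, 0), y = c(0, 1, 0, 1))
column_name <- "x"  # the column name held in a string

# Row subset in i: look the column up by its name
rows_get  <- DT[get(column_name) == 1]
rows_eval <- DT[eval(as.name(column_name)) == 1]

# Column extraction in j: returns the column as a plain vector
col_eval <- DT[, eval(as.name(column_name))]

print(rows_get)
print(col_eval)
```

Both row subsets should give the same two rows, so which one to use is mostly a matter of taste.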

Subset in i by variable name in data.table

You can use get to "search by name for an object" (from ?get):

dt2[get(groups[1]) > 2 & get(groups[2]) == 4]
# ID A J
#1: 1 3 4

How to select data.table columns whose name is variable

Add with = FALSE to the call:

dt <- data.table(x = 1:10, y = 11:20, z = 1:10)
col <- "x"
dt[, c(col, "y"), with=FALSE]
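
For comparison, here is a short sketch of the same selection written three ways. The cols vector is a name introduced here for illustration; the .. prefix and .SDcols are standard data.table features, shown as alternatives rather than as a replacement for with = FALSE.

```r
library(data.table)

dt <- data.table(x = 1:10, y = 11:20, z = 1:10)
col <- "x"
cols <- c(col, "y")  # hypothetical helper vector holding the wanted names

# 1. Treat j as a character vector of column names
by_with <- dt[, c(col, "y"), with = FALSE]

# 2. The .. prefix looks cols up in the calling scope
by_dots <- dt[, ..cols]

# 3. .SD restricted to the named columns
by_sdcols <- dt[, .SD, .SDcols = cols]

print(names(by_with))
```

All three should return a data.table containing only the x and y columns.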

data.table subsetting by variable containing a column name

We can use get

dt[get(last(titles)) != 0]

