Subset Based on Variable Column Name

This is precisely why subset is a bad tool for anything other than interactive use:

d <- data.frame(x = letters[1:5], y = runif(5))
d[d[, 'x'] == 'c', ]
#   x         y
# 3 c 0.3080524

Fundamentally, extracting things in R is built around [. Use it.
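
To make the point concrete, here is a minimal sketch of the same idea wrapped in a function, which is where subset() tends to cause trouble. The helper name filter_rows and the toy data are assumptions for illustration, not part of base R or of the original answer.

```r
# Hypothetical helper: keep the rows of `data` where the column whose
# name is stored in `col` equals `value`.
filter_rows <- function(data, col, value) {
  keep <- data[[col]] == value          # [[ accepts the name as a string
  data[which(keep), , drop = FALSE]     # which() discards NA comparisons
}

d <- data.frame(x = letters[1:5], y = 1:5)
res <- filter_rows(d, "x", "c")
print(res)
```

Because the column name travels as an ordinary string, it can come from a function argument, a loop, or user input without any special quoting.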

Subset a data.frame using a variable for the column name in the select expression

You are looking for get()

mycol = 'col1'
subset(df1, get(mycol) == 2)
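
A self-contained sketch of the get() approach, assuming a small made-up df1 (the original question's data is not shown here). The [[ variant is included only for comparison.

```r
# Toy data standing in for the asker's df1 (an assumption).
df1 <- data.frame(col1 = c(1, 2, 2, 3),
                  col2 = c("a", "b", "c", "d"))
mycol <- "col1"

# get() looks up the object named by the string while subset()
# evaluates the condition inside the data frame.
res_get <- subset(df1, get(mycol) == 2)

# Same rows, without relying on name lookup.
res_idx <- subset(df1, df1[[mycol]] == 2)

print(res_get)
```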

How do you subset a data frame based on column names?

Saving your dataframe to a variable df:

df <-
structure(
list(
Server = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "servera", class = "factor"),
Date = structure(
1:6,
.Label = c(
"7/13/2017 15:01",
"7/13/2017 15:02",
"7/13/2017 15:03",
"7/13/2017 15:04",
"7/13/2017 15:05",
"7/13/2017 15:06"
),
class = "factor"
),
Host_CPU = c(
1.812950134,
2.288070679,
1.563278198,
1.925239563,
5.350669861,
2.612503052
),
UsedMemPercent = c(38.19, 38.19, 38.19, 38.19, 38.19,
38.22),
jvm1 = c(10.91, 11.13, 11.34, 11.56, 11.77, 11.99),
jvm2 = c(11.47, 11.7, 11.91, 12.13, 12.35, 12.57),
jvm3 = c(75.65,
76.88, 56.93, 58.99, 65.29, 67.97),
jvm4 = c(39.43, 40.86,
42.27, 43.71, 45.09, 45.33),
jvm5 = c(27.42, 29.63, 31.02,
32.37, 33.72, 37.71)
),
.Names = c(
"Server",
"Date",
"Host_CPU",
"UsedMemPercent",
"jvm1",
"jvm2",
"jvm3",
"jvm4",
"jvm5"
),
class = "data.frame",
row.names = c(NA,-6L)
)

df[, select] should be what you're looking for, where select is a character vector holding the names of the columns you want to keep.
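
As a rough illustration, assuming select is a character vector of column names: the data frame below is a trimmed-down stand-in for the one above, and the grep() pattern is just one possible way to build the vector.

```r
# Trimmed stand-in for the data frame above (first two rows, four columns).
df_small <- data.frame(Server   = "servera",
                       Host_CPU = c(1.812950134, 2.288070679),
                       jvm1     = c(10.91, 11.13),
                       jvm2     = c(11.47, 11.70))

# Column names held in a variable.
select <- c("Host_CPU", "jvm1")
picked <- df_small[, select, drop = FALSE]

# The vector can also be computed, e.g. every column starting with "jvm".
jvm_cols <- grep("^jvm", names(df_small), value = TRUE)
jvms <- df_small[, jvm_cols, drop = FALSE]

print(names(picked))
print(names(jvms))
```

drop = FALSE keeps the result a data frame even when select happens to contain a single name.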

R data.table row subset with column name as a variable

I guess you are looking for get:

library(data.table)

DT <- data.table(x1=1:11, x2=11:21)
var <- "x1"
DT[get(var)==1,]
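
For comparison, a short sketch showing get() next to plain [[ indexing on the same table. It assumes the data.table package is installed; both forms should select the same row here.

```r
library(data.table)

DT <- data.table(x1 = 1:11, x2 = 11:21)
var <- "x1"

# Column looked up by name inside the data.table's scope.
res_get <- DT[get(var) == 1]

# Column extracted with [[ first, then used as a logical row index.
res_idx <- DT[DT[[var]] == 1]

print(res_get)
```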

Subsetting data based on dynamic column names

Use [[-subsetting:

DFb <- DF[DF[[Column_Name]] == "ABC",]

This is not as elegant as subset(), but it works. subset() uses "non-standard evaluation", which is very convenient for interactive use, but makes things more complicated when you want to do this kind of second-order reference.

The main thing is the [[. You could use subset(DF, DF[[Column_Name]] == "ABC") instead; the results will be almost equivalent, except that subset() automatically drops rows where the criterion evaluates to NA, whereas plain [ keeps them as rows filled with NA.
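
The NA difference is easy to miss, so here is a small sketch with made-up data (the column names grp and val are assumptions). Wrapping the condition in which() is one common way to make [ behave like subset() in this respect.

```r
DF <- data.frame(grp = c("ABC", NA, "XYZ", "ABC"), val = 1:4)
Column_Name <- "grp"

# Plain [ : the NA comparison produces an extra all-NA row.
with_bracket <- DF[DF[[Column_Name]] == "ABC", ]

# subset() silently drops rows where the condition is NA.
with_subset <- subset(DF, DF[[Column_Name]] == "ABC")

# which() keeps only the TRUE positions, so [ now matches subset().
with_which <- DF[which(DF[[Column_Name]] == "ABC"), ]

print(nrow(with_bracket))  # 3
print(nrow(with_subset))   # 2
print(nrow(with_which))    # 2
```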

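You can also do this with the dplyr package, which offers more flexibility in avoiding non-standard evaluation, but it's still a bit roundabout (there may be a better way to do this: I'm not very experienced with dplyr). Note that filter_() and lazyeval::interp() belong to dplyr's older standard-evaluation interface, which has been deprecated since dplyr 0.7.0.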

library("dplyr")    ## for filter_()
library("lazyeval") ## for interp()
colname <- "speed"
filter_(cars,interp(~ var == 4, var = as.name(colname)))
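
In current dplyr the same filter is usually written with the .data pronoun instead of the underscored verbs. A minimal sketch, assuming a reasonably recent dplyr is installed; cars is the built-in dataset used above.

```r
library(dplyr)

colname <- "speed"

# .data[[colname]] refers to the column whose name is stored in `colname`.
res <- filter(cars, .data[[colname]] == 4)

print(res)
```

This keeps the column name as an ordinary string and does not need lazyeval.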

R subset dataframe column with variable when column name is escaped

You can access the element with double brackets rather than with $. This works when the name is held in a string, even if it is not a syntactically valid R name:

df <- list(`01/19/17`=seq(1,10), `01/20/17`=seq(2,11))
name1 <- "01/19/17"

df[[name1]]
#  [1]  1  2  3  4  5  6  7  8  9 10
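
The example above uses a plain list; the same indexing carries over to a real data frame as long as the non-syntactic names are preserved. A sketch under that assumption, using check.names = FALSE so data.frame() does not rewrite the names:

```r
# check.names = FALSE keeps "01/19/17" as-is instead of mangling it
# into a syntactic name.
df_dates <- data.frame(`01/19/17` = 1:10,
                       `01/20/17` = 2:11,
                       check.names = FALSE)
name1 <- "01/19/17"

# Extract the column by the name stored in the variable.
col <- df_dates[[name1]]

# Use the same column to subset rows.
rows <- df_dates[df_dates[[name1]] > 8, ]

print(rows)
```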

Subsetting a data.table with a variable (when varname identical to colname)

If you don't mind doing it in 2 steps, you can just subset out of the scope of your data.table (though it's usually not what you want to do when working with data.table...):

wh_v1 <- my_data_table[, V1]==V1
my_data_table[wh_v1]
# V1 V2
#1: A 1
#2: A 4
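
A self-contained version of the two-step idea, with made-up data chosen to match the output shown (the table contents and the value of V1 are assumptions). It assumes the data.table package is installed.

```r
library(data.table)

# Hypothetical table and a variable that shares its name with a column.
my_data_table <- data.table(V1 = c("A", "B", "C", "A"), V2 = 1:4)
V1 <- "A"

# Step 1: build the logical index outside the [.data.table call, so that
# the V1 on the right-hand side is the variable, not the column.
wh_v1 <- my_data_table[["V1"]] == V1

# Step 2: use the precomputed index to pick the rows.
res <- my_data_table[wh_v1]

print(res)
```

Written inside the brackets, V1 == V1 would compare the column with itself, which is why the comparison is pulled out first.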

