How to select columns in data.table using a character vector of certain column names?
We can use the .. notation to look up myVector in the calling scope and treat it as a vector of column names or positions, much as it would work in a data.frame:
mtcarsDT[, ..myVector]
According to ?data.table:
In case of overlapping variable names inside the dataset and in the parent scope, you can use the double-dot prefix ..cols to explicitly refer to the 'cols' variable in the parent scope and not the one from your dataset.
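To make the name clash concrete, here is a minimal sketch. The table and its column called cols are hypothetical, chosen only so that a column and a variable share one name.

```r
library(data.table)

# Hypothetical table that happens to contain a column called "cols".
DT <- data.table(cols = 1:3, a = 4:6, b = 7:9)

# A variable with the same name in the calling scope.
cols <- c("a", "b")

# Without the prefix, cols resolves to the column inside DT.
from_column <- DT[, cols]

# With the prefix, ..cols resolves to the character vector defined above.
from_variable <- DT[, ..cols]
```

In this sketch from_column is the contents of the column named cols, while from_variable is a data.table holding columns a and b.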
Selecting columns of a data.table using a vector of column names or column positions without using with = F
An option is to use double dots
DT[, ..mycols]
# A C
#1: 0.1188208 -0.17328827
#2: -0.5622505 0.84231231
#3: 0.8111072 -1.59802306
#4: 0.7968823 2.08468489
# ...
Or specify it in .SDcols
DT[, .SD, .SDcols = mycols]
or else use with = FALSE, as the OP mentioned in the post.
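The three options above can be compared side by side. This is a sketch on made-up data (the seed and the column B are assumptions for illustration), and it only checks that the approaches select the same columns.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only to make the sketch reproducible
DT <- data.table(A = rnorm(4), B = rnorm(4), C = rnorm(4))
mycols <- c("A", "C")

by_prefix <- DT[, ..mycols]                  # double-dot prefix
by_sdcols <- DT[, .SD, .SDcols = mycols]     # .SD restricted by .SDcols
by_with   <- DT[, mycols, with = FALSE]      # data.frame-style j

# all.equal() compares the contents of the resulting data.tables.
same_result <- isTRUE(all.equal(by_prefix, by_sdcols)) &&
  isTRUE(all.equal(by_prefix, by_with))
```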
How to select data.table columns whose name is variable
Add with = FALSE as an extra argument:
dt <- data.table(x = 1:10, y = 11:20, z = 1:10)
col <- "x"
dt[, c(col, "y"), with=FALSE]
Select subset of columns in data.table R
Use with=FALSE (the ! in the example below drops the listed columns and keeps the rest):
cols = paste("V", c(1,2,3,5), sep="")
dt[, !cols, with=FALSE]
I suggest going through the "Introduction to data.table" vignette.
Update: From v1.10.2 onwards, you can also do:
dt[, ..cols]
See the first NEWS item under v1.10.2 for additional explanation.
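Because the example above mixes keeping and dropping columns, a small sketch may help separate the two. The six-column table is hypothetical; setdiff() is used here as one way to turn an exclusion into an explicit keep list.

```r
library(data.table)

# Hypothetical table with columns V1..V6.
dt <- data.table(V1 = 1:2, V2 = 1:2, V3 = 1:2, V4 = 1:2, V5 = 1:2, V6 = 1:2)
cols <- paste0("V", c(1, 2, 3, 5))

kept    <- dt[, cols, with = FALSE]    # only the columns named in cols
dropped <- dt[, !cols, with = FALSE]   # every column except those in cols

# The same exclusion expressed as an explicit keep list with the .. prefix.
keep <- setdiff(names(dt), cols)
dropped_again <- dt[, ..keep]
```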
Accessing columns in data.table using a character vector of column names
You can use the data.table syntax .. which "looks up one level" (as in the Unix terminal) for the variable:
> all.equal(DT[,list(x,y)], DT[, ..cols])
[1] TRUE
> all.equal(DT[,.SD[,list(x,y)][min(v)]], DT[,.SD[ ,min(v)], .SDcols = cols])
[1] TRUE
More details under FAQ 1.6 I believe: http://datatable.r-forge.r-project.org/datatable-faq.pdf
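When the goal is to compute on the named columns rather than just return them, .SDcols is often the more natural fit. The following is a sketch on made-up data, not the table used in the answer above.

```r
library(data.table)

# Hypothetical data: x and y are the columns of interest, v is left out.
DT <- data.table(x = 1:4, y = 5:8, v = 9:12)
cols <- c("x", "y")

# .SD contains only the columns listed in .SDcols, so lapply() runs over x and y.
col_means <- DT[, lapply(.SD, mean), .SDcols = cols]
```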
Selecting a subset of columns in a data.table
Use a very similar syntax as for a data.frame, but add the argument with=FALSE:
dt[, setdiff(colnames(dt),"V9"), with=FALSE]
V1 V2 V3 V4 V5 V6 V7 V8 V10
1: 1 1 1 1 1 1 1 1 1
2: 0 0 0 0 0 0 0 0 0
3: 1 1 1 1 1 1 1 1 1
4: 0 0 0 0 0 0 0 0 0
5: 0 0 0 0 0 0 0 0 0
6: 1 1 1 1 1 1 1 1 1
The use of with=FALSE is nicely explained in the documentation for the j argument in ?data.table:
j: A single column name, single expression of column names, list() of expressions of column names, an expression or function call that evaluates to list (including data.frame and data.table which are lists, too), or (when with=FALSE) same as j in [.data.frame.
From v1.10.2 onwards it is also possible to do this as follows:
keep <- setdiff(names(dt), "V9")
dt[, ..keep]
Prefixing a symbol with .. causes it to be looked up in the calling scope (the global environment when the call is made at top level), and its value is taken to be column names or numbers (source).
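The calling scope is not always the global environment. As a sketch, the hypothetical helper select_cols() below shows the prefix picking up a function argument even though a different variable of the same name exists globally.

```r
library(data.table)

dt <- data.table(V1 = 1:3, V2 = 4:6, V3 = 7:9)

# A global variable that the helper below should NOT end up using.
keep <- "V2"

# Hypothetical helper: inside the function, ..keep refers to the argument keep,
# because the function body is the calling scope of dt[...].
select_cols <- function(dt, keep) {
  dt[, ..keep]
}

picked <- select_cols(dt, c("V1", "V3"))
```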
How to select columns programmatically in a data.table?
This is covered in FAQ 1.1, 1.2 and 2.17.
Some possibilities:
DT[, keep, with = FALSE]
DT[, c('V1', 'V3'), with = FALSE]
DT[, c(1, 3), with = FALSE]
DT[, list(V1, V3)]
The reason DF[c('V1','V3')] works as it does for a data.frame is covered in ?`[.data.frame`:
Data frames can be indexed in several modes. When [ and [[ are used with a single vector index (x[i] or x[[i]]), they index the data frame as if it were a list. In this usage a drop argument is ignored, with a warning.
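The list-style indexing described in that help text can be seen with a small base R sketch; the data frame here is made up for illustration.

```r
# Hypothetical data frame with three columns.
DF <- data.frame(V1 = 1:3, V2 = 4:6, V3 = 7:9)

# Single vector index: the data frame is treated as a list of columns,
# so the result is always a data frame.
list_style   <- DF[c("V1", "V3")]
single_style <- DF["V1"]

# Matrix-style index with a single column: the result drops to a plain vector.
matrix_style <- DF[, "V1"]
```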
From data.table 1.10.2, you may use the .. prefix when subsetting columns programmatically:
When j is a symbol prefixed with .. it will be looked up in calling scope and its value taken to be column names or numbers [...] It is experimental.
Thus:
DT[ , ..keep]
# V1 V3
# 1: 1 7
# 2: 2 8
# 3: 3 9
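Since the quoted note mentions "column names or numbers", the same prefix can be sketched with positions. The table below is an assumption reconstructed to match the output shown above; V2 is invented.

```r
library(data.table)

# Hypothetical table consistent with the printed output above.
DT <- data.table(V1 = 1:3, V2 = 4:6, V3 = 7:9)

keep_names     <- c("V1", "V3")
keep_positions <- c(1, 3)

by_name     <- DT[, ..keep_names]       # lookup by column name
by_position <- DT[, ..keep_positions]   # lookup by column position
```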
Specific column selection from data.table in R
You can try it like this:
dt[, .SD, .SDcols = colnames]
Recent versions of data.table also offer an alternative:
dt[, ..colnames]
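A variable called colnames shares its name with the base function colnames(), which can be confusing to read. The sketch below uses a hypothetical name, wanted, and builds the vector with grep() on made-up data; both forms from the answer are then applied to it.

```r
library(data.table)

# Hypothetical table: only the columns starting with "V" are wanted.
dt <- data.table(id = 1:3, V1 = 4:6, V2 = 7:9, other = letters[1:3])

# Build the character vector of column names programmatically.
wanted <- grep("^V", names(dt), value = TRUE)

via_sdcols <- dt[, .SD, .SDcols = wanted]
via_prefix <- dt[, ..wanted]
```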
dplyr r : selecting columns whose names are in an external vector
We could use any_of with select:
library(dplyr)
data %>%
select(any_of(col_names))
-output
a b
1 1 e
2 4 e
3 13 f
4 8 m
5 10 z
6 3 y
...
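One reason to reach for any_of is that it tolerates names that are not present in the data, whereas all_of raises an error for them. The sketch below uses made-up data and a deliberately missing name, not_a_column, to show the difference.

```r
library(dplyr)

# Hypothetical data; the third column exists only to show it is not selected.
data <- data.frame(a = c(1, 4, 13), b = c("e", "e", "f"), c = c(TRUE, FALSE, TRUE))

# External vector of names, one of which does not exist in data.
col_names <- c("a", "b", "not_a_column")

# any_of() silently skips the missing name.
lenient <- data %>% select(any_of(col_names))

# all_of() is strict: the missing name triggers an error, captured here as TRUE.
strict_failed <- tryCatch(
  {
    data %>% select(all_of(col_names))
    FALSE
  },
  error = function(e) TRUE
)
```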