Extract Columns from Data Table by Numeric Indices Stored in a Vector

Prefix the object 'a' with double dots (..) so that data.table looks it up in the calling scope and uses its values as column positions:

dt[, ..a]
# col4 col5 col6
#1: 4 5 6
#2: 5 6 7
#3: 6 7 8
#4: 7 8 9

Another option is with = FALSE:

dt[, a, with = FALSE]

data

dt <- data.table(col1 = 1:4, col2 = 2:5, col3 = 3:6, col4 = 4:7, col5 = 5:8, col6 = 6:9)
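
The answer does not show how 'a' was defined. The sketch below assumes a <- 4:6, which is consistent with the col4, col5, col6 output shown above, and checks that the two forms select the same columns.

```r
library(data.table)

dt <- data.table(col1 = 1:4, col2 = 2:5, col3 = 3:6,
                 col4 = 4:7, col5 = 5:8, col6 = 6:9)

# Assumption: 'a' holds numeric column positions (not shown in the original)
a <- 4:6

res_dots <- dt[, ..a]              # '..' looks up 'a' in the calling scope
res_with <- dt[, a, with = FALSE]  # with = FALSE treats 'a' as column positions

names(res_dots)
#[1] "col4" "col5" "col6"

# Both approaches should return the same data.table
isTRUE(all.equal(res_dots, res_with))
```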

Select multiple columns in data.table by their numeric indices

For versions of data.table >= 1.9.8, the following all just work:

library(data.table)
dt <- data.table(a = 1, b = 2, c = 3)

# select single column by index
dt[, 2]
# b
# 1: 2

# select multiple columns by index
dt[, 2:3]
# b c
# 1: 2 3

# select single column by name
dt[, "a"]
# a
# 1: 1

# select multiple columns by name
dt[, c("a", "b")]
# a b
# 1: 1 2

In versions of data.table < 1.9.8, selecting columns by number required with = FALSE. See also the NEWS entry for v1.9.8, POTENTIALLY BREAKING CHANGES, point 3.

Extract a column from a data.table as a vector, by position

A data.table inherits from class data.frame. Therefore it is a list (of column vectors) internally and can be treated as such.

is.list(DT)
#[1] TRUE

Fortunately, list subsetting, i.e. [[, is very fast and, in contrast to [, package data.table doesn't define a method for it. Thus, you can simply use [[ to extract by an index:

DT[[2]]
#[1] 3 4
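
DT is not defined in the answer. As a minimal sketch, assume a two-column table whose second column holds 3 and 4 (matching the output above); the column names x and y and the variable pos are illustrative only.

```r
library(data.table)

# Assumption: a small table consistent with the output shown above
DT <- data.table(x = 1:2, y = 3:4)

pos <- 2          # column position, possibly computed elsewhere
v <- DT[[pos]]    # list-style extraction returns the bare column vector
v
#[1] 3 4

# By contrast, DT[, 2] keeps the data.table wrapper (data.table >= 1.9.8)
is.data.table(DT[, 2])
```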

R Extract values from data table given indices stored in another table

The row and column indices need to be of the same length. Here we want the value of the cell at row 1, column 1. The row index is correct, but the column index is a data.frame with one column (ndx[1], based on the structure shown in the OP's post). We need to extract the 'V1' column and take its first element as the column index:

predict_all[1, ndx$V1[1]]
#[1] 0.01

NOTE: This assumes predict_all is a data.frame.

If it is a data.table, then use with = FALSE

predict_all[1, ndx$V1[1], with = FALSE]
# V1
#1: 0.01

data

ndx <- structure(list(V1 = c(1L, 4L, 5L, 6L)), .Names = "V1", 
class = "data.frame", row.names = c(NA, -4L))

predict_all <- structure(list(V1 = c(0.01, 0.2), V2 = c(0, 0.01),
V3 = c(0.2, 0.1), V4 = c(0.4, 0.3), V5 = c(0.1, 0.6),
V6 = c(0, 0.3)), .Names = c("V1",
"V2", "V3", "V4", "V5", "V6"), class = "data.frame",
row.names = c(NA, -2L))

How to select columns in data.table using a character vector of certain column names?

We can use the .. prefix so that myVector is looked up in the calling scope and used as a vector of column names (or positions), just as it would work in a data.frame:

mtcarsDT[, ..myVector]

According to ?data.table

In case of overlapping variables names inside dataset and in parent scope you can use double dot prefix ..cols to explicitly refer to 'cols variable parent scope and not from your dataset.
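
To illustrate the quoted passage, here is a small sketch. The contents of myVector and the toy table DT (with a column that happens to be called cols) are assumptions made for the example, not part of the original question.

```r
library(data.table)

mtcarsDT <- as.data.table(mtcars)

# Assumption: myVector holds the column names to keep
myVector <- c("mpg", "cyl")
sel <- mtcarsDT[, ..myVector]
names(sel)
#[1] "mpg" "cyl"

# Overlapping names: the table has a column 'cols' AND there is a variable 'cols'
DT <- data.table(cols = 1:3, x = 4:6)
cols <- "x"

DT[, cols]    # refers to the column 'cols' inside the table
DT[, ..cols]  # refers to the variable 'cols' in the calling scope, selects 'x'
```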

Using data.table in R, can I select a vector of variable columns from a table?

A vectorized way of subsetting the elements is by providing the row index and column index as a matrix

as.data.frame(A)[cbind(seq_len(nrow(A)), 1:3)]
#[1] 1 5 9

Or convert the .SD to matrix or data.frame and use the row/column index

A[, as.matrix(.SD)[cbind(1:3, 1:3)]]

Or in data.table, pass the i, j index in a loop and extract the elements

A[, unlist(Map(function(i, j) .SD[i, j, with = FALSE], 1:3, 1:3), 
use.names = FALSE)]
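
A is not defined in the answer. The sketch below assumes a 3 x 3 table whose diagonal is 1, 5, 9 (consistent with the output above) and shows the matrix-index approach end to end; the column names V1 to V3 are illustrative.

```r
library(data.table)

# Assumption: a 3 x 3 table whose diagonal matches the output shown above
A <- data.table(V1 = 1:3, V2 = 4:6, V3 = 7:9)

# Two-column index matrix: one (row, column) pair per element to extract
idx <- cbind(seq_len(nrow(A)), 1:3)

via_df  <- as.data.frame(A)[idx]   # data.frame accepts a matrix index
via_mat <- A[, as.matrix(.SD)[idx]]  # same idea inside data.table's j

via_df
#[1] 1 5 9
```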

Select columns using a variable in R

In data.table, we can either prefix idx with .. or pass with = FALSE to select the columns:

library(data.table)
df[, ..idx]
df[, idx, with = FALSE]

Get column index from data frame that matches numeric vector?

Here's a base R approach, which compares every column in dat with testVec to see if they are identical. Use which to output the column index if they're identical.

which(sapply(seq_len(ncol(dat)), function(x) identical(dat[, x], testVec)))
#[1] 3

UPDATE
@nicola suggested a more concise version of the original code in a comment under this answer:

which(sapply(dat, identical, y = testVec))
# z
# 3
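
Neither dat nor testVec is shown in the answer. As a minimal base R sketch, assume a three-column data frame whose third column z equals testVec, which matches the output above.

```r
# Assumption: example data chosen so that column 'z' (position 3) matches
dat <- data.frame(x = 1:3, y = 4:6, z = 7:9)
testVec <- 7:9

# identical() is applied to each column, with testVec passed as 'y'
hit <- which(sapply(dat, identical, y = testVec))
hit
# z
# 3

# identical() is strict about type: a double vector does not match integers
identical(c(7, 8, 9), 7:9)
#[1] FALSE
```

If the types may differ (integer versus double), a comparison such as isTRUE(all.equal(col, testVec)) could be swapped in for identical.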

Select columns by class (e.g. numeric) from a data.table

When the column numbers are stored in a variable, data.table needs with = FALSE (or the .. prefix) to select columns by them.

tokeep <- which(sapply(x,is.numeric))
x[ , tokeep, with=FALSE]
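
The table x is not defined in the answer. The following sketch assumes a small table with two numeric columns and one character column; the column names are illustrative.

```r
library(data.table)

# Assumption: a toy table mixing numeric and character columns
x <- data.table(id = 1:3, name = c("a", "b", "c"), score = c(1.5, 2.5, 3.5))

# Positions of the numeric columns (a named integer vector)
tokeep <- which(sapply(x, is.numeric))

num_with <- x[, tokeep, with = FALSE]  # as in the answer above
num_dots <- x[, ..tokeep]              # equivalent, using the '..' prefix

names(num_with)
#[1] "id"    "score"
```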
