Apply a Function to a Subset of Data.Table Columns, by Column-Indices Instead of Name

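The idiomatic approach is to use .SD and .SDcols.

You can force the LHS of := to be evaluated in the parent frame by wrapping it in (), so that a variable such as b (a vector of column names or indices) supplies the columns to assign: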

a[, (b) := lapply(.SD, as.numeric), .SDcols = b]

For columns 2:3

a[, 2:3 := lapply(.SD, as.numeric), .SDcols = 2:3]

or

mysubset <- 2:3
a[, (mysubset) := lapply(.SD, as.numeric), .SDcols = mysubset]
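A minimal self-contained sketch of the pattern above. The table `a`, its column names, and the index vector `b` are assumptions made up for illustration; the original does not show the data.

```r
library(data.table)

# Hypothetical data: columns 2 and 3 hold numbers stored as character
a <- data.table(id = c("x", "y"), v1 = c("1", "2"), v2 = c("3.5", "4.5"))

# Column positions to convert (assumed for illustration)
b <- 2:3

# (b) makes := treat b as a variable holding column positions,
# and .SDcols restricts .SD to the same columns
a[, (b) := lapply(.SD, as.numeric), .SDcols = b]

# Inspect the resulting column classes
sapply(a, class)
```

Only the columns listed in `b` are converted; `id` keeps its original type.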

data.table: transforming subset of columns with a function, row by row

If what you need is really to scale by row, you can try doing it in 2 steps:

# compute mean/sd:
mean_sd <- DT[, .(mean(unlist(.SD)), sd(unlist(.SD))), by=1:nrow(DT), .SDcols=grep("keyword",colnames(DT))]

# scale
DT[, grep("keyword",colnames(DT), value=TRUE) := lapply(.SD, function(x) (x-mean_sd$V1)/mean_sd$V2), .SDcols=grep("keyword",colnames(DT))]
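The two steps can be tried on a small table. This is only a sketch: the data, the column names, and the `cols` helper variable are hypothetical, and the matching column names are resolved once instead of calling grep() three times.

```r
library(data.table)

# Hypothetical data: two columns whose names contain "keyword"
DT <- data.table(id = 1:3,
                 keyword_a = c(1, 2, 3),
                 keyword_b = c(3, 6, 9))

# Resolve the matching column names once and reuse them
cols <- grep("keyword", names(DT), value = TRUE)

# Step 1: per-row mean and sd across the selected columns
mean_sd <- DT[, .(m = mean(unlist(.SD)), s = sd(unlist(.SD))),
              by = 1:nrow(DT), .SDcols = cols]

# Step 2: scale each selected column with the per-row statistics
DT[, (cols) := lapply(.SD, function(x) (x - mean_sd$m) / mean_sd$s),
   .SDcols = cols]

DT
```

Because mean_sd has one row per row of DT, in the same order, the vectors mean_sd$m and mean_sd$s line up element by element with each column being scaled.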

Select subset of columns in data.table R

Use with=FALSE. Prefixing the column vector with ! drops those columns; omit the ! to keep them instead:

cols = paste("V", c(1,2,3,5), sep="")

dt[, !cols, with=FALSE]

I suggest going through the "Introduction to data.table" vignette.


Update: From v1.10.2 onwards, you can also do:

dt[, ..cols]

See the first NEWS item under v1.10.2 for additional explanation.
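Note that the two forms shown above are not the same selection: !cols drops the listed columns, while ..cols keeps them. A small sketch with a hypothetical six-column table (the data is assumed, not taken from the question):

```r
library(data.table)

# Hypothetical table; as.data.table() names matrix columns V1..V6
dt <- as.data.table(matrix(1:12, nrow = 2, ncol = 6))
cols <- paste0("V", c(1, 2, 3, 5))

# Drop the listed columns (the with = FALSE form with a ! prefix)
dropped <- dt[, !cols, with = FALSE]

# Keep only the listed columns, using the .. prefix (v1.10.2+)
kept <- dt[, ..cols]

names(dropped)  # "V4" "V6"
names(kept)     # "V1" "V2" "V3" "V5"
```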

data.table in r : subset using column index

We can get the row index with .I and use it to subset DT:

DT[DT[, .I[.SD==2], .SDcols = 1]]
# A B C
#1: 2 3 4

data

DT <- data.table(A = 1:5, B = 2:6, C = 3:7)

Selecting a subset of columns in a data.table

Use a very similar syntax as for a data.frame, but add the argument with=FALSE:

dt[, setdiff(colnames(dt),"V9"), with=FALSE]
#    V1 V2 V3 V4 V5 V6 V7 V8 V10
# 1:  1  1  1  1  1  1  1  1   1
# 2:  0  0  0  0  0  0  0  0   0
# 3:  1  1  1  1  1  1  1  1   1
# 4:  0  0  0  0  0  0  0  0   0
# 5:  0  0  0  0  0  0  0  0   0
# 6:  1  1  1  1  1  1  1  1   1

The use of with=FALSE is nicely explained in the documentation for the j argument in ?data.table:

j: A single column name, single expression of column names, list() of expressions of column names, an expression or function call that evaluates to list (including data.frame and data.table which are lists, too), or (when with=FALSE) same as j in [.data.frame.


From v1.10.2 onwards it is also possible to do this as follows:

keep <- setdiff(names(dt), "V9")
dt[, ..keep]

Prefixing a symbol with .. looks it up in the calling scope (for example, the global environment when the call is made at top level), and its value is taken to be column names or numbers.
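To illustrate that the lookup happens in the calling scope rather than always in the global environment, here is a sketch using a hypothetical helper function, drop_col(), and made-up data:

```r
library(data.table)

# Hypothetical data for illustration
dt <- data.table(V1 = 1:2, V8 = 3:4, V9 = 5:6, V10 = 7:8)

# Hypothetical helper: `keep` is a local variable, and ..keep finds it
# in the function's own frame, which is the calling scope of `[`
drop_col <- function(x, unwanted) {
  keep <- setdiff(names(x), unwanted)
  x[, ..keep]
}

res <- drop_col(dt, "V9")
names(res)  # "V1" "V8" "V10"
```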

data.table assignment by reference using lapply and also returning the rest of the columns

Try

x[,  c("a", "b") := lapply(.SD, overwriteNA), .SDcols = c("a", "b")]

Edit:

Per the OP's additional request:

myCols <- c("a", "b")  
x[, (myCols) := lapply(.SD, overwriteNA), .SDcols = myCols]
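The OP's overwriteNA() is not shown, so the sketch below substitutes a hypothetical stand-in that replaces NA with 0, along with made-up data. It shows that := changes only the listed columns and leaves the remaining ones in place.

```r
library(data.table)

# Hypothetical stand-in for the OP's overwriteNA(): replace NA with 0
overwriteNA <- function(v) {
  v[is.na(v)] <- 0
  v
}

# Hypothetical data: a and b contain NA, c is an unrelated column
x <- data.table(a = c(1, NA), b = c(NA, 2), c = c("p", "q"))
myCols <- c("a", "b")

# := updates a and b by reference; column c is left untouched
x[, (myCols) := lapply(.SD, overwriteNA), .SDcols = myCols]

x
```

Since the assignment is by reference, x itself is modified and all columns, including c, remain in the table.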

Select multiple columns in data.table by their numeric indices

For versions of data.table >= 1.9.8, the following all just work:

library(data.table)
dt <- data.table(a = 1, b = 2, c = 3)

# select single column by index
dt[, 2]
# b
# 1: 2

# select multiple columns by index
dt[, 2:3]
# b c
# 1: 2 3

# select single column by name
dt[, "a"]
# a
# 1: 1

# select multiple columns by name
dt[, c("a", "b")]
# a b
# 1: 1 2

For versions of data.table < 1.9.8, numerical column selection required the use of with = FALSE. See also NEWS on v1.9.8, POTENTIALLY BREAKING CHANGES, point 3.

Extract columns from data table by numeric indices stored in a vector

We can use double dots (..) before the object 'a', the vector holding the numeric column indices (here selecting col4 to col6), to extract the columns:

dt[, ..a]
# col4 col5 col6
#1: 4 5 6
#2: 5 6 7
#3: 6 7 8
#4: 7 8 9

Or another option is with = FALSE

dt[, a, with = FALSE]

data

dt <- data.table(col1 = 1:4, col2 = 2:5, col3 = 3:6, col4 = 4:7, col5 = 5:8, col6 = 6:9)

Selecting columns of a data.table using a vector of column names or column positions without using with = F

An option is to use double dots

DT[, ..mycols]
# A C
#1: 0.1188208 -0.17328827
#2: -0.5622505 0.84231231
#3: 0.8111072 -1.59802306
#4: 0.7968823 2.08468489
# ...

Or specify it in .SDcols

DT[, .SD, .SDcols = mycols]

or else with = FALSE as the OP mentioned in the post
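A quick sketch comparing the three options on made-up data (the table and the contents of mycols are assumptions, since the question's data is not reproduced here):

```r
library(data.table)

# Hypothetical data and column vector for illustration
DT <- data.table(A = 1:3, B = 4:6, C = 7:9)
mycols <- c("A", "C")

r1 <- DT[, ..mycols]                 # double-dot prefix
r2 <- DT[, .SD, .SDcols = mycols]    # .SD restricted by .SDcols
r3 <- DT[, mycols, with = FALSE]     # with = FALSE

# All three return the same two-column table
all.equal(r1, r2)
all.equal(r1, r3)
```

The same three calls also accept column positions, for example mycols <- c(1, 3).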


