Extract a column from a data.table as a vector, by position
A data.table inherits from class data.frame. Therefore it is a list (of column vectors) internally and can be treated as such.
is.list(DT)
#[1] TRUE
Fortunately, list subsetting, i.e. [[, is very fast and, in contrast to [, package data.table doesn't define a method for it. Thus, you can simply use [[ to extract by an index:
DT[[2]]
#[1] 3 4
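The answer does not show how DT was built, so here is a minimal sketch with a hypothetical two-column table chosen to match the output above. It only illustrates that [[ works by position or by name and returns a plain vector; the column names a and b are assumptions.

```r
library(data.table)

# Hypothetical table; the original DT is not shown in the answer.
DT <- data.table(a = 1:2, b = 3:4)

by_position <- DT[[2]]     # second column, extracted as a plain integer vector
by_name     <- DT[["b"]]   # the same column, selected by name

is.vector(by_position)           # the result carries no data.table attributes
identical(by_position, by_name)  # both routes give the same vector
```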
Extract a column by reference from a data.table as a vector
With the column name or position stored in a variable (here col), we can use [[ to extract the column as a vector:
is.vector(DT[[col]])
#[1] TRUE
Extract columns from data table by numeric indices stored in a vector
We can use the double-dot prefix (..) before the object 'a' to extract the columns:
dt[, ..a]
# col4 col5 col6
#1: 4 5 6
#2: 5 6 7
#3: 6 7 8
#4: 7 8 9
Another option is with = FALSE:
dt[, a, with = FALSE]
Data:
dt <- data.table(col1 = 1:4, col2 = 2:5, col3 = 3:6, col4 = 4:7, col5 = 5:8, col6 = 6:9)
Convert data.frame column to a vector?
I'm going to attempt to explain this without making any mistakes, but I'm betting this will attract a clarification or two in the comments.
A data frame is a list. When you subset a data frame using the name of a column and [, what you're getting is a sublist (or a sub data frame). If you want the actual atomic column, you could use [[, or somewhat confusingly (to me) you could do aframe[,2], which returns a vector, not a sublist.
So try running this sequence and maybe things will be clearer:
avector <- as.vector(aframe['a2'])
class(avector)
avector <- aframe[['a2']]
class(avector)
avector <- aframe[,2]
class(avector)
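The sequence above needs an aframe to run against, which the answer does not define. A minimal base R sketch, assuming a hypothetical two-column data frame with columns a1 and a2, shows the three result types side by side:

```r
# Hypothetical data frame; column names a1 and a2 are assumptions.
aframe <- data.frame(a1 = c(1.5, 2.5, 3.5), a2 = c(10L, 20L, 30L))

sub_frame <- aframe["a2"]     # single bracket with a name: a one-column data frame
by_name   <- aframe[["a2"]]   # double bracket: the atomic column itself
by_matrix <- aframe[, 2]      # matrix-style indexing: dropped to a vector by default

class(sub_frame)   # "data.frame"
class(by_name)     # "integer"
class(by_matrix)   # "integer"
```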
Select column of data.table and return vector
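With data.frame, a single-column selection such as iris[, "Species"] uses the default drop = TRUE and returns a vector. In data.table it is the opposite: the drop argument is ignored and the result stays a data.table. According to ?data.table: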
drop - Never used by data.table. Do not use. It needs to be here because data.table inherits from data.frame.
In order to get the same behavior, we can use [[ to extract the column by passing a string:
identical(dat[["Species"]], iris[, "Species"])
#[1] TRUE
Or
dat$Species
Using [[ or $ extracts the column as a vector while also bypassing the data.table overhead.
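The answer does not show how dat was created; the sketch below assumes it is simply iris converted with as.data.table, and contrasts the single-bracket result with the [[ result.

```r
library(data.table)

# Assumption: dat is the built-in iris data set converted to a data.table.
dat <- as.data.table(iris)

one_col_table <- dat[, "Species"]    # single bracket: still a data.table
as_vector     <- dat[["Species"]]    # double bracket: the factor column itself

is.data.table(one_col_table)             # the selection was not dropped to a vector
identical(as_vector, iris[, "Species"])  # matches the data.frame default behaviour
```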
How to select columns in data.table using a character vector of certain column names?
We can use the .. prefix so that myVector is looked up in the calling scope and used as a vector of column names or positions, as it would work in a data.frame:
mtcarsDT[, ..myVector]
According to ?data.table:
In case of overlapping variables names inside dataset and in parent scope you can use double dot prefix ..cols to explicitly refer to 'cols' variable parent scope and not from your dataset.
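To make the name-clash case concrete, here is a small sketch with a hypothetical table that happens to contain a column called cols, alongside a variable of the same name in the calling scope. The table and its column names are assumptions for illustration only.

```r
library(data.table)

# Hypothetical table with a column that is literally named 'cols'.
DT <- data.table(cols = 1:3, x = 4:6, y = 7:9)

# A variable with the same name in the calling scope.
cols <- c("x", "y")

column_itself <- DT[, cols]     # bare name: resolves to the column 'cols'
selected      <- DT[, ..cols]   # double dot: uses the calling-scope variable

names(selected)
```

Without the prefix, the column inside the table takes precedence; with it, the selection follows the outer variable.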
Selecting columns of a data.table using a vector of column names or column positions without using with = F
An option is to use double dots
DT[, ..mycols]
# A C
#1: 0.1188208 -0.17328827
#2: -0.5622505 0.84231231
#3: 0.8111072 -1.59802306
#4: 0.7968823 2.08468489
# ...
Or specify it in .SDcols
DT[, .SD, .SDcols = mycols]
or else use with = FALSE, as the OP mentioned in the post.
Split a data.table at position
You can use findInterval/cut to create groups based on pos:
library(data.table)
x[, mean(a), findInterval(a, pos)]
# findInterval V1
#1: 0 1.5
#2: 1 3.5
#3: 2 6.5
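Neither x nor pos is defined in the answer. The sketch below assumes inputs that are consistent with the output shown (a = 1:8 and break points at 3 and 5) and names the result columns explicitly; treat these values as a guess at the original data, not a reconstruction of it.

```r
library(data.table)

# Assumed inputs, chosen so the group means match the output above.
x   <- data.table(a = 1:8)
pos <- c(3, 5)

# findInterval() labels each value with the number of break points at or below it.
groups <- findInterval(x$a, pos)

# Use that label as the grouping key and average within each group.
res <- x[, .(mean_a = mean(a)), by = .(grp = findInterval(a, pos))]
res
```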