Remove an entire column from a data.frame in R
You can set it to NULL:
> Data$genome <- NULL
> head(Data)
chr region
1 chr1 CDS
2 chr1 exon
3 chr1 CDS
4 chr1 exon
5 chr1 CDS
6 chr1 exon
As pointed out in the comments, here are some other possibilities:
Data[2] <- NULL # Wojciech Sobala
Data[[2]] <- NULL # same as above
Data <- Data[,-2] # Ian Fellows
Data <- Data[-2] # same as above
You can remove multiple columns via:
Data[1:2] <- list(NULL) # Marek
Data[1:2] <- NULL # does not work!
Be careful with matrix-subsetting though, as you can end up with a vector:
Data <- Data[,-(2:3)] # vector
Data <- Data[,-(2:3),drop=FALSE] # still a data.frame
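A minimal sketch of the pitfall above. The three-column `Data` built here is made up for illustration (the original object is not shown in full), and the variable names are hypothetical:

```r
# Hypothetical example data; not the original Data object.
Data <- data.frame(chr = c("chr1", "chr1"),
                   genome = c("hg19", "hg19"),
                   region = c("CDS", "exon"))

# Dropping two of three columns with matrix-style indexing leaves one column,
# which R simplifies to a plain vector by default.
as_vector <- Data[, -(2:3)]
is.data.frame(as_vector)   # FALSE

# drop = FALSE keeps the data.frame structure.
as_df <- Data[, -(2:3), drop = FALSE]
is.data.frame(as_df)       # TRUE

# List-style indexing (no comma) does not simplify either.
also_df <- Data[-(2:3)]
is.data.frame(also_df)     # TRUE
```

If the result must stay a data.frame regardless of how many columns remain, the list-style form or an explicit drop = FALSE is probably the safer choice.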
Drop data frame columns by name
There's also the subset command, useful if you know which columns you want:
df <- data.frame(a = 1:10, b = 2:11, c = 3:12)
df <- subset(df, select = c(a, c))
Updated after a comment by @hadley: to drop columns a and c, you could do:
df <- subset(df, select = -c(a, c))
How do I delete columns in R data frame
We can use setdiff to get all the columns except 'year' and 'category'.
df1 <- df[setdiff(colnames(df), c('year', 'category'))]
df1
# vin make model
#1 1 A D
#2 2 B E
#3 3 C F
Including the suggestions from the comments by @Frank and @Ben Bolker.
We can assign the columns to NULL:
df$year <- NULL
df$category <- NULL
Or use subset from base R or select from dplyr:
subset(df, select=-c(year, category))
library(dplyr)
select(df, -year, -category)
data
df <- data.frame(vin=1:3, make=LETTERS[1:3], model=LETTERS[4:6],
year=1991:1993, category= paste0('GR', 1:3))
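One property of the setdiff approach that may be worth knowing: names that are listed for removal but absent from the data frame are ignored rather than raising an error. A small sketch using the same df, where "colour" is a hypothetical name that does not exist in it:

```r
df <- data.frame(vin = 1:3, make = LETTERS[1:3], model = LETTERS[4:6],
                 year = 1991:1993, category = paste0("GR", 1:3))

# "colour" is a hypothetical name that is not a column of df.
to_drop <- c("year", "category", "colour")

# setdiff() keeps every existing name that is not listed in to_drop;
# names in to_drop that are absent from df are simply ignored.
kept <- df[setdiff(colnames(df), to_drop)]
names(kept)
```

Whether silently ignoring a missing name is desirable depends on the use case; in a script it can hide a typo in a column name.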
How do you delete a column by name in data.table?
Any of the following will remove column foo from the data.table df3:
# Method 1 (and preferred as it takes 0.00s even on a 20GB data.table)
df3[,foo:=NULL]
df3[, c("foo","bar"):=NULL] # remove two columns
myVar = "foo"
df3[, (myVar):=NULL] # lookup myVar contents
# Method 2a -- A safe idiom for excluding (possibly multiple)
# columns matching a regex
df3[, grep("^foo$", colnames(df3)):=NULL]
# Method 2b -- An alternative to 2a, also "safe" in the sense described below
df3[, which(grepl("^foo$", colnames(df3))):=NULL]
data.table also supports the following syntax:
## Method 3 (could then assign the result back to df3)
df3[, !"foo"]
though if you actually wanted to remove column "foo" from df3 (as opposed to just printing a view of df3 minus column "foo") you'd really want to use Method 1 instead.
(Do note that if you use a method relying on grep() or grepl(), you need to set pattern="^foo$" rather than "foo" if you don't want columns with names like "fool" and "buffoon" (i.e. those containing foo as a substring) to also be matched and removed.)
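The difference between the anchored and unanchored patterns can be checked on a plain character vector, without data.table. The column names below are hypothetical, chosen only to show substring matches:

```r
# Hypothetical column names chosen to show substring matches.
col_names <- c("foo", "fool", "buffoon", "bar")

# Unanchored pattern: matches every name containing "foo".
loose <- grep("foo", col_names, value = TRUE)
loose   # "foo" "fool" "buffoon"

# Anchored pattern: matches only the exact name "foo".
exact <- grep("^foo$", col_names, value = TRUE)
exact   # "foo"
```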
Less safe options, fine for interactive use:
The following idioms will also work if df3 contains a column matching "foo", but will fail in a probably unexpected way if it does not. If, for instance, you use any of them to search for the non-existent column "bar", you'll end up with a zero-row data.table.
As a consequence, they are really best suited for interactive use where one might, e.g., want to display a data.table minus any columns with names containing the substring "foo". For programming purposes (or if you want to actually remove the column(s) from df3 rather than from a copy of it), Methods 1, 2a, and 2b are really the best options.
# Method 4:
df3[, .SD, .SDcols = !patterns("^foo$")]
Lastly, there are approaches using with=FALSE, though data.table is gradually moving away from this argument, so it is now discouraged where you can avoid it. They are shown here so you know the option exists in case you really do need it:
# Method 5a (like Method 3)
df3[, !"foo", with=FALSE]
# Method 5b (like Method 4)
df3[, !grep("^foo$", names(df3)), with=FALSE]
# Method 5c (another like Method 4)
df3[, !grepl("^foo$", names(df3)), with=FALSE]
How to drop columns by name in a data frame
You should use either indexing or the subset function. For example:
R> df <- data.frame(x=1:5, y=2:6, z=3:7, u=4:8)
R> df
x y z u
1 1 2 3 4
2 2 3 4 5
3 3 4 5 6
4 4 5 6 7
5 5 6 7 8
Then you can use the which function and the - operator in column indexing:
R> df[ , -which(names(df) %in% c("z","u"))]
x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
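One caveat with the -which() idiom: if none of the names match, which() returns an empty vector, and negating an empty index selects no columns at all. A minimal sketch, where "q" is a hypothetical name that is not a column of df and a logical mask is shown as one possible alternative:

```r
df <- data.frame(x = 1:5, y = 2:6, z = 3:7, u = 4:8)

# "q" is a hypothetical name that is not a column of df.
no_match <- which(names(df) %in% c("q"))   # integer(0)

# -integer(0) is still integer(0), so this selects zero columns.
emptied <- df[, -no_match, drop = FALSE]
ncol(emptied)   # 0

# A logical mask avoids the problem: nothing matches, so nothing is dropped.
intact <- df[, !(names(df) %in% c("q")), drop = FALSE]
ncol(intact)    # 4
```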
Or, much simpler, use the select argument of the subset function: you can then use the - operator directly on a vector of column names, and you can even omit the quotes around the names!
R> subset(df, select=-c(z,u))
x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
Note that you can also select the columns you want to keep instead of dropping the others:
R> df[ , c("x","y")]
x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
R> subset(df, select=c(x,y))
x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
How to delete a column in R dataframe
This: y$B <- NULL removes column B from dataframe y.
How do you remove columns from a data.frame?
I use data.table's := operator to delete columns instantly regardless of the size of the table.
DT[, coltodelete := NULL]
or
DT[, c("col1","col20") := NULL]
or
DT[, (125:135) := NULL]
or
DT[, (variableHoldingNamesOrNumbers) := NULL]
Any solution using <- or subset will copy the whole table. data.table's := operator merely modifies the internal vector of pointers to the columns, in place. That operation is therefore (almost) instant.
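A side effect of modifying in place is that every name bound to the same data.table sees the change. The sketch below assumes the data.table package is installed (the block is skipped otherwise); the table and variable names are hypothetical:

```r
# Assumes the data.table package is installed; skipped otherwise.
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)

  DT <- data.table(a = 1:3, b = 4:6)   # hypothetical table
  same_object <- DT                    # no copy: another name for the same table
  snapshot <- copy(DT)                 # an explicit, independent copy

  DT[, b := NULL]                      # removes b in place

  names(same_object)   # "a": the change is visible through the other name
  names(snapshot)      # "a" "b": the copy is unaffected
}
```

If the original table needs to be preserved, taking a copy() before deleting columns is one way to do it.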
Remove selected columns in R dataframe
Use the subset command:
dataframename <- subset(dataframename, select = -c(col1,col4) )
Another approach is to assign list(NULL) to the columns you want to remove:
dataframename[,c("col1","col4")] <- list(NULL)