How to Remove Multiple Columns in R Dataframe

How to remove multiple columns from a dataframe with column list

select_ is deprecated as of dplyr 0.7. See the select_ documentation for more information.

I believe the recommended approach now is to use the select helper functions.

Using shadow's example, it would be:
select(dataframe, -one_of(c("a", "b")))
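
In dplyr 1.0.0 and later, one_of() is itself superseded by any_of() and all_of(). A minimal sketch of the same idea with those helpers, assuming a recent dplyr is installed; the data frame and the name "not_a_column" are hypothetical and only there for illustration:

```r
library(dplyr)

# Hypothetical data frame for illustration
dataframe <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# any_of() ignores names that are not present in the data frame
kept_any <- select(dataframe, -any_of(c("a", "b", "not_a_column")))

# all_of() is strict: every listed name must exist, otherwise it errors
kept_all <- select(dataframe, -all_of(c("a", "b")))

names(kept_any)
names(kept_all)
```

any_of() is probably the closer match to one_of() when some of the listed columns may be missing.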

Deleting multiple columns in R

With dplyr you could do it like this:

library(dplyr)
df <- select(df, -(M50:M100))

This removes every column from "M50" through "M100", inclusive, as they are positioned in the data frame.

A different option, which does not depend on the order of the columns, is to use

df <- select(df, -num_range("M", 50:100))
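
The same order-independent idea can be sketched in base R by building the column names with paste0() and keeping everything else. The data frame df_m below is hypothetical, with its columns deliberately out of order:

```r
# Hypothetical data frame whose columns are not in numeric order
df_m <- data.frame(M3 = 1:2, M1 = 3:4, M5 = 5:6, M2 = 7:8, M4 = 9:10)

# Build the names to drop (M2, M3, M4), then keep all other columns
drop_names <- paste0("M", 2:4)
kept <- df_m[, setdiff(names(df_m), drop_names), drop = FALSE]

names(kept)
```

Because the columns are matched by name, their position in the data frame does not matter.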

How to remove multiple columns every nth column in R?

As these are column indices, use - to remove those columns:

i1 <- rep(seq(3, ncol(df), 4), each = 2) + 0:1
df[, -i1]

Another option is to use a logical index, which R recycles across all the columns:

df[!c(FALSE, FALSE, TRUE, TRUE)]

data

set.seed(24)
df <- as.data.frame(matrix(rnorm(12 * 4), 4, 12))
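
To see which columns the two approaches drop, here is a small sketch that runs both on the 12-column example data above and compares the remaining column names (by_index and by_logical are names chosen only for this illustration):

```r
set.seed(24)
df <- as.data.frame(matrix(rnorm(12 * 4), 4, 12))

# Positions to drop: columns 3-4, 7-8 and 11-12
i1 <- rep(seq(3, ncol(df), 4), each = 2) + 0:1
i1

# Drop by negative index
by_index <- df[, -i1]

# Drop by a length-4 logical pattern that is recycled over the 12 columns
by_logical <- df[!c(FALSE, FALSE, TRUE, TRUE)]

names(by_index)
names(by_logical)
```

Both calls should leave V1, V2, V5, V6, V9 and V10. The logical version relies on the number of columns being a multiple of the pattern length.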

R dplyr: Drop multiple columns

Check the help on select_vars. That gives you some extra ideas on how to work with this.

In your case:

iris %>% select(-one_of(drop.cols))

Remove multiple columns and replace values of columns of dataframe based on condition in R

Here's a similar approach (perhaps more vectorized?)

is.na(df[-1]) <- df[-1] < 1 # Convert all values < 1 to NAs.
df[colSums(is.na(df)) != nrow(df)] # Select only the columns that have values.
#         Date  A  C
# 1 01/01/2000 NA NA
# 2 02/01/2000 NA NA
# 3 03/01/2000 NA NA
# 4 04/01/2000 NA NA
# 5 05/01/2000  5 NA
# 6 06/01/2000  6  1
# 7 07/01/2000  7  1
# 8 08/01/2000  8 NA
# 9 09/01/2000  9 NA

Alternatively, the second step could be:

df[c(TRUE, colSums(df[-1], na.rm = TRUE) > 0)]
## OR
## df[c(TRUE, sapply(df[-1], sum, na.rm = TRUE) > 0)] # as already suggested
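
The question's data is not reproduced here, so the following is only a sketch on a hypothetical data frame (df_na, with an all-zero column B) that runs both variants end to end:

```r
# Hypothetical data: a Date column plus numeric columns; B contains only zeros
df_na <- data.frame(
  Date = sprintf("%02d/01/2000", 1:9),
  A = c(0, 0, 0, 0, 5, 6, 7, 8, 9),
  B = rep(0, 9),
  C = c(0, 0, 0, 0, 0, 1, 1, 0, 0)
)

# Step 1: turn every value below 1 into NA (the Date column is skipped)
is.na(df_na[-1]) <- df_na[-1] < 1

# Step 2, variant 1: keep columns that are not entirely NA
by_na_count <- df_na[colSums(is.na(df_na)) != nrow(df_na)]

# Step 2, variant 2: always keep Date, then keep columns with a positive sum
by_col_sum <- df_na[c(TRUE, colSums(df_na[-1], na.rm = TRUE) > 0)]

names(by_na_count)
names(by_col_sum)
```

The second variant assumes the remaining values are positive, which holds here because everything below 1 has already been replaced by NA.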

Drop data frame columns by name

There's also the subset command, useful if you know which columns you want:

df <- data.frame(a = 1:10, b = 2:11, c = 3:12)
df <- subset(df, select = c(a, c))

UPDATED after a comment by @hadley: to drop columns a and c, you could do:

df <- subset(df, select = -c(a, c))

How to drop columns by name in a data frame

You should use either indexing or the subset function. For example:

R> df <- data.frame(x=1:5, y=2:6, z=3:7, u=4:8)
R> df
  x y z u
1 1 2 3 4
2 2 3 4 5
3 3 4 5 6
4 4 5 6 7
5 5 6 7 8

Then you can use the which function and the - operator when indexing columns:

R> df[ , -which(names(df) %in% c("z","u"))]
  x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
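
One caveat with the -which() form: if none of the names match, which() returns an empty vector and the result has no columns at all. A minimal sketch of the problem and of a logical mask that avoids it ("not_a_column" is a hypothetical name used to trigger the case):

```r
df <- data.frame(x = 1:5, y = 2:6, z = 3:7, u = 4:8)

# Nothing matches, so which() gives integer(0) and every column is lost
none_matched <- df[, -which(names(df) %in% c("not_a_column"))]
ncol(none_matched)

# A negated logical mask keeps all columns when nothing matches
safe <- df[, !(names(df) %in% c("not_a_column")), drop = FALSE]
names(safe)

# drop = FALSE also keeps a data frame when only one column remains
one_left <- df[, !(names(df) %in% c("y", "z", "u")), drop = FALSE]
class(one_left)
```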

Or, much simpler, use the select argument of the subset function: you can then use the - operator directly on a vector of column names, and you can even omit the quotes around the names!

R> subset(df, select=-c(z,u))
  x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6

Note that you can also select the columns you want instead of dropping the others:

R> df[ , c("x","y")]
  x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6

R> subset(df, select=c(x,y))
  x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
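
When the names to drop are already stored in a character vector, another base R option is to assign NULL to those columns. A minimal sketch, where drop_cols is a hypothetical variable name:

```r
df <- data.frame(x = 1:5, y = 2:6, z = 3:7, u = 4:8)

# Hypothetical character vector holding the names of the columns to drop
drop_cols <- c("z", "u")

# Assigning NULL removes the named columns from the data frame
df[drop_cols] <- NULL

names(df)
```

Unlike the indexing and subset calls above, this modifies df directly rather than returning a new data frame.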

