Removing a list of columns from a data.frame using subset
I would probably do this like so:
to.remove <- c("hp","drat","wt","qsec")
`%ni%` <- Negate(`%in%`)
subset(mtcars,select = names(mtcars) %ni% to.remove)
(I use %ni% a lot, so I have it built into my .Rprofile already.)
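For comparison, here is a minimal sketch of the same removal done two ways on the built-in mtcars data: once through subset() with a logical select, and once with plain indexing and setdiff(). The names via_subset and via_index are only illustrative. The help page for subset() describes it as a convenience function for interactive use, so the indexing form may be the safer choice inside functions.

```r
to.remove <- c("hp", "drat", "wt", "qsec")

# subset() with a logical vector passed to select
via_subset <- subset(mtcars, select = !(names(mtcars) %in% to.remove))

# Plain indexing: setdiff() returns the names that are not in to.remove,
# in their original order; drop = FALSE keeps the result a data frame
via_index <- mtcars[ , setdiff(names(mtcars), to.remove), drop = FALSE]

names(via_index)
```

Both calls should leave the same seven columns.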
How to drop columns by name in a data frame
You should use either indexing or the subset function. For example:
R> df <- data.frame(x=1:5, y=2:6, z=3:7, u=4:8)
R> df
  x y z u
1 1 2 3 4
2 2 3 4 5
3 3 4 5 6
4 4 5 6 7
5 5 6 7 8
Then you can use the which function and the - operator in column indexing:
R> df[ , -which(names(df) %in% c("z","u"))]
  x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
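One caveat with the -which(...) form: if none of the names match, which() returns an empty vector, and negative indexing with an empty vector selects no columns at all. A minimal sketch of a logical-index alternative that avoids this, using the same df as above (the names empty, kept and same are only illustrative, and drop = FALSE is added so that a single remaining column stays a data frame):

```r
df <- data.frame(x = 1:5, y = 2:6, z = 3:7, u = 4:8)

# No name matches: which() gives integer(0), and -integer(0) selects nothing
empty <- df[ , -which(names(df) %in% c("not_a_column"))]
ncol(empty)

# Logical negation keeps every column that is not listed
kept <- df[ , !(names(df) %in% c("z", "u")), drop = FALSE]
names(kept)

# With no match, the logical form keeps all columns
same <- df[ , !(names(df) %in% c("not_a_column")), drop = FALSE]
names(same)
```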
Or, much simpler, use the select argument of the subset function: you can then use the - operator directly on a vector of column names, and you can even omit the quotes around the names!
R> subset(df, select=-c(z,u))
  x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
Note that you can also select the columns you want to keep instead of dropping the others:
R> df[ , c("x","y")]
  x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
R> subset(df, select=c(x,y))
  x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
Drop data frame columns by name
There's also the subset command, useful if you know which columns you want:
df <- data.frame(a = 1:10, b = 2:11, c = 3:12)
df <- subset(df, select = c(a, c))
UPDATED after comment by @hadley: To drop columns a and c instead, you could do:
df <- subset(df, select = -c(a, c))
How to remove rows from a data frame using a subset?
Data:
df <- data.frame(SFC = c("YU006UGD31092","YU006UGD31071",
"YU006UGD30152",
"YU006UGD25831",
"YU006UGD25831",
"YU006UGD25332" ,
"YU006UG922912",
"YU006UG922912"))
Code:
library(dplyr)  # provides %>%, group_by(), filter() and n()
df %>%
  group_by(SFC) %>%
  filter(n() == 1)
Output:
SFC
<chr>
1 YU006UGD31092
2 YU006UGD31071
3 YU006UGD30152
4 YU006UGD25332
Edit:
If you already have the list of SFC values to remove (here called Remove_SFC), you can also do:
df %>%
filter(!(SFC %in% Remove_SFC))
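If you would rather not depend on dplyr, the two filters above can be sketched in base R. This is a swapped-in technique, not what the answer itself uses, and the contents of Remove_SFC below are made up for illustration, since the question does not show them:

```r
df <- data.frame(SFC = c("YU006UGD31092", "YU006UGD31071", "YU006UGD30152",
                         "YU006UGD25831", "YU006UGD25831", "YU006UGD25332",
                         "YU006UG922912", "YU006UG922912"))

# Hypothetical vector of values to remove (an assumption, not from the question)
Remove_SFC <- c("YU006UGD25831", "YU006UG922912")

# Base R counterpart of filter(!(SFC %in% Remove_SFC))
kept <- df[!(df$SFC %in% Remove_SFC), , drop = FALSE]

# Base R counterpart of group_by(SFC) %>% filter(n() == 1):
# flag every value that occurs more than once, from both directions
dup <- duplicated(df$SFC) | duplicated(df$SFC, fromLast = TRUE)
unique_only <- df[!dup, , drop = FALSE]
```

The comma after the row condition matters: without it, the logical vector would be used to index columns rather than rows.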
Creating a function to remove columns with different names from a list of dataframes
You can try:
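# Drop every column of df whose name appears in `name`
remove_col <- function(df, name) {
  # Logical indexing keeps all columns when nothing matches,
  # and drop = FALSE keeps the result a data frame
  df[ , !(names(df) %in% name), drop = FALSE]
}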
df_list <- lapply(df_list, remove_col,name=c('X', 'X..x', 'X..y'))
$df1
  var1 var2
1    a    1
2    b    1
3    c    0
4    d    0
5    e    1

$df2
  var1 var2
1    f    0
2    g    1
3    h    0
4    i    1
5    j    1
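The input list is not shown in the answer, so here is a self-contained sketch with made-up data frames, including one that has none of the unwanted columns. The function name remove_col_safe and the contents of df_list are assumptions for illustration only. It uses logical indexing so that a data frame with nothing to drop comes back unchanged:

```r
# Made-up input: each data frame carries a different unwanted column
df_list <- list(
  df1 = data.frame(X = 1:3, var1 = c("a", "b", "c"), var2 = c(1, 1, 0)),
  df2 = data.frame(var1 = c("f", "g", "h"), X..x = 4:6, var2 = c(0, 1, 0)),
  df3 = data.frame(var1 = c("k", "l", "m"), var2 = c(1, 0, 0))  # nothing to drop
)

# Keep only the columns whose names are not in `name`
remove_col_safe <- function(df, name) {
  df[ , !(names(df) %in% name), drop = FALSE]
}

cleaned <- lapply(df_list, remove_col_safe, name = c("X", "X..x", "X..y"))

# Column names remaining in each data frame
lapply(cleaned, names)
```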
How do you remove columns from a data.frame?
I use data.table's := operator to delete columns instantly regardless of the size of the table.
DT[, coltodelete := NULL]
or
DT[, c("col1","col20") := NULL]
or
DT[, (125:135) := NULL]
or
DT[, (variableHoldingNamesOrNumbers) := NULL]
Any solution using <- or subset will copy the whole table. data.table's := operator merely modifies the internal vector of pointers to the columns, in place. That operation is therefore (almost) instant.