R: Deleting rows based on a value in a column from a large data set in R
I suggest you learn how to use dplyr and the other packages in the tidyverse. I find them indispensable for cleaning data.
Here's how I would use dplyr to filter out both Texas and New York in your data set:
library(dplyr)
customers = filter(customers, State != "TX" & State != "NY")
Alternatively,
customers = filter(customers, !(State %in% c("TX", "NY")))
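One caveat, sketched here with a made-up State vector rather than your actual data: the two conditions agree on every non-missing value but differ on NA. Because filter() keeps only rows where the condition is TRUE, the first form would drop rows with a missing State, while the second would keep them.

```r
# Hypothetical State values, including one missing entry
State <- c("TX", "NY", "CA", NA)

# Condition from the first filter() call: NA != "TX" evaluates to NA
keep_neq <- State != "TX" & State != "NY"

# Condition from the second filter() call: NA %in% ... evaluates to FALSE
keep_in <- !(State %in% c("TX", "NY"))

print(keep_neq)  # FALSE FALSE  TRUE    NA
print(keep_in)   # FALSE FALSE  TRUE  TRUE
```

If your State column has no missing values, the two calls should return the same rows.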
Removing rows in R based on values in a single column
You could also use the subset() function.
a <- matrix(1:9, nrow=3)
threshold <- 8
subset(a, a[ , 3] < threshold)
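The same idea works on a data frame, where subset() lets you refer to the column by name. A minimal sketch, assuming a toy data frame (the names scores, id and value are made up for illustration):

```r
# Hypothetical toy data; column names are illustrative only
scores <- data.frame(id = 1:4, value = c(5, 9, 7, 12))
cutoff <- 8

# Keep only the rows whose value is below the cutoff
kept <- subset(scores, value < cutoff)
print(kept$id)  # 1 3
```

subset() is convenient interactively; inside functions, plain bracket indexing is usually the safer choice because subset() evaluates the condition non-standardly.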
How to remove row if it has a NA value in one certain column
The easiest solution is to use is.na():
df[!is.na(df$B), ]
which gives you:
   A B  C
1 NA 2 NA
2  1 2  3
4  1 2  3
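To make that reproducible, here is a sketch with a data frame built to match the output above. The values in row 3 are assumptions, since that row is not shown. It also contrasts the single-column test with complete.cases(), which drops a row if any column is NA.

```r
# Reconstructed example; row 3's A and C values are made up
df <- data.frame(
  A = c(NA, 1, 5, 1),
  B = c(2, 2, NA, 2),
  C = c(NA, 3, 6, 3)
)

# Drop rows only where column B is NA (row 1 survives despite NAs in A and C)
by_column <- df[!is.na(df$B), ]

# Drop rows with an NA in any column
all_columns <- df[complete.cases(df), ]

print(rownames(by_column))    # "1" "2" "4"
print(rownames(all_columns))  # "2" "4"
```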
How can I delete rows if a column contains a certain value?
It's better to think "how do I create an object in the form I want" than "how do I manipulate this object in place".
So you can use the following syntax:
df <- df[!(df$classification == "D1" | df$classification == "RD"), ]
or, slightly easier to maintain:
df <- df[!df$classification %in% c("D1", "RD"), ]
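A quick check on a toy data frame (the id column and the extra classification values are made up) suggests the two forms select the same rows. Note the parentheses around the | expression: without them, the ! applies only to the first comparison and the "RD" rows are kept.

```r
# Hypothetical data; only the classification column name comes from the question
df <- data.frame(
  id = 1:5,
  classification = c("D1", "A", "RD", "B", "D1"),
  stringsAsFactors = FALSE
)

# Explicit comparisons, negated as a whole
via_or <- df[!(df$classification == "D1" | df$classification == "RD"), ]

# Set membership; adding another value means editing only this vector
drop_values <- c("D1", "RD")
via_in <- df[!df$classification %in% drop_values, ]

print(via_or$id)  # 2 4
print(via_in$id)  # 2 4
```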
Remove rows from a single-column data frame
Try adding the drop = FALSE option:
R> df[-(length(df[,1])), , drop = FALSE]
a
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
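To see what drop = FALSE changes, here is a small sketch assuming a one-column data frame like the one above (a = 1:10). Without the option, R simplifies the single remaining column to a plain vector.

```r
# Assumed input: a single-column data frame with ten rows
df <- data.frame(a = 1:10)

# Default behaviour: the result collapses to an integer vector
as_vector <- df[-nrow(df), ]

# With drop = FALSE the result stays a data frame
as_frame <- df[-nrow(df), , drop = FALSE]

print(class(as_vector))  # "integer"
print(class(as_frame))   # "data.frame"
```

nrow(df) is used here in place of length(df[,1]); both give the number of rows.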
R: removing rows based on row value in a column of a data frame
EDIT: I reproduced your error. You need to add the drop = FALSE option to your subsetting to get a data.frame as the result rather than a vector:
df_a <- structure(list(order..new..i...2.ncol.new..i..... = c(620L, 2851L, 1972L, 565L, 1025L, 2509L)), row.names = c("J.TYMO", "J.TTMO", "J.NTT", "J.ABOT", "J.NNDO", "J.SFTB"), class = "data.frame")
str(df_a)
#> 'data.frame': 6 obs. of 1 variable:
#> $ order..new..i...2.ncol.new..i.....: int 620 2851 1972 565 1025 2509
names(df_a) <- "V1"
df_a[df_a[[1]] <= 1000 , , drop = FALSE]
#> V1
#> J.TYMO 620
#> J.ABOT 565
OLD ANSWER
The best dataset with row names I could think of was the mtcars dataset. Building on that, I found that adding a comma to your call solves the problem:
dfr <- head(mtcars)
dfr
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#> Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#> Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
#> Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
as.data.frame(dfr[dfr[1]<20 , ])
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Hornet Sportabout 18.7 8 360 175 3.15 3.44 17.02 0 0 3 2
#> Valiant 18.1 6 225 105 2.76 3.46 20.22 1 0 3 1
Thus, with your peculiar a object, if it is a data.frame, the answer should be:
as.data.frame(a[a[1] <= 333 , ])
remove R Dataframe rows based on zero values in one column
Just subset the data frame based on the value in the No_of_Mails column:
df[df$No_of_Mails != 0, ]
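One thing to watch for, sketched with made-up data (the user column and the values are assumptions): if No_of_Mails contains NA, the comparison yields NA for that row and the result includes a row filled with NAs. Wrapping the condition in which() avoids that.

```r
# Hypothetical data with one zero and one missing count
df <- data.frame(
  user = c("a", "b", "c", "d"),
  No_of_Mails = c(3, 0, NA, 7),
  stringsAsFactors = FALSE
)

# Plain logical indexing: the NA comparison produces an all-NA row
plain <- df[df$No_of_Mails != 0, ]

# which() keeps only positions where the condition is TRUE
safe <- df[which(df$No_of_Mails != 0), ]

print(nrow(plain))  # 3
print(safe$user)    # "a" "d"
```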