Remove rows with all or some NAs (missing values) in data.frame
Also check complete.cases:
> final[complete.cases(final), ]
gene hsap mmul mmus rnor cfam
2 ENSG00000199674 0 2 2 2 2
6 ENSG00000221312 0 1 2 3 2
na.omit is nicer for just removing all NAs. complete.cases allows partial selection by including only certain columns of the data frame:
> final[complete.cases(final[ , 5:6]),]
gene hsap mmul mmus rnor cfam
2 ENSG00000199674 0 2 2 2 2
4 ENSG00000207604 0 NA NA 1 2
6 ENSG00000221312 0 1 2 3 2
Your solution can't work. If you insist on using is.na, then you have to do something like:
> final[rowSums(is.na(final[ , 5:6])) == 0, ]
gene hsap mmul mmus rnor cfam
2 ENSG00000199674 0 2 2 2 2
4 ENSG00000207604 0 NA NA 1 2
6 ENSG00000221312 0 1 2 3 2
but using complete.cases is quite a lot clearer, and faster.
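To see that the two approaches select the same rows, here is a minimal sketch. The data frame `toy` and its column names are made up for illustration; it is not the `final` data from the question.

```r
# Hypothetical data, not the original `final` data frame.
toy <- data.frame(
  gene = c("g1", "g2", "g3", "g4"),
  a    = c(0, NA, 2, 1),
  b    = c(NA, 1, 2, NA),
  c    = c(1, 2, NA, 3)
)

# Keep rows that have no NA in columns a and c; column b is ignored.
by_cc  <- toy[complete.cases(toy[, c("a", "c")]), ]
by_sum <- toy[rowSums(is.na(toy[, c("a", "c")])) == 0, ]

# Both approaches keep rows 1 and 4.
rownames(by_cc)
rownames(by_sum)
```

Row 1 and row 4 still contain an NA in column b, which shows that only the selected columns are checked.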
Remove rows with complete set of NA
We can use dplyr. With the example by @lovalery:
library(dplyr)
df %>% filter(!if_all(V2:V3, is.na))
#> V1 V2 V3
#> 1 3 3 NA
#> 2 NA 1 NA
#> 3 3 5 NA
We can use many different selection statements inside if_all. Check the documentation for more examples.
How to remove row if it has a NA value in one certain column
The easiest solution is to use is.na():
df[!is.na(df$B), ]
which gives you:
A B C
1 NA 2 NA
2 1 2 3
4 1 2 3
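As a self-contained sketch of the same idea: the data frame below is hypothetical, chosen only so that it reproduces the output printed above (the question's actual data is not shown here).

```r
# Hypothetical data chosen to match the output printed above.
df <- data.frame(A = c(NA, 1, 1, 1), B = c(2, 2, NA, 2), C = c(NA, 3, 3, 3))

kept <- df[!is.na(df$B), ]     # rows 1, 2 and 4 remain; NAs in A and C are ignored
same <- subset(df, !is.na(B))  # base R alternative that selects the same rows

# Comparing with NA directly does not work as a filter:
# the comparison itself evaluates to NA rather than TRUE or FALSE.
cmp <- df$B[3] == NA
is.na(cmp)
```

This is why is.na() is used for the test instead of `==` or `!=`.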
Remove rows where all columns except one have NA values?
We may use if_all in filter: select the columns a to b in if_all and apply is.na (check for NA). The output will be TRUE for a row if both a and b are NA; negate (!) to convert TRUE to FALSE and FALSE to TRUE.
library(dplyr)
df %>%
filter(!if_all(a:b, is.na))
-output
ID a b
1 1 ab <NA>
2 1 <NA> ab
Or instead of negating (!), we may use complete.cases with if_any:
df %>%
filter(if_any(a:b, complete.cases))
ID a b
1 1 ab <NA>
2 1 <NA> ab
Regarding the issue in the OP's code: the logic checks whether there is at least one NA (> 0), which is true for all the rows. Instead, it should check whether all values are NA, and then negate:
na_rows <- df %>%
select(-"ID") %>%
is.na() %>%
{rowSums(.) == ncol(.)}
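The same all-NA check can also be sketched in base R, without the pipe. The helper names `cols` and `all_na` are introduced here for illustration only; the data frame is rebuilt inline so the snippet runs on its own.

```r
# Same three-row example as in this answer, rebuilt so the snippet is self-contained.
df <- data.frame(ID = c(1L, 1L, 1L), a = c("ab", NA, NA), b = c(NA, "ab", NA))

cols   <- setdiff(names(df), "ID")                  # every column except ID
all_na <- rowSums(is.na(df[cols])) == length(cols)  # TRUE where a and b are both NA

result <- df[!all_na, ]                             # drops only the all-NA row (row 3)
result
```

Comparing against `length(cols)` rather than `> 0` is what distinguishes "all NA" from "at least one NA".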
data
df <- structure(list(ID = c(1L, 1L, 1L), a = c("ab", NA, NA), b = c(NA,
"ab", NA)), class = "data.frame", row.names = c(NA, -3L))
Deleting rows with missing data. How to omit rows from a data frame with missing values in either column
There are a few good ways of doing this, which have been well described elsewhere on SO. However, to use your example here:
I think na.omit is probably the simplest option for your purpose:
na.omit(DF)
# rater.1 rater.2
# 1 1 1
# 4 3 2
# 5 2 3
There's also complete.cases, which is a bit longer but allows you to restrict the NA search to specific columns. While this wasn't required in this question, for completeness it might help to know. For example, if you only wanted to remove rows with NA in rater.1:
DF[complete.cases(DF$rater.1),]
# rater.1 rater.2
# 1 1 1
# 2 4 NA
# 4 3 2
# 5 2 3
Also, tidyr has drop_na, which might be the easiest if you're already operating in the tidyverse, and it has the same benefit as complete.cases:
library(tidyverse)
DF %>% tidyr::drop_na(rater.1)
# rater.1 rater.2
# 1 1 1
# 2 4 NA
# 3 3 2
# 4 2 3
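One detail visible in the outputs above is the row labels: the base R results keep the original labels, so gaps appear where rows were dropped. A small sketch of how to renumber them follows. The `DF` below is hypothetical: it is consistent with the printed outputs, but the rater.2 value in row 3 is a guess.

```r
# Hypothetical data consistent with the outputs above; row 3's rater.2 value is assumed.
DF <- data.frame(rater.1 = c(1, 4, NA, 3, 2), rater.2 = c(1, NA, 2, 2, 3))

only_r1 <- DF[complete.cases(DF$rater.1), ]  # drops only row 3
rownames(only_r1)                            # original labels are kept: "1" "2" "4" "5"

# Reset the labels to a plain 1..n sequence if the gaps are unwanted.
renumbered <- only_r1
rownames(renumbered) <- NULL
rownames(renumbered)
```

Resetting the labels does not change the data, only how the rows are numbered when printed.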
How to remove rows with a NA value?
dat <- data.frame(x1 = c(1,2,3, NA, 5), x2 = c(100, NA, 300, 400, 500))
na.omit(dat)
x1 x2
1 1 100
3 3 300
5 5 500
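If you need to know which rows were removed, na.omit records them in the "na.action" attribute of its result. A short sketch using the same `dat` (the variable names `clean` and `dropped` are only for illustration):

```r
dat <- data.frame(x1 = c(1, 2, 3, NA, 5), x2 = c(100, NA, 300, 400, 500))

clean   <- na.omit(dat)
dropped <- attr(clean, "na.action")  # positions of the rows that were removed

as.integer(dropped)  # 2 4
nrow(clean)          # 3
```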