Remove Rows With All or Some NAs (Missing Values) in a data.frame

Remove rows with all or some NAs (missing values) in data.frame

Also check complete.cases:

> final[complete.cases(final), ]
             gene hsap mmul mmus rnor cfam
2 ENSG00000199674    0    2    2    2    2
6 ENSG00000221312    0    1    2    3    2

na.omit is nicer for simply removing every row that contains an NA. complete.cases allows partial selection by checking only certain columns of the data frame:

> final[complete.cases(final[ , 5:6]), ]
             gene hsap mmul mmus rnor cfam
2 ENSG00000199674    0    2    2    2    2
4 ENSG00000207604    0   NA   NA    1    2
6 ENSG00000221312    0    1    2    3    2
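
To make the difference concrete, here is a minimal sketch on a made-up data frame. The name `toy` and all of its values are assumptions for illustration; it is not the `final` data from the question.

```r
# Hypothetical data: column names and values are invented for illustration.
toy <- data.frame(
  id = c("g1", "g2", "g3", "g4"),
  x  = c(NA, 2, NA, 1),
  y  = c(NA, 2, 1, 3),
  z  = c(NA, 2, 2, 2)
)

# Drop every row that has an NA in any column.
all_complete <- toy[complete.cases(toy), ]

# Drop only rows that have an NA in y or z; NAs in x are tolerated.
yz_complete <- toy[complete.cases(toy[, c("y", "z")]), ]

print(all_complete)
print(yz_complete)
```

Selecting the columns by name rather than by position (as in `5:6`) is a design choice that should keep the filter valid if the column order changes.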

Your solution can't work. If you insist on using is.na, then you have to do something like:

> final[rowSums(is.na(final[ , 5:6])) == 0, ]
             gene hsap mmul mmus rnor cfam
2 ENSG00000199674    0    2    2    2    2
4 ENSG00000207604    0   NA   NA    1    2
6 ENSG00000221312    0    1    2    3    2

but using complete.cases is considerably clearer, and faster.
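
As a quick check that the two approaches select the same rows, the following sketch compares them on invented data (the data frame `toy` and the vector `cols` are hypothetical names, not part of the original question):

```r
# Hypothetical data, invented for illustration.
toy <- data.frame(
  a = c(1, NA, 3, NA),
  b = c(NA, NA, 6, 8),
  c = c(9, 10, 11, 12)
)

cols <- c("b", "c")

# Row-wise count of NAs in the chosen columns; zero means the row is complete.
keep_rowsums <- rowSums(is.na(toy[, cols])) == 0

# The same logical selection from complete.cases().
keep_cc <- complete.cases(toy[, cols])

print(keep_rowsums)
print(keep_cc)
```

The only difference in the results is that the rowSums version carries row names on the logical vector, which does not matter when it is used as a row index.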

Remove rows with a complete set of NA

We can use dplyr. With the example by @lovalery:

library(dplyr)

df %>% filter(!if_all(V2:V3, is.na))

#>   V1 V2 V3
#> 1  3  3 NA
#> 2 NA  1 NA
#> 3  3  5 NA

We can use many different selection statements inside if_all. Check the documentation for more examples.
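
As a rough illustration of those selection statements, the sketch below expresses the same filter in three ways. The data frame `df_v` is hypothetical (it is not the example by @lovalery), and it assumes a dplyr version that provides if_all (1.0.4 or later).

```r
library(dplyr)

# Hypothetical data: values are invented for illustration.
df_v <- data.frame(
  V1 = c(3, NA, 3, 4),
  V2 = c(3, 1, 5, NA),
  V3 = c(NA_real_, NA_real_, NA_real_, NA_real_)
)

# A range of columns.
by_range <- df_v %>% filter(!if_all(V2:V3, is.na))

# Column names held in a character vector.
by_name <- df_v %>% filter(!if_all(all_of(c("V2", "V3")), is.na))

# Every column except V1.
by_excl <- df_v %>% filter(!if_all(-V1, is.na))

print(by_range)
```

All three drop only the last row, where V2 and V3 are both NA.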

How to remove a row if it has an NA value in one certain column

The easiest solution is to use is.na():

df[!is.na(df$B), ]

which gives you:

   A B  C
1 NA 2 NA
2  1 2  3
4  1 2  3
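
A common stumbling block is testing for NA with `==`. The sketch below shows why is.na() is needed; the data frame `df_b` is hypothetical, chosen to resemble the output above (the contents of row 3 are an assumption).

```r
# Hypothetical data resembling the output above; row 3 is an assumption.
df_b <- data.frame(
  A = c(NA, 1, 1, 1),
  B = c(2, 2, NA, 2),
  C = c(NA, 3, 3, 3)
)

# Comparing with == does not work: any comparison with NA yields NA, never TRUE.
print(df_b$B == NA)

# is.na() is the reliable test.
kept <- df_b[!is.na(df_b$B), ]
print(kept)

# subset() keeps only rows where the condition is TRUE, so it selects the same rows.
kept_subset <- subset(df_b, !is.na(B))
```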

Remove rows where all columns except one have NA values?

We may use if_all in filter: select the columns a to b in if_all and apply is.na to check for NA. The result is TRUE for a row only if both a and b are NA; negating it with ! (TRUE -> FALSE and FALSE -> TRUE) keeps the rows in which at least one of the two columns is not NA.

library(dplyr)
df %>%
  filter(!if_all(a:b, is.na))

-output

  ID    a    b
1  1   ab <NA>
2  1 <NA>   ab

Or instead of negating (!), we may use complete.cases with if_any

df %>%
  filter(if_any(a:b, complete.cases))

-output

  ID    a    b
1  1   ab <NA>
2  1 <NA>   ab

Regarding the issue in the OP's code: the logic checks whether there is at least one NA (> 0), which is true for all the rows. Instead, it should check whether all the values are NA, and then negate.

na_rows <- df %>%
  select(-"ID") %>%
  is.na() %>%
  {rowSums(.) == ncol(.)}
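
For comparison, the same "all NA, then negate" logic can be sketched in base R without the pipe. The data frame is the one given in the data section of this answer; the names `value_cols` and `result` are introduced here for illustration only.

```r
# Same data as in the data section of this answer.
df <- data.frame(ID = c(1L, 1L, 1L), a = c("ab", NA, NA), b = c(NA, "ab", NA))

# Every column except ID takes part in the NA check.
value_cols <- setdiff(names(df), "ID")

# TRUE for rows in which every non-ID column is NA.
na_rows <- rowSums(is.na(df[value_cols])) == length(value_cols)

# Negate to keep rows that have at least one non-NA value.
result <- df[!na_rows, ]
print(result)
```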

data

df <- structure(list(ID = c(1L, 1L, 1L), a = c("ab", NA, NA), b = c(NA, 
"ab", NA)), class = "data.frame", row.names = c(NA, -3L))

Deleting rows with missing data. How to omit rows from a data frame with missing values in either column

There are a few good ways of doing this - which have been well described elsewhere on SO (e.g. here). However, to use your example here:

I think na.omit is probably the simplest option for your purpose:

na.omit(DF)

#   rater.1 rater.2
# 1       1       1
# 4       3       2
# 5       2       3

There's also complete.cases, which is a bit longer but allows you to restrict the NA search to specific columns. This wasn't required in this question, but it might help to know for completeness. For example, if you only wanted to remove rows with NA in rater.1:

DF[complete.cases(DF$rater.1),]

#   rater.1 rater.2
# 1       1       1
# 2       4      NA
# 4       3       2
# 5       2       3

tidyr also has drop_na, which might be the easiest option if you're already working in the tidyverse; like complete.cases, it lets you restrict the NA check to specific columns:

library(tidyverse)
DF %>% tidyr::drop_na(rater.1)

#   rater.1 rater.2
# 1       1       1
# 2       4      NA
# 3       3       2
# 4       2       3
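
As a small sketch of how drop_na behaves with and without column arguments, consider the following. The data frame here is a hypothetical stand-in for `DF`: the values resemble the ratings above, but the rater.2 value in row 3 is invented.

```r
library(tidyr)

# Hypothetical data resembling the ratings above; rater.2 in row 3 is invented.
DF <- data.frame(rater.1 = c(1, 4, NA, 3, 2), rater.2 = c(1, NA, 4, 2, 3))

# No columns named: rows with an NA in any column are dropped, much like na.omit().
any_na_dropped <- drop_na(DF)

# Columns named: only those columns are checked for NA.
rater1_checked <- drop_na(DF, rater.1)

print(any_na_dropped)
print(rater1_checked)
```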

How to remove rows with a NA value?

dat <- data.frame(x1 = c(1,2,3, NA, 5), x2 = c(100, NA, 300, 400, 500))

na.omit(dat)
  x1  x2
1  1 100
3  3 300
5  5 500
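
If it is useful to know which rows were removed, na.omit records them in the "na.action" attribute of its result. A minimal sketch using the same `dat` (the names `clean` and `dropped` are introduced here for illustration):

```r
dat <- data.frame(x1 = c(1, 2, 3, NA, 5), x2 = c(100, NA, 300, 400, 500))

clean <- na.omit(dat)

# Positions of the rows that na.omit() removed.
dropped <- as.integer(attr(clean, "na.action"))
print(dropped)
```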

