Omit rows containing specific column of NA
You could use the complete.cases function and put it into a function thusly:
DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z=c(NA, 33, 22))
completeFun <- function(data, desiredCols) {
completeVec <- complete.cases(data[, desiredCols])
return(data[completeVec, ])
}
completeFun(DF, "y")
# x y z
# 1 1 0 NA
# 2 2 10 33
completeFun(DF, c("y", "z"))
# x y z
# 2 2 10 33
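To see why the helper returns those rows, it can help to look at the logical vector that complete.cases produces before it is used as a row index. The sketch below reuses the DF from above; the names keep_y and keep_yz are made up for illustration, and drop = FALSE is added only so that a single column is still passed as a data frame.

```r
# Same example data as above
DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z = c(NA, 33, 22))

# complete.cases() returns one logical value per row:
# TRUE when none of the inspected columns is NA in that row
keep_y  <- complete.cases(DF[, "y", drop = FALSE])
keep_yz <- complete.cases(DF[, c("y", "z"), drop = FALSE])

keep_y   # TRUE TRUE FALSE
keep_yz  # FALSE TRUE FALSE

# Indexing rows with the logical vector keeps only the TRUE rows
DF[keep_yz, ]
```

Rows 1 and 2 survive the check on y alone, while only row 2 survives once z is checked as well.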
EDIT: Only return rows with no NAs
If you want to eliminate all rows with at least one NA in any column, just use the complete.cases function straight up:
DF[complete.cases(DF), ]
# x y z
# 2 2 10 33
Or if completeFun is already ingrained in your workflow ;)
completeFun(DF, names(DF))
How to remove row if it has a NA value in one certain column
The easiest solution is to use is.na():
df[!is.na(df$B), ]
which gives you:
A B C
1 NA 2 NA
2 1 2 3
4 1 2 3
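The answer does not show its input, so here is a minimal reproducible sketch. The data frame below is hypothetical, chosen only so that the result matches the printed output; it is not the original poster's data.

```r
# Hypothetical data chosen to match the printed result above
df <- data.frame(
  A = c(NA, 1, 1, 1),
  B = c(2, 2, NA, 2),
  C = c(NA, 3, 3, 3)
)

# Keep rows where B is not NA; NAs in the other columns are ignored
res <- df[!is.na(df$B), ]
res
```

Row 3 is dropped because its B value is NA, while row 1 stays even though A and C are NA there.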
Remove rows where all columns except one have NA values?
We may use if_all in filter: select the columns a to b in if_all and apply is.na (check for NA). The output will be TRUE for a row if both a and b are NA; negate (!) to convert TRUE -> FALSE and FALSE -> TRUE.
library(dplyr)
df %>%
filter(!if_all(a:b, is.na))
-output
ID a b
1 1 ab <NA>
2 1 <NA> ab
Or instead of negating (!), we may use complete.cases with if_any:
df %>%
filter(if_any(a:b, complete.cases))
ID a b
1 1 ab <NA>
2 1 <NA> ab
Regarding the issue in the OP's code, the logic checks whether there is at least one NA (> 0), which is true for all the rows. Instead, it should check whether all values are NA and then negate:
na_rows <- df %>%
select(-"ID") %>%
is.na() %>%
{rowSums(.) == ncol(.)}
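The pipeline above only builds the logical vector. As a rough base R sketch of the same check, followed by the negation step, something like the following should work; the names vals and result are made up for illustration, and the data frame is the one from the data section.

```r
df <- structure(list(ID = c(1L, 1L, 1L), a = c("ab", NA, NA), b = c(NA, "ab", NA)),
                class = "data.frame", row.names = c(NA, -3L))

# All columns except ID, kept as a data frame
vals <- df[, setdiff(names(df), "ID"), drop = FALSE]

# TRUE for rows in which every one of those columns is NA
na_rows <- rowSums(is.na(vals)) == ncol(vals)

# Negate to keep the rows that have at least one non-NA value
result <- df[!na_rows, ]
result
```

Only the third row is all NA outside ID, so it is the only one removed.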
data
df <- structure(list(ID = c(1L, 1L, 1L), a = c("ab", NA, NA), b = c(NA,
"ab", NA)), class = "data.frame", row.names = c(NA, -3L))
How to remove NA data in only one column?
Use is.na() on the relevant vector of data you wish to look for and index using the negated result. For example:
R> data[!is.na(data$A), ]
date A B
1 2014-01-01 2 3
2 2014-01-02 5 NA
4 2014-01-04 7 11
R> data[!is.na(data$B), ]
date A B
1 2014-01-01 2 3
4 2014-01-04 7 11
is.na() returns TRUE for every element that is NA and FALSE otherwise. To index the rows of the data frame, we can use this logical vector, but we want its converse. Hence we use ! to imply the opposite (TRUE becomes FALSE and vice versa).
You can restrict which columns you return by adding an index for the columns after the , in [ , ], e.g.
R> data[!is.na(data$A), 1:2]
date A
1 2014-01-01 2
2 2014-01-02 5
4 2014-01-04 7
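To try the steps above end to end, here is a small sketch. The data frame is a hypothetical reconstruction from the printed output (the values in row 3 are inferred from the fact that it is dropped in both filters), and the names na_A, by_A, by_B and by_A_cols are made up for illustration.

```r
# Hypothetical data reconstructed from the printed output above
data <- data.frame(
  date = as.Date("2014-01-01") + 0:3,
  A = c(2, 5, NA, 7),
  B = c(3, NA, NA, 11)
)

# is.na() flags the missing elements of the chosen column
na_A <- is.na(data$A)   # FALSE FALSE TRUE FALSE

# Negating the flags keeps the rows where A is present (drops row 3)
by_A <- data[!na_A, ]

# Same idea for B (drops rows 2 and 3)
by_B <- data[!is.na(data$B), ]

# Same row filter as by_A, but returning only the first two columns
by_A_cols <- data[!na_A, 1:2]
```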
R remove NA values from 3 columns only when all 3 have NA
The complete.cases calls can be combined with the | condition, as complete.cases returns TRUE for a non-NA value and FALSE for NA. Thus, by using OR, we are subsetting the rows that have at least one non-NA value:
data[complete.cases(data$A) | complete.cases(data$B) | complete.cases(data$C),]
Or more easily with rowSums
data[rowSums(is.na(data[, c("A", "B", "C")])) < 3,]
Or with dplyr, using if_all or if_any:
library(dplyr)
data %>%
filter(!if_all(c(A, B, C), is.na))
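Only the if_all form is shown above. A possible if_any counterpart is sketched below on hypothetical data (the id column and all values are invented for illustration); it assumes a dplyr version that provides if_any and if_all (1.0.4 or later). The rowSums filter is repeated for comparison.

```r
library(dplyr)

# Hypothetical data: row 3 is NA in all of A, B and C
data <- data.frame(
  id = 1:4,
  A = c(1, NA, NA, 4),
  B = c(NA, 2, NA, 5),
  C = c(NA, NA, NA, 6)
)

# Keep a row if any of A, B, C is non-NA
kept_any <- data %>%
  filter(if_any(c(A, B, C), ~ !is.na(.x)))

# Base R filter from above, for comparison
kept_base <- data[rowSums(is.na(data[, c("A", "B", "C")])) < 3, ]
```

Both approaches should drop only the row in which A, B and C are all NA; rows with one or two missing values are kept.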