How to Remove Row If It Has an NA Value in One Certain Column

How to remove a row if it has an NA value in one certain column

The easiest solution is to use is.na():

df[!is.na(df$B), ]

which gives you:

   A B  C
1 NA 2 NA
2  1 2  3
4  1 2  3
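
For context, a data frame along these lines (an assumption here, since the question's original data isn't shown) reproduces that result; row 3 has an NA in column B and is therefore dropped:

# Assumed example data: only row 3 has NA in column B
df <- data.frame(A = c(NA, 1, 4, 1),
                 B = c(2, 2, NA, 2),
                 C = c(NA, 3, 5, 3))
df[!is.na(df$B), ]   # keeps rows 1, 2 and 4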

Omit rows containing specific column of NA

You could use the complete.cases function and put it into a function thusly:

DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z = c(NA, 33, 22))

completeFun <- function(data, desiredCols) {
  completeVec <- complete.cases(data[, desiredCols])
  return(data[completeVec, ])
}

completeFun(DF, "y")
#   x  y  z
# 1 1  0 NA
# 2 2 10 33

completeFun(DF, c("y", "z"))
#   x  y  z
# 2 2 10 33

EDIT: Only return rows with no NAs

If you want to eliminate all rows with at least one NA in any column, just use the complete.cases function straight up:

DF[complete.cases(DF), ]
#   x  y  z
# 2 2 10 33

Or if completeFun is already ingrained in your workflow ;)

completeFun(DF, names(DF))

How to drop rows of Pandas DataFrame whose value in a certain column is NaN

Don't drop, just take the rows where EPS is not NA:

df = df[df['EPS'].notna()]
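
As a rough, self-contained sketch of the same idea (the column values below are made up for illustration):

import numpy as np
import pandas as pd

# Assumed example data: one row has NaN in the EPS column
df = pd.DataFrame({'STK_ID': ['601166', '600036', '600016'],
                   'EPS': [np.nan, 0.26, 0.11]})

df = df[df['EPS'].notna()]   # keeps only the rows where EPS is not NaN
print(df)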

Remove rows where all columns except one have NA values?

We may use if_all in filter: select the columns a to b in if_all and apply is.na (the check for NA). The output is TRUE for a row only if both a and b are NA; negate with ! to convert TRUE to FALSE and FALSE to TRUE, so those all-NA rows are dropped.

library(dplyr)
df %>%
  filter(!if_all(a:b, is.na))

Output:

  ID    a    b
1  1   ab <NA>
2  1 <NA>   ab

Or instead of negating (!), we may use complete.cases with if_any

df %>%
  filter(if_any(a:b, complete.cases))

  ID    a    b
1  1   ab <NA>
2  1 <NA>   ab

Regarding the issue in the OP's code: the logic checks whether there is at least one NA (> 0), which is true for every row here. Instead, it should test whether all of the columns are NA, and then negate:

na_rows <- df %>%
  select(-"ID") %>%
  is.na() %>%
  {rowSums(.) == ncol(.)}
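
The answer stops at building na_rows; a natural follow-up (an assumption here, not shown above) is to negate it when subsetting:

df[!na_rows, ]   # keeps rows where at least one of the non-ID columns is filled in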

data

df <- structure(list(ID = c(1L, 1L, 1L), a = c("ab", NA, NA), b = c(NA, 
"ab", NA)), class = "data.frame", row.names = c(NA, -3L))

How to remove rows that contain NaN in both 1st and 3rd columns?

dropna has an additional parameter, how:

how{‘any’, ‘all’}, default ‘any’
Determine if row or column is removed from DataFrame, when we have at least one NA or all NA.
‘any’ : If any NA values are present, drop that row or column.
‘all’ : If all values are NA, drop that row or column.

If you set it to 'all', it will drop only the rows where every one of those columns is NaN. In your case, df.dropna(subset=['b', 'd'], how='all') would work.
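
A minimal sketch of that behaviour, assuming a small DataFrame whose b and d columns contain NaN in varying combinations (the values are made up):

import numpy as np
import pandas as pd

# Assumed example: row 0 has NaN in both b and d, row 1 in only one of them
df = pd.DataFrame({'a': [1, 2, 3],
                   'b': [np.nan, np.nan, 5.0],
                   'd': [np.nan, 4.0, 6.0]})

# Drops only row 0, where both b and d are NaN
df = df.dropna(subset=['b', 'd'], how='all')
print(df)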

Delete rows if there are null values in a specific column in Pandas dataframe

If the relevant entries in Charge_Per_Line are empty (NaN) when you read into pandas, you can use df.dropna:

df = df.dropna(axis=0, subset=['Charge_Per_Line'])

If the values are genuinely the string '-', then you can replace them with np.nan and then use df.dropna:

import numpy as np

df['Charge_Per_Line'] = df['Charge_Per_Line'].replace('-', np.nan)
df = df.dropna(axis=0, subset=['Charge_Per_Line'])
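
As a rough illustration, assuming a small DataFrame where missing charges appear as the literal string '-' (the data here is invented):

import numpy as np
import pandas as pd

# Assumed example: the second charge is recorded as the placeholder '-'
df = pd.DataFrame({'Phone_Line': ['A', 'B', 'C'],
                   'Charge_Per_Line': ['$5.00', '-', '$8.00']})

df['Charge_Per_Line'] = df['Charge_Per_Line'].replace('-', np.nan)
df = df.dropna(axis=0, subset=['Charge_Per_Line'])   # the '-' row is dropped
print(df)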

Remove entire Row if Column == NA in R

Here's a way to filter out rows with an NA in the name field:

library(dplyr)
df %>% filter(!is.na(name))

#>          name GA SV
#> 1 CAREY.PRICE  3  2
#> 2  JOHN.SMITH  2 NA
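
For reference, a data frame along these lines (assumed here, since the question's data isn't reproduced) gives that output once the NA-named row is filtered out:

# Assumed example data: the third row has NA in the name column
df <- data.frame(name = c("CAREY.PRICE", "JOHN.SMITH", NA),
                 GA   = c(3, 2, 1),
                 SV   = c(2, NA, 5))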

Remove rows which have all NAs in certain columns

This is a one-liner to remove the rows with NA in all of columns 5 through 9. By combining rowSums() with is.na(), it is easy to check whether all entries in these five columns are NA:

x <- x[rowSums(is.na(x[, 5:9])) != 5, ]
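
As a quick sanity check, a made-up data frame with nine columns (the names and values are assumptions) shows the behaviour:

# Assumed example: 9 columns; in row 2, columns 5 to 9 are all NA
x <- as.data.frame(matrix(1:27, nrow = 3, ncol = 9))
x[2, 5:9] <- NA

x <- x[rowSums(is.na(x[, 5:9])) != 5, ]   # row 2 is removed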

