How to remove row if it has a NA value in one certain column
The easiest solution is to use is.na():
df[!is.na(df$B), ]
which gives you:
   A B  C
1 NA 2 NA
2  1 2  3
4  1 2  3
Omit rows containing specific column of NA
You could wrap complete.cases in a helper function like this:
DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z=c(NA, 33, 22))
completeFun <- function(data, desiredCols) {
  completeVec <- complete.cases(data[, desiredCols])
  return(data[completeVec, ])
}
completeFun(DF, "y")
#   x  y  z
# 1 1  0 NA
# 2 2 10 33
completeFun(DF, c("y", "z"))
#   x  y  z
# 2 2 10 33
EDIT: Only return rows with no NAs
If you want to eliminate all rows with at least one NA in any column, just use the complete.cases function directly:
DF[complete.cases(DF), ]
#   x  y  z
# 2 2 10 33
Or, if completeFun is already ingrained in your workflow ;)
completeFun(DF, names(DF))
How to drop rows of Pandas DataFrame whose value in a certain column is NaN
Don't drop, just take the rows where EPS is not NA:
df = df[df['EPS'].notna()]
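As a rough sketch of how this mask-based filter behaves, here is a small made-up frame (the STK_ID and cash columns and all values are assumptions for illustration, not the asker's data):

```python
import numpy as np
import pandas as pd

# Hypothetical data: EPS is missing in the first row only;
# cash is missing in the second row only.
df = pd.DataFrame({
    "STK_ID": [601166, 600036, 600016],
    "EPS": [np.nan, 1.2, 0.8],
    "cash": [5.0, np.nan, 8.0],
})

# Boolean Series: True where EPS holds a value, False where it is NaN.
mask = df["EPS"].notna()

# Keep only the rows where EPS is present. NaN values in other
# columns (here: cash) do not affect which rows are kept.
result = df[mask]
print(result)
```

Only the EPS column decides which rows survive, so the row with a missing cash value is still kept.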
Remove rows where all columns except one have NA values?
We may use if_all in filter: select the columns a to b in if_all and apply is.na (check for NA). The output will be TRUE for a row if both a and b are NA; negate (!) to convert TRUE to FALSE and FALSE to TRUE.
library(dplyr)
df %>%
  filter(!if_all(a:b, is.na))
Output:
  ID    a    b
1  1   ab <NA>
2  1 <NA>   ab
Or, instead of negating (!), we may use complete.cases with if_any:
df %>%
  filter(if_any(a:b, complete.cases))
  ID    a    b
1  1   ab <NA>
2  1 <NA>   ab
Regarding the issue in the OP's code: the condition checks whether there is at least one NA (> 0), which is true for all the rows. Instead, it should check whether all the values are NA, and then negate:
na_rows <- df %>%
  select(-"ID") %>%
  is.na() %>%
  {rowSums(.) == ncol(.)}
Data:
df <- structure(list(ID = c(1L, 1L, 1L), a = c("ab", NA, NA), b = c(NA,
"ab", NA)), class = "data.frame", row.names = c(NA, -3L))
how to remove rows that contain NaN in both 1st and 3rd columns?
dropna has an additional parameter, how:
how{‘any’, ‘all’}, default ‘any’
Determine if row or column is removed from DataFrame, when we have at least one NA or all NA.
‘any’ : If any NA values are present, drop that row or column.
‘all’ : If all values are NA, drop that row or column.
If you set it to 'all', it will only drop the rows in which every checked value is NaN. In your case df.dropna(subset=['b', 'd'], how="all") would work.
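To make the difference between the two settings concrete, here is a minimal sketch on an invented frame (the values are assumptions; only the column names b and d come from the answer above):

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 is NaN in both b and d,
# rows 1 and 2 are NaN in only one of them, row 3 is complete.
df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [np.nan, 5.0, np.nan, 7.0],
    "c": [1, 1, 1, 1],
    "d": [np.nan, np.nan, 6.0, 8.0],
})

# how="all": drop a row only if b AND d are both NaN.
both_missing_dropped = df.dropna(subset=["b", "d"], how="all")

# how="any" (the default): drop a row if b OR d is NaN.
any_missing_dropped = df.dropna(subset=["b", "d"], how="any")

print(both_missing_dropped)
print(any_missing_dropped)
```

With how="all" only the first row is removed, whereas how="any" leaves just the last row.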
Delete rows if there are null values in a specific column in Pandas dataframe
If the relevant entries in Charge_Per_Line are empty (NaN) when you read the data into pandas, you can use df.dropna:
df = df.dropna(axis=0, subset=['Charge_Per_Line'])
If the values are genuinely the string '-', then you can replace them with np.nan and then use df.dropna:
import numpy as np
df['Charge_Per_Line'] = df['Charge_Per_Line'].replace('-', np.nan)
df = df.dropna(axis=0, subset=['Charge_Per_Line'])
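A self-contained sketch of the same two steps, assuming the dashes arrive as literal '-' strings (the Client column and all values are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data: the second charge is a literal dash, not NaN.
df = pd.DataFrame({
    "Client": ["A", "B", "C"],
    "Charge_Per_Line": ["10.5", "-", "3.0"],
})

# Step 1: turn the placeholder string into a real missing value.
df["Charge_Per_Line"] = df["Charge_Per_Line"].replace("-", np.nan)

# Step 2: drop the rows where that column is now missing.
df = df.dropna(axis=0, subset=["Charge_Per_Line"])
print(df)
```

Note that replace matches whole cell values here, so a cell such as "10-5" would be left untouched.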
Remove entire Row if Column == NA in R
Here's a way to filter out rows with an NA in the name field:
library(dplyr)
df %>% filter(!is.na(name))
#>          name GA SV
#> 1 CAREY.PRICE  3  2
#> 2  JOHN.SMITH  2 NA
Remove rows which have all NAs in certain columns
This is a one-liner to remove the rows with NA in all of columns 5 through 9. Combining rowSums() with is.na() makes it easy to check whether all entries in these 5 columns are NA:
x <- x[rowSums(is.na(x[, 5:9])) != 5, ]