Count NAs Per Row in Dataframe

Count NAs per row in dataframe

You could add a new column to your data frame containing the number of NA values per batch_id:

df$na_count <- apply(df, 1, function(x) sum(is.na(x)))

How to count number of NAs per row, with conditions

Both can be done with rowSums; in the second case, subset df to the desired columns first.

rowSums(is.na(df))
# [1] 3 1 3 2 2 0 2 0 0 1 2 2 2 1 1 2 0 1 2 1 0

rowSums(is.na(df[2:5]))
# [1] 2 1 1 1 2 0 1 0 0 1 1 1 1 1 0 2 0 0 1 0 0

How to simply count number of rows with NAs - R

tl;dr: row wise, you'll want sum(!complete.cases(DF)), or, equivalently, sum(apply(DF, 1, anyNA))

There are a number of different ways to look at the number, proportion or position of NA values in a data frame.

Most of these start with the logical matrix that is TRUE for every NA and FALSE everywhere else. For the built-in dataset airquality:

is.na(airquality)

There are 44 NA values in this data set:

sum(is.na(airquality))
# [1] 44

You can look at the total number of NA values per row or column:

head(rowSums(is.na(airquality)))
# [1] 0 0 0 0 2 1
colSums(is.na(airquality))
#   Ozone Solar.R    Wind    Temp   Month     Day 
#      37       7       0       0       0       0 

You can also apply anyNA() per row or per column to flag which rows or columns contain at least one NA:

# by row
head(apply(airquality, 1, anyNA))
# [1] FALSE FALSE FALSE FALSE  TRUE  TRUE
sum(apply(airquality, 1, anyNA))
# [1] 42


# by column
head(apply(airquality, 2, anyNA))
#   Ozone Solar.R    Wind    Temp   Month     Day 
#    TRUE    TRUE   FALSE   FALSE   FALSE   FALSE
sum(apply(airquality, 2, anyNA))
# [1] 2

complete.cases() can be used, but only row-wise:

sum(!complete.cases(airquality))
# [1] 42

Count number of NA's in a Row in Specified Columns R

df$na_count <- rowSums(is.na(df[c('first', 'last', 'address', 'phone', 'state')])) 

df
   first m_initial     last         address    phone state customer na_count
1    Bob         L   Turner 123 Turner Lane 410-3141  Iowa     <NA>        0
2   Will         P Williams 456 Williams Rd 491-2359  <NA>        Y        1
3 Amanda         C    Jones    789 Haggerty     <NA>  <NA>        Y        2
4   Lisa      <NA>    Evans            <NA>     <NA>  <NA>        N        3

Python/Pandas: counting the number of missing/NaN in each row

You can first flag each element as NaN or not with isnull(), then take the row-wise sum with sum(axis=1):

In [195]: df.isnull().sum(axis=1)
Out[195]:
0    0
1    0
2    0
3    3
4    0
5    0
dtype: int64

And if you want the output as a list, call tolist():

In [196]: df.isnull().sum(axis=1).tolist()
Out[196]: [0, 0, 0, 3, 0, 0]

Or use count(), which returns the number of non-missing values per row, and subtract it from the number of columns:

In [130]: df.shape[1] - df.count(axis=1)
Out[130]:
0    0
1    0
2    0
3    3
4    0
5    0
dtype: int64
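
For a self-contained check of the approaches above, here is a minimal sketch. The DataFrame and its column names (a, b, c) are made up for illustration and are not the asker's data; it assumes pandas and NumPy are installed. The last line also shows how the same idea could be restricted to selected columns by subsetting first.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 3 is missing every value, the other rows are complete
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, np.nan, 5.0, 6.0],
    "b": [1.0, 2.0, 3.0, np.nan, 5.0, 6.0],
    "c": ["x", "y", "z", None, "v", "w"],
})

# Approach 1: flag missing cells, then sum the flags across each row
na_by_isnull = df.isnull().sum(axis=1)

# Approach 2: number of columns minus the non-missing count per row
na_by_count = df.shape[1] - df.count(axis=1)

# Restrict the count to selected columns by subsetting first
na_in_ab = df[["a", "b"]].isnull().sum(axis=1)

print(na_by_isnull.tolist())  # [0, 0, 0, 3, 0, 0]
print(na_by_count.tolist())   # [0, 0, 0, 3, 0, 0]
print(na_in_ab.tolist())      # [0, 0, 0, 2, 0, 0]
```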

Count NaN per row with Pandas

IIUC, this should fulfill your needs.

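import numpy as np
nasum = df['First_Name'].isnull().sum()  # number of missing first names
# rows with a missing First_Name come back as NaN from the grouped count; replace that with nasum
df['countNames'] = df.groupby('First_Name')['First_Name'].transform('count').replace(np.nan, nasum)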

Or, as suggested by ALollz, the code below gives the same result:

df['countNames'] = df.groupby('First_Name')['First_Name'].transform('count').fillna(nasum)

Input

  First_Name Favorite_Color
0      Jared           Blue
1       Lily           Blue
2      Sarah           Pink
3       Bill            Red
4       Bill         Yellow
5     Alfred         Orange
6       None            Red
7       None           Pink

Output

  First_Name Favorite_Color  countNames
0      Jared           Blue         1.0
1       Lily           Blue         1.0
2      Sarah           Pink         1.0
3       Bill            Red         2.0
4       Bill         Yellow         2.0
5     Alfred         Orange         1.0
6       None            Red         2.0
7       None           Pink         2.0
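
To try this end to end, the sketch below rebuilds the input table and computes the same column. Note one swap: instead of fillna or replace it uses Series.where keyed on whether First_Name is present, which is an assumption of this sketch rather than the answer's method. It is meant to give the same counts without depending on how the grouped count represents rows with a missing name.

```python
import pandas as pd

# Rebuild the input table; None marks a missing first name
df = pd.DataFrame({
    "First_Name": ["Jared", "Lily", "Sarah", "Bill", "Bill", "Alfred", None, None],
    "Favorite_Color": ["Blue", "Blue", "Pink", "Red", "Yellow", "Orange", "Red", "Pink"],
})

# Number of rows whose first name is missing
nasum = int(df["First_Name"].isnull().sum())

# Occurrences of each name, aligned back to the original rows
counts = df.groupby("First_Name")["First_Name"].transform("count")

# Keep the grouped count where a name exists, otherwise use the missing-name total
df["countNames"] = counts.where(df["First_Name"].notna(), nasum)

print(df)
```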

Count missing values with rowwise and add number of missing values

You don't need rowwise. Just comment out that line and your code works.

This works:

df %>% 
  select(var1, var2) %>% 
  mutate(na = rowSums(is.na(.)))

Count NA in given columns by rows

Another option

NA.counts <- sapply(split(seq(ncol(test)), ceiling(seq(ncol(test))/2))
                    , function(x) rowSums(is.na(test[, x])))

If you want to use the tidyverse to add the columns, you can do:

library(tidyverse)
test %>% 
  cbind(NA.counts = map(seq(ncol(test)) %>% split(ceiling(./2))
                        , ~rowSums(is.na(test[, .]))))


#   BIEZ_01 BIEZ_02 BIEZ_03 BIEZ_04 BIEZ_05 BIEZ_06 NA.counts.1 NA.counts.2 NA.counts.3
# 1   59000    5060      NA   22100      NA    4400           0           1           1
# 2   61462   55401   60783   59885   59209    6109           0           0           0
# 3      NA   33000   20000   15000   15000      NA           1           0           1
# 4   33000   33000   20000   15000   15000     500           0           0           0
# 5   30840   30840      NA   20840   20840   10840           0           1           0
# 6   36612   28884   19248   10000      NA   10000           0           0           1

As @Moody_Mudskipper points out, cbind isn't necessary if you want to modify the dataframe. You can add the columns with

test[paste0("SUM",seq(ncol(test)/2))] <- map(seq(ncol(test)) %>% split(ceiling(./2)), 
                                             ~rowSums(is.na(test[.])))

R count number of NA values for each row of a CSV

Try this:

result <- data.frame("rowname"=rownames(df), "missing"=rowSums(is.na(df)))
result
