Count How Many Values in Some Cells of a Row Are Not NA (in R)

You can apply is.na() to the selected columns, negate the result, and then sum each row with rowSums():

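library(stringr)  # provides the fruit and words example vectors
library(tibble)   # provides tibble(), which replaces the deprecated data_frame()

df <- tibble(
  id    = 1:10,
  name  = fruit[1:10],
  word1 = c(words[1:5], NA, words[7:10]),
  word2 = words[11:20],
  word3 = c(NA, NA, NA, words[25], NA, NA, words[32], NA, NA, words[65])
)

# Count the non-NA values in columns 3 to 5 (word1, word2, word3) for each row
df$n_words <- rowSums(!is.na(df[, 3:5]))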

df
      id name         word1    word2     word3   n_words
   <int> <chr>        <chr>    <chr>     <chr>     <dbl>
 1     1 apple        a        actual    <NA>          2
 2     2 apricot      able     add       <NA>          2
 3     3 avocado      about    address   <NA>          2
 4     4 banana       absolute admit     agree         3
 5     5 bell pepper  accept   advertise <NA>          2
 6     6 bilberry     <NA>     affect    <NA>          1
 7     7 blackberry   achieve  afford    alright       3
 8     8 blackcurrant across   after     <NA>          2
 9     9 blood orange act      afternoon <NA>          2
10    10 blueberry    active   again     awful         3
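
To see why this works, it can help to look at the intermediate steps on a tiny example. The sketch below uses a made-up data frame (toy, with hypothetical columns w1 to w3), not the data above:

```r
# Hypothetical toy data, only for illustration
toy <- data.frame(
  id = 1:3,
  w1 = c("a", NA, "c"),
  w2 = c("d", "e", NA),
  w3 = c(NA, NA, NA),
  stringsAsFactors = FALSE
)

# Step 1: is.na() on a data frame returns a logical matrix (TRUE where NA)
missing <- is.na(toy[, c("w1", "w2", "w3")])

# Step 2: negate it, so TRUE now marks the values that are present
present <- !missing

# Step 3: rowSums() treats TRUE as 1 and FALSE as 0, giving a per-row count
toy$n_present <- rowSums(present)

toy$n_present
```

Selecting the columns by name, as done here, is one way to keep the count stable if the column order changes later.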

With dplyr, the same count can be written as a pipeline:

library(dplyr)

df %>%
  select(3:5) %>%  # keep only the word columns
  is.na %>%
  `!` %>%
  rowSums

Count the number of non-NA numeric values of each row in dplyr

Combine select(), is.na() and rowSums(). Inside mutate(), select(., -id) returns the original data frame (.) without the id column, and rowSums(!is.na(...)) then counts the non-NA values in each row:

df %>% mutate(var4 = rowSums(!is.na(select(., -id))))

#    id var1 var2 var3 var4
# 1   1   10   NA    4    2
# 2   2   11    1   NA    2
# 3   3   12    2    5    3
# 4   4   13    2   NA    2
# 5   5   14    1   NA    2
# 6   6   15    1   NA    2
# 7   7   16    1    5    3
# 8   8   17   NA    4    2
# 9   9   18   NA    4    2
# 10 10   19   NA   NA    1
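
For readers who do not have dplyr loaded, a base R sketch of the same idea follows. The input data is not shown in the answer, so the data frame here is reconstructed from the printed output and should be read as an assumption:

```r
# Input reconstructed from the printed output above (an assumption)
df <- data.frame(
  id   = 1:10,
  var1 = 10:19,
  var2 = c(NA, 1, 2, 2, 1, 1, 1, NA, NA, NA),
  var3 = c(4, NA, 5, NA, NA, NA, 5, 4, 4, NA)
)

# Every column except id, selected by name rather than by position
value_cols <- setdiff(names(df), "id")

# Count the non-NA values per row across those columns
df$var4 <- rowSums(!is.na(df[value_cols]))

df$var4
```

Computing value_cols before adding var4 keeps the new column out of its own count.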

How to 'count' number of non-empty values in a single row across multiple columns in a dataframe

In R, a missing value is written as NA in capital letters. A lowercase na is read as an ordinary string, which is not treated as missing.

I have also added a Name column to your df so that each row represents one Name, and an artificial Comp5 column that contains some NAs but is left out of the calculation.

rowSums(), as its name suggests, sums the values in each row. Applied to the logical matrix returned by is.na(), it counts the TRUE values, that is, the NAs in each row.

is.na(df[, 2:4]) restricts the check to columns 2 to 4 of df (Comp1 to Comp3). Widening the range to 2:5 would also cover Comp4, which has no NAs here, so the counts would be the same.

df <- read.table(header = TRUE,
text =
"Name Comp1 Comp2 Comp3 Comp4 Comp5
A 0.5 0.4 NA 0.6 NA
B 0.6 NA NA 0.7 1
C NA 0.4 NA 1.1 NA")

df$Count_NA <- rowSums(is.na(df[, 2:4]))

Output

  Name Comp1 Comp2 Comp3 Comp4 Comp5 Count_NA
1    A   0.5   0.4    NA   0.6    NA        1
2    B   0.6    NA    NA   0.7     1        2
3    C    NA   0.4    NA   1.1    NA        2
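
The point about NA versus the string na can be checked directly. The following is a minimal sketch with a hypothetical two-column input (raw, d_default and d_fixed are names made up for this illustration). It relies on the na.strings argument of read.table(), whose default is "NA":

```r
# Hypothetical input in which one missing value was typed as lowercase "na"
raw <- "Name Comp1
A 0.5
B na
C NA"

# Default: only the string "NA" is read as missing, so "na" survives as text
# and the column is no longer numeric
d_default <- read.table(text = raw, header = TRUE)
is.na(d_default$Comp1)

# Declaring "na" as a missing-value marker keeps the column numeric
d_fixed <- read.table(text = raw, header = TRUE, na.strings = c("NA", "na"))
is.na(d_fixed$Comp1)
```

With the default settings only the last row is flagged as missing; with na.strings extended, both the second and the third row are.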

Count number of non-NA values for every column in a dataframe

You can also call is.na() on the entire data frame, which returns a logical matrix, and then call colSums() on the negated result:

# make sample data
set.seed(47)
df <- as.data.frame(matrix(sample(c(0:1, NA), 100*5, TRUE), 100))

str(df)
#> 'data.frame': 100 obs. of 5 variables:
#> $ V1: int NA 1 NA NA 1 NA 1 1 1 NA ...
#> $ V2: int NA NA NA 1 NA 1 0 1 0 NA ...
#> $ V3: int 1 1 0 1 1 NA NA 1 NA NA ...
#> $ V4: int NA 0 NA 0 0 NA 1 1 NA NA ...
#> $ V5: int NA NA NA 0 0 0 0 0 NA NA ...

colSums(!is.na(df))
#> V1 V2 V3 V4 V5
#> 69 55 62 60 70
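
Because the sample above is random, a small deterministic example may be easier to check by hand. This sketch uses a hypothetical data frame (scores) and also shows colMeans(), which gives the share of non-NA values instead of the count:

```r
# Small hypothetical data frame with a known pattern of NAs
scores <- data.frame(
  x = c(1, NA, 3, NA),
  y = c(NA, NA, NA, 4),
  z = c(1, 2, 3, 4)
)

# Number of non-NA values in each column
n_present <- colSums(!is.na(scores))

# Share of non-NA values in each column (the mean of a logical is a proportion)
share_present <- colMeans(!is.na(scores))

n_present
share_present
```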

Count non-NA values by group

Group by col_1 and sum the non-NA flags for col_2 within each group:

mydf %>% group_by(col_1) %>% summarise(non_na_count = sum(!is.na(col_2)))

# A tibble: 2 x 2
  col_1  non_na_count
  <fctr>        <int>
1 A                 1
2 B                 2
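
The same grouped count can be sketched in base R with tapply(). The answer does not show mydf, so the data below is hypothetical and was chosen only to match the counts printed above:

```r
# Hypothetical data chosen to match the counts printed above
mydf <- data.frame(
  col_1 = factor(c("A", "A", "B", "B", "B")),
  col_2 = c(10, NA, 20, NA, 30)
)

# Split the TRUE/FALSE "is present" flags by group and sum within each group
non_na_count <- tapply(!is.na(mydf$col_2), mydf$col_1, sum)

non_na_count
```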

Count number of NA's in a Row in Specified Columns R


Select the columns of interest by name, then count the NAs in each row with rowSums(is.na(...)):

df$na_count <- rowSums(is.na(df[c('first', 'last', 'address', 'phone', 'state')]))

df
   first m_initial     last         address    phone state customer na_count
1    Bob         L   Turner 123 Turner Lane 410-3141  Iowa     <NA>        0
2   Will         P Williams 456 Williams Rd 491-2359  <NA>        Y        1
3 Amanda         C    Jones    789 Haggerty     <NA>  <NA>        Y        2
4   Lisa      <NA>    Evans            <NA>     <NA>  <NA>        N        3
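
A self-contained version may make this easier to try out. The data frame below is reconstructed from the printed output, so treat it as an assumption. The check_cols name and the stopifnot() guard are additions for illustration:

```r
# Data frame reconstructed from the printed output above (an assumption)
df <- data.frame(
  first     = c("Bob", "Will", "Amanda", "Lisa"),
  m_initial = c("L", "P", "C", NA),
  last      = c("Turner", "Williams", "Jones", "Evans"),
  address   = c("123 Turner Lane", "456 Williams Rd", "789 Haggerty", NA),
  phone     = c("410-3141", "491-2359", NA, NA),
  state     = c("Iowa", NA, NA, NA),
  customer  = c(NA, "Y", "Y", "N"),
  stringsAsFactors = FALSE
)

# Only these columns contribute to the count; NAs elsewhere are ignored
check_cols <- c("first", "last", "address", "phone", "state")

# Fail early if a listed column is missing from the data
stopifnot(all(check_cols %in% names(df)))

df$na_count <- rowSums(is.na(df[check_cols]))

df$na_count
```

Note that the NA in m_initial (row 4) and the NA in customer (row 1) do not affect the result, because those columns are not listed.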

Create new column based on counting non-NA values across multiple columns


df[-1] drops the first column (Q1), so only the remaining columns are counted:

df$column_non_NA <- rowSums(!is.na(df[-1]))
df
   Q1  Q1a  Q1b  Q1c column_non_NA
1 Yes  AAA  BBB <NA>             2
2  No <NA> <NA> <NA>             0
3 Yes  AAA <NA> <NA>             1
4  No <NA> <NA> <NA>             0
5 Yes  ABC  BCD  EFG             3
6 Yes  DDD <NA> <NA>             1
7 Yes  EEE  AAA  AAA             3
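
One thing to watch with df[-1] is that, once the count column exists, running the same line again would include that column in the count. A possible way around this, sketched below with data reconstructed from the output above (an assumption), is to name the columns explicitly. The answer_cols vector is a hypothetical helper:

```r
# Data frame reconstructed from the printed output above (an assumption)
df <- data.frame(
  Q1  = c("Yes", "No", "Yes", "No", "Yes", "Yes", "Yes"),
  Q1a = c("AAA", NA, "AAA", NA, "ABC", "DDD", "EEE"),
  Q1b = c("BBB", NA, NA, NA, "BCD", NA, "AAA"),
  Q1c = c(NA, NA, NA, NA, "EFG", NA, "AAA"),
  stringsAsFactors = FALSE
)

# Name the columns to count instead of relying on "everything but the first"
answer_cols <- c("Q1a", "Q1b", "Q1c")

df$column_non_NA <- rowSums(!is.na(df[answer_cols]))

# Running the same line a second time leaves the result unchanged
df$column_non_NA <- rowSums(!is.na(df[answer_cols]))

df$column_non_NA
```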

