Simple Method of Counting Non-NAs in Column of Data String

Simple method of counting non-NAs in column of data String

For a data.frame, you can get the per-column count of non-NA values using colSums and is.na:

set.seed(45)
df <- data.frame(matrix(sample(c(NA,1:5), 50, replace=TRUE), ncol=5))
#    X1 X2 X3 X4 X5
# 1   3  2 NA  2 NA
# 2   1  5  1  1  4
# 3   1  1  3  2  3
# 4   2  2  3  5  3
# 5   2  2  5  2  2
# 6   1  2 NA  3  3
# 7   1  5  5  5  2
# 8   3 NA  4  1  5
# 9   1  2  3 NA  1
# 10 NA  1  1  2  2

colSums(!is.na(df))
# X1 X2 X3 X4 X5 
#  9  9  8  9  9
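
The reason this works is that is.na(df) returns a logical matrix with the same shape as the data frame, ! flips it, and colSums treats TRUE as 1. A minimal sketch on a small made-up data frame (toy and its column names are illustrative, not part of the example above):

```r
# Hypothetical toy data; names are illustrative only
toy <- data.frame(a = c(1, NA, 3), b = c(NA, NA, "x"), c = 1:3)

# Logical matrix: TRUE where a value is present
present <- !is.na(toy)

# TRUE counts as 1, so the column sums are the non-NA counts
counts <- colSums(present)
print(counts)

# The same counts computed one column at a time
counts_sapply <- sapply(toy, function(col) sum(!is.na(col)))
```

Both forms give a named vector; colSums is usually the shorter way to write it.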

Count number of non-NA values for every column in a dataframe

You can also call is.na on the entire data frame (which returns a logical matrix), negate the result with !, and call colSums on it:

# make sample data
set.seed(47)
df <- as.data.frame(matrix(sample(c(0:1, NA), 100*5, TRUE), 100))

str(df)
#> 'data.frame':    100 obs. of  5 variables:
#>  $ V1: int  NA 1 NA NA 1 NA 1 1 1 NA ...
#>  $ V2: int  NA NA NA 1 NA 1 0 1 0 NA ...
#>  $ V3: int  1 1 0 1 1 NA NA 1 NA NA ...
#>  $ V4: int  NA 0 NA 0 0 NA 1 1 NA NA ...
#>  $ V5: int  NA NA NA 0 0 0 0 0 NA NA ...

colSums(!is.na(df))
#> V1 V2 V3 V4 V5 
#> 69 55 62 60 70
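
If the share of non-missing values is more useful than the raw count, the same logical matrix can be averaged instead of summed. A small sketch, assuming a made-up data frame (toy is hypothetical):

```r
# Hypothetical data with a few missing values
toy <- data.frame(V1 = c(0, NA, 1, NA), V2 = c(1, 1, NA, 0))

# Number of non-NA values per column
non_na_count <- colSums(!is.na(toy))

# Fraction of non-NA values per column (mean of a logical is the share of TRUE)
non_na_share <- colMeans(!is.na(toy))
print(non_na_share)
```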

Counting non NAs in a data frame; getting answer as a vector

Try this:

# define "demo" dataset
ZZZ <- data.frame(n=c(1,2,NA),m=c(6,NA,NA),o=c(7,8,8))
# apply the counting function per columns
apply(ZZZ, 2, function(x) length(which(!is.na(x))))

Running it gives:

> apply(ZZZ, 2, function(x) length(which(!is.na(x))))
n m o 
2 1 3

If you really insist on returning a vector, you might use as.vector, e.g. by defining this function:

nonNAs <- function(x) {
    as.vector(apply(x, 2, function(col) length(which(!is.na(col)))))
}

Then simply run nonNAs(ZZZ):

> nonNAs(ZZZ)
[1] 2 1 3
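
A possibly simpler route to the same unnamed vector is to drop the names from the colSums result, which also avoids apply converting the data frame to a matrix first. A sketch using the same demo data (named_counts and plain_counts are names chosen here for illustration):

```r
# Same demo dataset as above
ZZZ <- data.frame(n = c(1, 2, NA), m = c(6, NA, NA), o = c(7, 8, 8))

# Named counts of non-NA values per column
named_counts <- colSums(!is.na(ZZZ))

# Strip the names to get a plain vector
plain_counts <- unname(named_counts)
print(plain_counts)
```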

Count non-NA values by group

You can use group_by and summarise from dplyr:

library(dplyr)
mydf %>% group_by(col_1) %>% summarise(non_na_count = sum(!is.na(col_2)))

# A tibble: 2 x 2
   col_1 non_na_count
  <fctr>        <int>
1      A            1
2      B            2
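
The snippet above assumes a data frame mydf with a grouping column col_1 and a value column col_2, which is not shown. A minimal sketch with made-up data chosen to match the printed counts, using base R's tapply in place of dplyr (the values in col_2 are assumptions):

```r
# Hypothetical data: group A has one observed value, group B has two
mydf <- data.frame(
  col_1 = factor(c("A", "A", "B", "B", "B")),
  col_2 = c(NA, 10, 20, NA, 30)
)

# Sum the "is present" flags within each level of col_1
counts <- tapply(!is.na(mydf$col_2), mydf$col_1, sum)
print(counts)
```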

From an R dataframe: count non-NA values by column, grouped by one of the columns

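We can use summarise with across (dplyr 1.0.0 or later), which replaces the older summarise_all(funs(...)) idiom now that funs() is deprecated:

library(dplyr)
litmus %>% 
   group_by(grouping) %>% 
   summarise(across(everything(), ~ sum(!is.na(.x))))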
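
A base R route to the same per-group, per-column counts is aggregate. A sketch with made-up data (this litmus is hypothetical; only the column name grouping is taken from the snippet above):

```r
# Hypothetical stand-in for the litmus data frame
litmus <- data.frame(
  grouping = c("g1", "g1", "g2", "g2"),
  x = c(1, NA, NA, NA),
  y = c("a", "b", NA, "d")
)

# Count non-NA values in every non-grouping column, per group
counts <- aggregate(
  litmus[-1],
  by = litmus["grouping"],
  FUN = function(col) sum(!is.na(col))
)
print(counts)
```

The data frame method of aggregate is used here rather than the formula interface, since the formula interface drops rows containing NA by default.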

Efficient way to calculate non-na rows vs NA rows in a column

There is no need for a special function; in base R you can simply do:

colSums(is.na(df))/colSums(!is.na(df))
#  a   b   c 
#2.0 0.5 Inf

For a particular set of columns, subset the data frame first:

colSums(is.na(df[c("a", "b")]))/colSums(!is.na(df[c("a", "b")]))  # also works with a single column, e.g. df["a"]

Data:

 df = data.frame(a=c(NA,NA,4),b=c(NA,1,2),c=c(NA,NA,NA))
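
Because a column with no observed values gives Inf (as with c above), it can help to look at the two counts side by side before dividing. A minimal sketch on the same data (na_summary is a name chosen here for illustration):

```r
# Same data as above
df <- data.frame(a = c(NA, NA, 4), b = c(NA, 1, 2), c = c(NA, NA, NA))

# One row for NA counts, one row for non-NA counts
na_summary <- rbind(
  na     = colSums(is.na(df)),
  non_na = colSums(!is.na(df))
)
print(na_summary)
```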

Create new column based on counting non-NA values across multiple columns

# count non-NA values per row, skipping the first column (Q1)
df$column_non_NA <- rowSums(!is.na(df[-1]))
df
   Q1  Q1a  Q1b  Q1c column_non_NA
1 Yes  AAA  BBB <NA>             2
2  No <NA> <NA> <NA>             0
3 Yes  AAA <NA> <NA>             1
4  No <NA> <NA> <NA>             0
5 Yes  ABC  BCD  EFG             3
6 Yes  DDD <NA> <NA>             1
7 Yes  EEE  AAA  AAA             3
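
For a self-contained run, the input can be rebuilt from the printed output above. The sketch below does that and selects the columns by name rather than by position, which is an optional variation (cols is a name chosen here for illustration):

```r
# Data rebuilt from the printed table above
df <- data.frame(
  Q1  = c("Yes", "No", "Yes", "No", "Yes", "Yes", "Yes"),
  Q1a = c("AAA", NA, "AAA", NA, "ABC", "DDD", "EEE"),
  Q1b = c("BBB", NA, NA, NA, "BCD", NA, "AAA"),
  Q1c = c(NA, NA, NA, NA, "EFG", NA, "AAA")
)

# Columns to count across, named explicitly instead of using df[-1]
cols <- c("Q1a", "Q1b", "Q1c")

# rowSums over the logical matrix gives the non-NA count per row
df$column_non_NA <- rowSums(!is.na(df[cols]))
print(df)
```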

Count number of non-NA values by group

Or, if you want to use data.table:

library(data.table)

dt[,sum(!is.na(X2)),by=.(Color)]

   Color V1
1:   Red  2
2:  Blue  0
3: Green  1

It's also easy enough to use ifelse() in your data.table call to get an NA for Blue instead of 0. Note that the comparison with 0 must be applied to the result of sum(), not inside it:

dt[,ifelse(sum(!is.na(X2))==0,NA_integer_,sum(!is.na(X2))),by=.(Color)]

   Color V1
1:   Red  2
2:  Blue NA
3: Green  1

Data:

 dt <- fread("Color    X1      X2    X3    X4
Red      1       1     0     2
Blue     0       NA    4     1
Red      3       4     3     1
Green    2       2     1     0")
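
When testing whether a group has no non-NA values, the placement of == 0 relative to sum() matters, because in R the == operator binds more tightly than !. A small base R sketch on a made-up vector (x, na_count and no_values are illustrative names):

```r
# Made-up group: two observed values, one NA
x <- c(5, NA, 7)

# `==` binds tighter than `!`, so this parses as !(is.na(x) == 0),
# which is just is.na(x): the sum counts the NA values
na_count <- sum(!is.na(x) == 0)

# Explicit form: count the non-NA values first, then compare with 0
no_values <- sum(!is.na(x)) == 0

print(na_count)
print(no_values)
```

With the comparison inside sum(), any group containing at least one NA would be treated as empty, even when it also has observed values.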
