How many non-NA values in each row for a matrix?
Most simply:
rowSums(!is.na(x))
(thanks to @Khashaa for this code).
Note the use of !, which means "not": !is.na(x) is TRUE for every element of x that is not NA, and rowSums then counts those TRUE values in each row.
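To see what each step produces, here is a small sketch on a made-up matrix. The name m and its values are assumptions for illustration only and do not come from the question.

```r
# Hypothetical 2 x 3 matrix with some missing values
m <- matrix(c(1, NA, 3,
              NA, NA, 6), nrow = 2, byrow = TRUE)

is.na(m)                           # TRUE where a value is missing
not_na <- !is.na(m)                # TRUE where a value is present
non_na_per_row <- rowSums(not_na)  # TRUE counts as 1, FALSE as 0
non_na_per_row                     # row 1 has 2 non-NA values, row 2 has 1
```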
Alternatively, count the non-NA values in a single vector x with:
sum(is.na(x)==FALSE)
and use apply to run this over each row of the matrix (margin 1 means rows; use 2 for columns):
apply(d,1,function(x) sum(is.na(x)==FALSE))
where d is a matrix such as:
d=matrix(c(1,NA,NA,NA),ncol=2,nrow=2)
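As a quick check on the matrix d defined above, the following sketch compares the apply approach with rowSums. The variable names by_apply and by_rowsums are only illustrative.

```r
d <- matrix(c(1, NA, NA, NA), ncol = 2, nrow = 2)

# MARGIN = 1 applies the function to each row
by_apply <- apply(d, 1, function(x) sum(!is.na(x)))
by_rowsums <- rowSums(!is.na(d))

by_apply     # row 1 has one non-NA value, row 2 has none
by_rowsums   # same counts

# For per-column counts, use MARGIN = 2 or colSums instead
colSums(!is.na(d))
```

Both approaches give the same counts. rowSums is usually the shorter choice, while apply is handy when the per-row function is something other than a plain sum.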
Count number of non-NA values for every column in a dataframe
You can also call is.na on the entire data frame (implicitly coercing to a logical matrix) and call colSums on the inverted response:
# make sample data
set.seed(47)
df <- as.data.frame(matrix(sample(c(0:1, NA), 100*5, TRUE), 100))
str(df)
#> 'data.frame': 100 obs. of 5 variables:
#> $ V1: int NA 1 NA NA 1 NA 1 1 1 NA ...
#> $ V2: int NA NA NA 1 NA 1 0 1 0 NA ...
#> $ V3: int 1 1 0 1 1 NA NA 1 NA NA ...
#> $ V4: int NA 0 NA 0 0 NA 1 1 NA NA ...
#> $ V5: int NA NA NA 0 0 0 0 0 NA NA ...
colSums(!is.na(df))
#> V1 V2 V3 V4 V5
#> 69 55 62 60 70
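The same idea on a tiny data frame makes the counts easy to verify by eye. This is a sketch only; the data frame small and its columns are invented for illustration.

```r
# Hypothetical data frame with mixed column types
small <- data.frame(
  num  = c(1, NA, 3),
  chr  = c(NA, NA, "z"),
  flag = c(TRUE, FALSE, TRUE)
)

non_na <- colSums(!is.na(small))  # non-missing values per column
n_na   <- colSums(is.na(small))   # missing values per column

non_na
# The two counts always add up to the number of rows
non_na + n_na == nrow(small)
```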
Select set of columns so that each row has at least one non-NA entry
Using a while loop, this greedy approach should give a small set of variables (often, though not necessarily, the minimum) with at least one non-NA value per row.
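best <- function(df){
  # start with the column that has the most non-NA values
  best <- which.max(colSums(!is.na(df)))
  # rows that are still NA in every selected column
  uncovered <- rowSums(!is.na(df[best])) == 0
  while(any(uncovered)){
    # add the column with the most non-NA values among the uncovered rows
    best <- c(best, which.max(colSums(!is.na(df[uncovered, , drop = FALSE]))))
    uncovered <- rowSums(!is.na(df[best])) == 0
  }
  best
}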
Testing:
best(df)
#d c
#4 3
df[best(df)]
# d c
#1 1 1
#2 1 NA
#3 1 NA
#4 1 NA
#5 NA 1
First, select the column with the fewest NAs (stored in best). Then, repeatedly add the column that has the most non-NA values in the remaining rows (those where the columns in best are still all NA), until every row has at least one non-NA value.
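The steps described above can be sketched as a self-contained variant that also stops early when a row is NA in every column, since no selection can cover such a row. The helper name select_cover and the data frame toy are hypothetical. Columns c and d of toy follow the output shown above, while a and b are invented.

```r
# Greedy selection of columns so that every row has at least one non-NA value.
# select_cover is a hypothetical helper name, not part of any package.
select_cover <- function(df) {
  present <- !is.na(df)  # logical matrix: TRUE where a value is present
  if (any(rowSums(present) == 0)) {
    stop("some rows are NA in every column, so no set of columns can cover them")
  }
  chosen <- integer(0)
  uncovered <- rep(TRUE, nrow(df))
  while (any(uncovered)) {
    # for each column, count how many still-uncovered rows it would cover
    gain <- colSums(present[uncovered, , drop = FALSE])
    pick <- which.max(gain)
    chosen <- c(chosen, pick)
    uncovered <- uncovered & !present[, pick]
  }
  chosen
}

# Assumed example data (columns a and b are made up)
toy <- data.frame(
  a = c(NA, NA, NA, 1, NA),
  b = c(NA, 1, NA, NA, NA),
  c = c(1, NA, NA, NA, 1),
  d = c(1, 1, 1, 1, NA)
)

res <- select_cover(toy)
res        # column d (position 4) first, then column c (position 3)
toy[res]   # every row now has at least one non-NA value
```

A greedy choice like this usually gives a small set, though it is not guaranteed to be the smallest possible one.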
How to identify which columns are not “NA” per row in a dataframe?
Loop through the rows, get the names of the non-NA columns, then paste them together:
d$myCol <- apply(d, 1, function(i) paste(colnames(d)[ !is.na(i) ], collapse = ","))
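A minimal run of that line, assuming a small all-numeric data frame as a stand-in for d (the values are invented):

```r
# Hypothetical stand-in for d
d <- data.frame(
  x = c(1, NA, 3),
  y = c(NA, NA, 5),
  z = c(7, NA, NA)
)

# For each row, keep the names of the columns that are not NA
d$myCol <- apply(d, 1, function(i) paste(colnames(d)[!is.na(i)], collapse = ","))
d$myCol
```

Rows 1 and 3 give "x,z" and "x,y". A row that is NA in every column, such as row 2 here, yields an empty string.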
Sum the last n non NA values in each column of a matrix in R
You can use apply with tail to sum up the last n non-NA values in each column (here n = 3), like:
apply(x, 2, function(x) sum(tail(x[!is.na(x)], 3)))
#x1 x2 x3 x4 x5
#15 11 9 6 3
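Since the matrix x is not shown here, the following sketch uses an invented matrix to illustrate how the call behaves. The values and the variable n are assumptions.

```r
# Hypothetical matrix; each column has a different number of non-NA values
x <- cbind(
  x1 = c(1, 2, NA, 4, 5),
  x2 = c(NA, 1, 2, 3, NA),
  x3 = c(NA, NA, 7, NA, NA)
)
n <- 3

# Drop the NAs in each column, keep the last n remaining values, and sum them
last_n_sum <- apply(x, 2, function(col) sum(tail(col[!is.na(col)], n)))
last_n_sum
```

Here x1 sums 2 + 4 + 5, x2 sums 1 + 2 + 3, and x3 has only one non-NA value, so tail returns just that value. A column with no non-NA values at all would give 0, because the sum of an empty vector is 0.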
R: function or similar to sum up number of non-NA values for columns that contain specific characters in large data set
Use rowSums:
library(dplyr)
p %>% mutate(n_fu = rowSums(!is.na(select(., contains('fu_location')))))
Or in base R:
p$n_fu <- rowSums(!is.na(p[grep('fu_location', names(p))]))
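To illustrate the base R version, here is a sketch on a made-up data frame. The contents of p and the helper variable fu_cols are assumptions; only the column-name pattern comes from the answer.

```r
# Hypothetical data with two follow-up location columns
p <- data.frame(
  id = 1:3,
  fu_location_1 = c("A", NA, NA),
  fu_location_2 = c("B", "C", NA),
  other = c(NA, 1, 2)
)

# Positions of the columns whose names contain the pattern
fu_cols <- grep("fu_location", names(p))

# Count non-NA values across just those columns, row by row
p$n_fu <- rowSums(!is.na(p[fu_cols]))
p$n_fu
```

The NA in the unrelated column other is ignored, because only the matched columns are passed to is.na.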
From an R dataframe: count non-NA values by column, grouped by one of the columns
We can use summarise_all:
library(dplyr)
litmus %>%
group_by(grouping) %>%
summarise_all(~ sum(!is.na(.)))
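For comparison, roughly the same grouped count can be sketched in base R with aggregate. This is an alternative rather than the approach from the answer above, and the contents of litmus here are invented for illustration.

```r
# Hypothetical data: one grouping column and two value columns
litmus <- data.frame(
  grouping = c("a", "a", "b", "b"),
  v1 = c(1, NA, 3, 4),
  v2 = c(NA, NA, 5, 6)
)

value_cols <- setdiff(names(litmus), "grouping")

# Count the non-NA values of every value column within each group
counts <- aggregate(
  litmus[value_cols],
  by = litmus["grouping"],
  FUN = function(v) sum(!is.na(v))
)
counts
```

The data frame method of aggregate is used here rather than the formula method, because the formula method drops rows containing NA by default and would distort the counts.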