Add a column with count of NAs and Mean
library(dplyr)
count_na <- function(x) sum(is.na(x))
df1 %>%
  mutate(means = rowMeans(., na.rm = TRUE),
         count_na = apply(., 1, count_na))
# To restrict the calculation to selected columns:
selected_cols <- c('b', 'c')
df1 %>%
  mutate(means = rowMeans(.[selected_cols], na.rm = TRUE),
         count_na = apply(.[selected_cols], 1, count_na))
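For a quick check of the approach above, here is a minimal sketch on toy data. The data frame toy and its columns a, b, c are made up for illustration, and rowSums(is.na(...)) is swapped in for the apply() call as a vectorized equivalent.

```r
library(dplyr)

# Hypothetical toy data; the names toy, a, b, c are assumptions for illustration
toy <- data.frame(
  a = c(1, NA, 3),
  b = c(NA, NA, 6),
  c = c(7, 8, 9)
)

cols <- c("b", "c")

result <- toy %>%
  mutate(
    means    = rowMeans(.[cols], na.rm = TRUE),  # mean of b and c, ignoring NAs
    count_na = rowSums(is.na(.[cols]))           # number of NAs in b and c
  )

# means is 7, 8, 7.5 and count_na is 1, 1, 0
print(result)
```

Column a is ignored in both new columns because only b and c are selected.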
Count NAs per row in dataframe
You could add a new column to your data frame containing the number of NA values per batch_id:
df$na_count <- apply(df, 1, function(x) sum(is.na(x)))
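A possible alternative, shown here on made-up data, is rowSums(is.na(df)). It gives the same counts without going through apply(), which first converts the data frame to a matrix. The columns x and y below are assumptions for illustration.

```r
# Hypothetical example data; the columns x and y are assumptions
df <- data.frame(
  batch_id = c("A", "B", "C"),
  x = c(1, NA, NA),
  y = c(NA, NA, 3)
)

# is.na() on a data frame returns a logical matrix;
# rowSums() then counts the TRUE values in each row
df$na_count <- rowSums(is.na(df))

# na_count is 1, 2, 1
print(df)
```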
Count the number of NAs in multiple columns after grouping a dataframe in R
I propose two ways:
using dplyr:
library(dplyr)
df %>%
  group_by(Region, ID) %>%
  summarise(across(everything(), list(na_count = ~sum(is.na(.)))))
or data.table:
library(data.table)
setDT(df)[, lapply(.SD, function(x) sum(is.na(x))), by = .(Region, ID)]
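To see what the grouped count looks like, here is a small sketch on invented data. Region and ID come from the question; the value columns v1 and v2 are assumptions. It uses summarise(across(...)), which needs dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical data; v1 and v2 are assumed value columns
df <- data.frame(
  Region = c("N", "N", "S", "S"),
  ID     = c(1, 1, 2, 2),
  v1     = c(NA, 2, NA, NA),
  v2     = c(1, NA, 3, 4)
)

# Count NAs in every non-grouping column, one row per Region/ID pair
na_by_group <- df %>%
  group_by(Region, ID) %>%
  summarise(across(everything(), ~ sum(is.na(.x))), .groups = "drop")

# N/1 has one NA in each column; S/2 has two NAs in v1 and none in v2
print(na_by_group)
```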
R: how to total the number of NA in each col of data.frame
You could try:
colSums(is.na(df))
# V1 V2 V3 V4 V5
# 2 4 2 4 4
Data:
set.seed(42)
df <- as.data.frame(matrix(sample(c(NA, 0:4), 5 * 20, replace = TRUE), ncol = 5))
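Because the sampled data above depends on the random number generator, a tiny hand-written data frame makes the result easier to verify by eye. The values below are invented for illustration.

```r
# Hypothetical small data frame so the counts can be checked by hand
df <- data.frame(
  V1 = c(NA, 1, 2),
  V2 = c(NA, NA, 3),
  V3 = c(4, 5, 6)
)

# is.na(df) is a logical matrix; colSums() counts the TRUE values per column
na_per_col <- colSums(is.na(df))

# V1 has 1 NA, V2 has 2, V3 has 0
print(na_per_col)
```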
How to fill mean for NAs in column by groups in r?
You can use:
library(dplyr)
df %>%
  group_by(Category) %>%
  mutate(across(starts_with('column'),
                ~replace(., is.na(.), mean(., na.rm = TRUE)))) %>%
  ungroup()
#      PID Category column1 column2 column3
#    <int>    <int>   <dbl>   <dbl>   <dbl>
#  1   123        1      54     2.4    23.4
#  2   324        1      52       3    21.1
#  3   356        1      53     3.6    25.6
#  4   378        2      56     3.2    27.1
#  5   395        2    50.5     3.5    29.9
#  6   362        2      45    3.35    24.3
#  7   789        3      65    12.6    23.8
#  8   759        3      66    10.6    26.8
#  9   762        3    66.7    10.6    27.2
# 10   741        3      69     8.5    23.3
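The same pattern can be tried on a reduced example. This is only a sketch with invented values; Category and the column prefix follow the answer above.

```r
library(dplyr)

# Hypothetical input with one NA in each category
df <- data.frame(
  Category = c(1, 1, 1, 2, 2),
  column1  = c(10, NA, 20, 5, NA)
)

# Replace each NA with the mean of the non-missing values in its own category
filled <- df %>%
  group_by(Category) %>%
  mutate(across(starts_with("column"),
                ~ replace(.x, is.na(.x), mean(.x, na.rm = TRUE)))) %>%
  ungroup()

# column1 becomes 10, 15, 20, 5, 5
print(filled)
```

Note that a category whose values are all NA would get NaN, since mean(..., na.rm = TRUE) of an empty vector is NaN.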
row wise NA count across some columns - grouped by id
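library(tidyverse)
threshold <- 10
# na_count is assumed to already hold the per-row NA count across q1:q5
df %>%
  group_by(id) %>%
  mutate(evidence = ifelse(n() * 5 - sum(na_count) >= threshold, "yes", "no"))
The 5 is the number of question columns you have (q1:q5), so n() * 5 - sum(na_count) is the number of non-NA answers for each id.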
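Here is an end-to-end sketch on made-up data that also builds the na_count column. The data values and the lower threshold of 6 are assumptions chosen so that both outcomes appear; only dplyr is loaded, which is enough for this example.

```r
library(dplyr)

# Hypothetical survey data: two rows per id, five question columns
df <- data.frame(
  id = c(1, 1, 2, 2),
  q1 = c(1, NA, NA, NA),
  q2 = c(2, 2, NA, NA),
  q3 = c(3, 3, NA, 1),
  q4 = c(4, NA, NA, NA),
  q5 = c(5, 5, 2, NA)
)

q_cols <- paste0("q", 1:5)
threshold <- 6  # assumed value for this toy example

flagged <- df %>%
  mutate(na_count = rowSums(is.na(.[q_cols]))) %>%  # NAs per row across q1:q5
  group_by(id) %>%
  mutate(evidence = ifelse(
    n() * length(q_cols) - sum(na_count) >= threshold,  # non-NA answers per id
    "yes", "no"
  )) %>%
  ungroup()

# id 1 has 8 non-NA answers ("yes"); id 2 has 2 ("no")
print(flagged)
```

Using length(q_cols) instead of a literal 5 keeps the count correct if the set of question columns changes.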