Dplyr Summarise_Each with Na.Rm

dplyr summarise_each with na.rm

Following the links in the doc, it seems you can use funs(mean(., na.rm = TRUE)):

library(dplyr)
by_species <- iris %>% group_by(Species)
by_species %>% summarise_each(funs(mean(., na.rm = TRUE)))
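
In current dplyr releases both summarise_each() and funs() are deprecated. A minimal sketch of the same computation with across(), assuming dplyr 1.0.0 or later (iris contains no missing values, so na.rm = TRUE has no visible effect on this particular data set):

```r
library(dplyr)

# Same grouped means, written with across() instead of summarise_each()/funs().
by_species <- iris %>% group_by(Species)

res <- by_species %>%
  summarise(across(everything(), ~ mean(.x, na.rm = TRUE)))

res
```

Inside a grouped data frame, everything() selects only the non-grouping columns, so Species is left alone.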

Saving na.rm=TRUE for each function in dplyr

You should use summarise_at, which lets you compute multiple functions for the supplied columns and set arguments that are shared among them:

df %>% group_by(group) %>% 
  summarise_at("value", 
               funs(mean = mean, sd = sd, min = min), 
               na.rm = TRUE)
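
Since funs() has been deprecated since dplyr 0.8.0, the same idea might be written with across() and a named list of lambdas. This is only a sketch: the toy data frame below is hypothetical, and na.rm = TRUE is spelled out in each lambda because passing extra arguments through across() is itself deprecated as of dplyr 1.1.0.

```r
library(dplyr)

# Hypothetical toy data; the column names mirror the snippet above.
df <- data.frame(group = c("a", "a", "b", "b"),
                 value = c(1, NA, 3, 5))

res <- df %>%
  group_by(group) %>%
  summarise(across(value,
                   list(mean = ~ mean(.x, na.rm = TRUE),
                        sd   = ~ sd(.x, na.rm = TRUE),
                        min  = ~ min(.x, na.rm = TRUE))))

res  # columns: group, value_mean, value_sd, value_min
```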

How to Use na.rm=TRUE with n() While Using dplyr's group_by and summarise_at

I think your code was very close to getting the job done. I made some slight changes and have included an example of how you might include the percent calculation in the same step (although I am not sure of your expected output).

library(dplyr)
Df %>% 
  group_by(Group) %>% 
  summarise_all(funs(count = sum(!is.na(.)), 
                     sum = sum(.,na.rm=TRUE),
                     pct = sum(.,na.rm=TRUE)/sum(!is.na(.))))

#> # A tibble: 2 x 10
#>    Group Var1_count Var2_count Var3_count Var1_sum Var2_sum Var3_sum
#>   <fctr>      <int>      <int>      <int>    <dbl>    <dbl>    <dbl>
#> 1  Condo          2          2          2        1        2        1
#> 2  House          5          6          4        4        5        4
#> # ... with 3 more variables: Var1_pct <dbl>, Var2_pct <dbl>,
#> #   Var3_pct <dbl>

I also used summarise_all instead of summarise_at, because summarise_all applies the functions to every column that is not a grouping variable.
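
For newer dplyr versions, where funs() is deprecated, the same three summaries could be sketched with across(). The data frame here is hypothetical 0/1 data invented for illustration, and pct is written as mean(.x, na.rm = TRUE), which equals the sum of the non-missing values divided by their count.

```r
library(dplyr)

# Hypothetical 0/1 data with missing values; names follow the example above.
Df <- data.frame(Group = c("Condo", "Condo", "House", "House", "House"),
                 Var1  = c(1, 0, 1, NA, 1),
                 Var2  = c(NA, 1, 0, 1, 1))

res <- Df %>%
  group_by(Group) %>%
  summarise(across(everything(),
                   list(count = ~ sum(!is.na(.x)),        # non-missing values
                        sum   = ~ sum(.x, na.rm = TRUE),
                        pct   = ~ mean(.x, na.rm = TRUE))))

res
```

Unlike summarise_all, across() orders the output by variable (Var1_count, Var1_sum, Var1_pct, then Var2_...), not by function.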

Problem using na.rm=TRUE in summarize in R code

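If we want the statistical mode (the most frequent value), define a helper named Mode; base R's mode() returns the storage mode of an object, not its most frequent value: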

Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

With that helper defined, the summary should work:

Test %>%
  group_by(Week = tools::toTitleCase(Week)) %>%
  summarize(Mode = Mode(time), .groups = 'drop')
# A tibble: 2 × 2
#   Week       Mode
#   <chr>     <dbl>
# 1 Thursday      0
# 2 Wednesday     5

If we want to support na.rm, it should be an argument of the function, and it should also be passed on to max:

Test1 <- function(t, rm_na) {
  s <- table(as.vector(t))
  names(s)[s %in% max(s, na.rm = rm_na)]
}

and call the function as:

Test %>%
  group_by(Week = tools::toTitleCase(Week)) %>%
  summarize(Mode = Test1(time, TRUE), .groups = 'drop')
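
One caveat: table() drops NA values by default and its counts are never NA, so the na.rm argument passed to max() above does not change the result. A hedged sketch of a mode helper that removes NA explicitly before counting (mode_value is a hypothetical name, not part of any package):

```r
# Hypothetical helper: most frequent value of x, optionally ignoring NA.
mode_value <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  ux <- unique(x)
  # tabulate() counts how often each unique value occurs; ties go to the first.
  ux[which.max(tabulate(match(x, ux)))]
}

x <- c(5, 5, NA, NA, NA, 0)
mode_value(x, na.rm = TRUE)  # 5
mode_value(x)                # NA, because NA is the most frequent entry
```

It can be used inside summarize in the same way as Mode, for example summarize(Mode = mode_value(time, na.rm = TRUE)).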

How can I in R, group by ID and summarise by mean with na.rm = TRUE

Use a lambda (~):

library(dplyr)
ID_x %>%
  group_by(ID) %>% 
  summarise_each(~ mean(., na.rm=TRUE))

Output:

# A tibble: 3 × 2
     ID     x
  <dbl> <dbl>
1     1   2.5
2     2   2.5
3     3   1

Also, in recent versions of dplyr, summarise_each triggers a warning because it is deprecated; use across instead:

ID_x %>%
  group_by(ID) %>% 
  summarise(across(everything(), ~ mean(., na.rm=TRUE)))

Remove NAs in function list for dplyr's across

One option could be:

iris %>%
 group_by(Species) %>% 
 summarise(across(c(Sepal.Length:Petal.Width), 
                  list(mean = ~ mean(., na.rm = TRUE), sd = ~ sd(., na.rm = TRUE))))
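
Because iris has no missing values, the effect of na.rm = TRUE is not visible there. As an illustration only, the same pattern can be sketched on the built-in airquality data set, whose Ozone and Solar.R columns do contain NA; the .names argument is spelled out here although "{.col}_{.fn}" is already the default for a named list of functions.

```r
library(dplyr)

res <- airquality %>%
  group_by(Month) %>%
  summarise(across(c(Ozone, Solar.R),
                   list(mean = ~ mean(.x, na.rm = TRUE),
                        sd   = ~ sd(.x, na.rm = TRUE)),
                   .names = "{.col}_{.fn}"))

res  # one row per Month; columns Ozone_mean, Ozone_sd, Solar.R_mean, Solar.R_sd
```

Without na.rm = TRUE, every month that contains at least one NA in a column would yield NA for that column's mean and sd.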

Summarise_each for first non-NA value

You can use first(na.omit(.)) or na.omit(.)[1]. Also, summarise_each is deprecated; use summarise_all instead.
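
A minimal sketch of the na.omit(.)[1] variant, written with across(), which is the recommended replacement for the scoped verbs in dplyr 1.0.0 and later. The data frame d is hypothetical and exists only to show the behaviour, including a group whose values are all missing.

```r
library(dplyr)

# Hypothetical data: group 2 has no non-missing value in x.
d <- data.frame(g = c(1, 1, 1, 2, 2),
                x = c(NA, 4, 7, NA, NA),
                y = c(2, NA, 3, NA, 9))

res <- d %>%
  group_by(g) %>%
  summarise(across(everything(), ~ na.omit(.x)[1]))

res
# g = 1: x = 4,  y = 2
# g = 2: x = NA, y = 9
```

Indexing an empty vector with [1] gives NA, so a group without any non-missing value simply yields NA.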

Using dplyr summarise_each() with is.na()

Here's a possibility, tested on a small data set with some NA:

df <- data.frame(a = rep(1:2, each = 3),
                 b = c(1, 1, NA, 1, NA, NA),
                 c = c(1, 1, 1, NA, NA, NA))

df
#   a  b  c
# 1 1  1  1
# 2 1  1  1
# 3 1 NA  1
# 4 2  1 NA
# 5 2 NA NA
# 6 2 NA NA

df %>% 
  group_by(a) %>%
  summarise_each(funs(sum(is.na(.)) / length(.)))
#   a         b c
# 1 1 0.3333333 0
# 2 2 0.6666667 1
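
The same proportions can be sketched without the deprecated summarise_each()/funs() pair. This version assumes dplyr 1.0.0 or later and uses mean(is.na(.x)), which is equivalent to sum(is.na(.x)) / length(.x):

```r
library(dplyr)

df <- data.frame(a = rep(1:2, each = 3),
                 b = c(1, 1, NA, 1, NA, NA),
                 c = c(1, 1, 1, NA, NA, NA))

# Share of missing values per column within each group of a.
res <- df %>%
  group_by(a) %>%
  summarise(across(everything(), ~ mean(is.na(.x))))

res
# a = 1: b = 0.333, c = 0
# a = 2: b = 0.667, c = 1
```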

And because you asked for pointers to documentation: the . refers to each piece of the data and is used in some of the Examples in ?summarize_each. It is described in the Arguments section of ?funs as a "dummy parameter" and is used in the Examples there. The . is also briefly described in the Arguments section of ?do: "... You can use . to refer to the current group".
