Dplyr Conditional Summarise Function

Summarize all group values and a conditional subset in the same call

Writing up @hadley's comment as an answer

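# df_sqlite is a remote table, so dbplyr translates if/else into SQL CASE WHEN.
# On a local data frame use ifelse(A == "foo", B, 0) instead, because if()
# needs a single TRUE/FALSE.
df_sqlite %>%
  group_by(ID) %>%
  mutate(Bfoo = if(A=="foo") B else 0) %>%
  summarize(sumB = sum(B),
            sumBfoo = sum(Bfoo)) %>%
  collect()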
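For a local data frame, the same pair of totals can be obtained by subsetting inside sum(). This is only a sketch: df_local and its values are invented for illustration and are not the data from the question.

```r
library(dplyr)

# Hypothetical local data, invented for illustration
df_local <- tibble(
  ID = c(1, 1, 2, 2),
  A  = c("foo", "bar", "foo", "foo"),
  B  = c(10, 20, 30, 40)
)

res <- df_local %>%
  group_by(ID) %>%
  summarise(
    sumB    = sum(B),              # total over every row in the group
    sumBfoo = sum(B[A == "foo"])   # total over the rows where A is "foo"
  )
```

A group with no "foo" rows gets sumBfoo = 0, since sum() of an empty vector is 0.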

Using dplyr summarise with conditions

Keep all(Status) as the second argument in summarise (or give it a different column name), so that Test is computed while Status still refers to the original column. Because all(Status) returns a single TRUE/FALSE per group, a plain if/else is enough here.

df %>%
   group_by(ID) %>% 
   summarise( Test = if(all(Status)) first(Price[Status]) else 
                   first(Price[!Status]), Status = all(Status))
# A tibble: 3 x 3
#     ID  Test Status
#   <dbl> <dbl> <lgl> 
#1     1     5 FALSE 
#2     2     0 TRUE  
#3     3     7 FALSE

NOTE: Avoid ifelse when its arguments have unequal lengths. The result takes the length of the test, so longer yes/no values are silently truncated.
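A small base R illustration of that difference, using made-up vectors named price and status (not the data from the question):

```r
# Hypothetical values, invented for illustration
price  <- c(5, 0, 7)
status <- c(FALSE, FALSE, FALSE)

# ifelse() sizes its result to the test, which here is a single value,
# so only the first element of the chosen argument survives
short <- ifelse(all(status), price[status], price[!status])

# if/else just evaluates one branch and returns it unchanged
full <- if (all(status)) price[status] else price[!status]
```

Here short has length 1 while full keeps all three prices, which is why if/else is the safer choice with a length-one condition.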

dplyr summarise based on order condition with if statement

Here's a dplyr solution:

df %>% 
  group_by(id) %>%
  mutate(ymean = mean(y), zmean = mean(z), 
         pref = 3 * types %in% preference_3rd + 
                2 * types %in% preference_2nd +
                1 * types %in% preference_1st ) %>%
  filter(pref == min(pref)) %>%
  summarise(sumtest = sum(x), ymean = first(ymean), zmean = first(zmean))
#> # A tibble: 5 x 4
#>      id sumtest ymean zmean
#>   <dbl>   <dbl> <dbl> <dbl>
#> 1     1      60   3.5   3.5
#> 2     2      25   8     8  
#> 3     3      40  11.5  11.5
#> 4     4      10  14    14  
#> 5     5      10  15    15

Write function to perform conditional summarize in R using named list

Two operations are involved, and one of them (the match on the ID columns) can be computed dynamically from the names in filters

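library(dplyr)
df %>%
    # total2: sum over rows matching ID1 only, computed before any rows are dropped
    mutate(total2 = sum(to_summarize[ID1 == filters[['ID1']]])) %>% 
    # keep rows where every ID column equals its entry in filters
    # (dplyr >= 1.0.4 offers if_all() for this; across() inside filter() is deprecated)
    filter(across(starts_with("ID"), ~ . == 
                filters[[cur_column()]])) %>%
    summarise(total1 = sum(to_summarize), total2 = first(total2))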

-output

# A tibble: 1 x 2
  total1 total2
   <dbl>  <dbl>
1     10     12

To do this without filter, reduce the across output to a single logical vector and subset with it

library(purrr)
df %>% 
  summarise(total1 = sum(to_summarize[across(starts_with('ID'), 
   ~ . == filters[[cur_column()]]) %>% 
            reduce(`&`)]), 
     total2 = sum(to_summarize[ID1 == filters[['ID1']]]))

-output

# A tibble: 1 x 2
  total1 total2
   <dbl>  <dbl>
1     10     12
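Since the question asks for a function, the same logic could be wrapped roughly as follows. This is a sketch under assumptions: the function name conditional_totals, the value_col argument, and the sample df and filters are all hypothetical, and the row mask is built with base R's Reduce() rather than across().

```r
library(dplyr)

# Hypothetical data and filter list, invented for illustration
df <- tibble(
  ID1 = c("A", "A", "B"),
  ID2 = c("x", "y", "x"),
  to_summarize = c(10, 2, 5)
)
filters <- list(ID1 = "A", ID2 = "x")

# Hypothetical helper: total1 uses every filter, total2 only the first one
conditional_totals <- function(data, filters, value_col = "to_summarize") {
  # One logical vector per filter column, combined with AND
  keep <- Reduce(`&`, lapply(names(filters), function(nm) {
    data[[nm]] == filters[[nm]]
  }))
  first_col <- names(filters)[1]
  tibble(
    total1 = sum(data[[value_col]][keep]),
    total2 = sum(data[[value_col]][data[[first_col]] == filters[[first_col]]])
  )
}

res <- conditional_totals(df, filters)
```

Because the mask is built from names(filters), adding another entry to the list narrows total1 without any change to the function.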

group by and conditional summarize in R

We can group twice: first sum val per vote, then relabel the votes whose total is below 2 as 'unpop' and sum again

library(dplyr)
df %>% 
    group_by(vote) %>% 
    summarise(val=sum(val)) %>%
    group_by(vote = replace(vote, val < 2, 'unpop')) %>% 
    summarise(val = sum(val))

-output

# A tibble: 3 x 2
# vote    val
#  <chr> <dbl>
#1 A         3
#2 B         6
#3 unpop     2
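The original data is not shown, so here is a reproducible sketch of the double grouping on a made-up data frame (df_votes and its values are assumptions chosen for illustration):

```r
library(dplyr)

# Hypothetical votes, invented for illustration
df_votes <- tibble(
  vote = c("A", "A", "B", "B", "C", "D"),
  val  = c(1, 2, 3, 3, 1, 1)
)

res <- df_votes %>%
  group_by(vote) %>%
  summarise(val = sum(val)) %>%                         # total per vote
  group_by(vote = replace(vote, val < 2, "unpop")) %>%  # relabel small totals
  summarise(val = sum(val))                             # total per relabeled vote
```

With these values, C and D each total 1, so they are merged into a single unpop row.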

Or another option with rowsum

df %>% 
   group_by(vote = replace(vote, vote %in% 
     names(which((rowsum(val, vote) < 2)[,1])), 'unpop')) %>% 
   summarise(val = sum(val))

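Or using fct_lump_n from forcats, which keeps the 2 most frequent vote levels and lumps the rest (it ranks by row count unless weights are passed via w)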

library(forcats)
df %>% 
  group_by(vote = fct_lump_n(vote, 2, other_level = "unpop")) %>%
  summarise(val = sum(val))
# A tibble: 3 x 2
#  vote    val
#  <fct> <dbl>
#1 A         3
#2 B         6
#3 unpop     2

Or using table, which counts rows per vote rather than summing val

df %>%
   group_by(vote = replace(vote, 
      vote %in% names(which(table(vote) < 2)), 'unpop'))  %>%
   summarise(val = sum(val))