Conditionally Count in dplyr

Try

library(dplyr)

memberorders %>%
  group_by(MemID) %>%
  summarise(
    sum2   = sum(value[week <= 2]),
    sum4   = sum(value[week <= 4]),
    count2 = sum(week <= 2),
    count4 = sum(week <= 4)
  )
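
To see why this works: a comparison such as week <= 2 returns a logical vector, sum() treats each TRUE as 1, and indexing value with the same comparison keeps only the matching rows. Below is a minimal sketch on a made-up memberorders data frame. The column names follow the snippet above, but the values are invented for illustration.

```r
library(dplyr)

# Hypothetical data: values are assumptions, not taken from the question
memberorders <- data.frame(
  MemID = c("a", "a", "a", "b", "b"),
  week  = c(1, 2, 5, 3, 4),
  value = c(10, 20, 30, 40, 50)
)

result <- memberorders %>%
  group_by(MemID) %>%
  summarise(
    sum2   = sum(value[week <= 2]),  # total value in weeks 1-2
    sum4   = sum(value[week <= 4]),  # total value in weeks 1-4
    count2 = sum(week <= 2),         # TRUE counts as 1, so this counts rows
    count4 = sum(week <= 4)
  )

result
```

Member "b" has no rows with week <= 2, so its sum2 and count2 both come out as 0 rather than the group being dropped.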

R: How can I do a conditional count in dplyr?

Could do:

df %>%
  group_by(state_name, launch_year) %>%
  summarise(
    launches = n(),
    failed_launches = sum(category == "Failure")
  )
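
One thing to watch for: if category contains NA, then category == "Failure" is NA for those rows and sum() returns NA for the whole group unless na.rm = TRUE is set. Here is a small sketch with invented launch data (the values are assumptions) showing the NA-safe variant:

```r
library(dplyr)

# Hypothetical data: one launch has a missing category
df <- data.frame(
  state_name  = c("US", "US", "US", "RU", "RU"),
  launch_year = c(1960, 1960, 1960, 1961, 1961),
  category    = c("Failure", "Success", NA, "Failure", "Failure")
)

launch_counts <- df %>%
  group_by(state_name, launch_year) %>%
  summarise(
    launches        = n(),                                       # all rows in the group
    failed_launches = sum(category == "Failure", na.rm = TRUE),  # ignore NA comparisons
    .groups = "drop"
  )

launch_counts
```

Note that n() still counts the row with the missing category, so launches can be larger than the sum of the known outcomes.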

How can I count a number of conditional rows within r dplyr mutate?

Here is a dplyr only solution:

The trick is to subtract the running count of X (cumsum(Product == "X")) from the total count of X (sum(Product == "X")) within each Customer group. The difference is the number of X purchases that come after the current row:

library(dplyr)

df %>%
  arrange(Customer, Date) %>%
  group_by(Customer) %>%
  mutate(nSubsqX1 = sum(Product == "X") - cumsum(Product == "X"))

   Date       Customer Product nSubsqX1
   <date>     <chr>    <chr>      <int>
 1 2020-05-18 A        X              0
 2 2020-02-10 B        X              5
 3 2020-02-12 B        Y              5
 4 2020-03-04 B        Z              5
 5 2020-03-29 B        X              4
 6 2020-04-08 B        X              3
 7 2020-04-30 B        X              2
 8 2020-05-13 B        X              1
 9 2020-05-23 B        Y              1
10 2020-07-02 B        Y              1
11 2020-08-26 B        Y              1
12 2020-12-06 B        X              0
13 2020-01-31 C        X              3
14 2020-09-19 C        X              2
15 2020-10-13 C        X              1
16 2020-11-11 C        X              0
17 2020-12-26 C        Y              0
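
It can help to look at the two intermediate pieces side by side. The sketch below uses a tiny made-up data frame for a single customer. The helper columns totalX and seenX are introduced here only for illustration and are not part of the original answer.

```r
library(dplyr)

# Hypothetical data for one customer
df <- data.frame(
  Customer = c("B", "B", "B", "B"),
  Date     = as.Date(c("2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01")),
  Product  = c("X", "Y", "X", "X")
)

out <- df %>%
  arrange(Customer, Date) %>%
  group_by(Customer) %>%
  mutate(
    totalX   = sum(Product == "X"),     # 3 on every row of the group
    seenX    = cumsum(Product == "X"),  # 1, 1, 2, 3: X's up to and including this row
    nSubsqX1 = totalX - seenX           # 2, 2, 1, 0: X's still to come
  ) %>%
  ungroup()

out
```

Because cumsum() includes the current row, a row that is itself an X is not counted among its own subsequent purchases.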

How to conditionally count the number of occurrences using dplyr?

I think the easiest way might be to change the renderTable() function to the following:

output$stratData <- renderTable({
  req(input$stratValues)
  req(input$stratPeriod)

  # Build the Period filter from the selected input
  filter_exp1 <- parse(text = paste0("Period", "==", "'", input$stratPeriod, "'"))

  dat_1 <- reactive({dat() %>% filter(eval(filter_exp1))})
  min <- custom_min(dat_1()[[input$stratValues]])
  max <- custom_max(dat_1()[[input$stratValues]])
  breaks <- if (any(is.infinite(c(min, max)))) c(0, 10) else seq(min, max, length.out = 6)

  # Bin the selected column into ranges
  tmp <- dat() %>%
    filter(eval(filter_exp1)) %>%
    mutate(Range = cut(!!sym(input$stratValues), breaks = breaks, include.lowest = TRUE, right = TRUE, dig.lab = 5)) %>%
    group_by(Range)

  # For Values_2, keep only the rows flagged "N" before summarising
  if (input$stratValues == "Values_2") {
    tmp <- tmp %>%
      filter(Flag == "N")
  }

  tmp <- tmp %>%
    summarise(Count = n(), Values = sum(!!sym(input$stratValues))) %>%
    complete(Range, fill = list(Count = 0, Values = 0)) %>%
    ungroup %>%
    mutate(Count_pct = Count / sum(Count) * 100, Values_pct = Values / sum(Values) * 100) %>%
    dplyr::select(everything(), Count, Count_pct, Values, Values_pct) %>%
    bind_rows(summarise_all(., ~(if (is.numeric(.)) sum(.) else "Total")))
  tmp
})

In the code above, there is an if() condition that checks whether stratValues is Values_2. If so, it filters the data to only include the "N" observations on Flag, and then continues with the rest of the analysis. This will work if both Values and Count should be calculated only on the observations where Flag == "N".

count frequency by year with dplyr (conditional count)

Here is another tidyverse method. In short, we pivot the data frame from long to wide (one column per Year) and then summarize twice. The first summarization collapses each ID to a per-year count of Tool "A", discarding all the non-"A" tools. The second summarization condenses the result into one row per toolA value (the number of "A"s an ID has across all years) and counts the IDs in each bin.

library(dplyr)
library(tidyr)

df %>%
  mutate(value = +(Tool == "A")) %>%
  pivot_wider(names_from = Year, values_fill = 0L) %>%
  group_by(ID) %>%
  summarize(across(-Tool, sum)) %>%
  group_by(toolA = rowSums(across(-ID))) %>%
  summarize(count = n(), across(-c(ID, count), sum))

Output

# A tibble: 4 x 5
  toolA count `2000` `2001` `2002`
  <dbl> <int>  <int>  <int>  <int>
1     0     1      0      0      0
2     1     2      1      0      1
3     2     1      0      1      1
4     3     1      1      1      1

Conditional count with zero with dplyr

You may first summarise the data to count the number of Tool "A" rows in each ID, and then count how many IDs share each count.

library(dplyr)

df %>%
  group_by(ID) %>%
  summarise(ToolA = sum(Tool == "A")) %>%
  count(ToolA, name = "count")

#   ToolA count
#   <int> <int>
# 1     0     1
# 2     1     3
# 3     2     1
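
The reason the zero row survives is that sum(Tool == "A") is computed for every ID, including IDs that never used Tool "A". Here is a sketch on invented data (the IDs and tools below are assumptions, not the data from the question):

```r
library(dplyr)

# Hypothetical data: ID 2 never uses Tool "A"
df <- data.frame(
  ID   = c(1, 1, 2, 3, 3, 3, 4),
  Tool = c("A", "B", "B", "A", "A", "C", "A")
)

counts <- df %>%
  group_by(ID) %>%
  summarise(ToolA = sum(Tool == "A")) %>%  # per ID: 1, 0, 2, 1
  count(ToolA, name = "count")             # how many IDs have each ToolA value

counts
```

Filtering to Tool == "A" before grouping would remove ID 2 entirely, and the ToolA = 0 row would be missing from the result.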

Group by and conditionally count

If you want to add it as a column you can do:

DDcomplete %>% group_by(ST) %>% mutate(count = sum(dist.km == 0))

Or if you just want the counts per state:

DDcomplete %>% group_by(ST) %>% summarise(count = sum(dist.km == 0))

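Actually, you were very close to the solution. Your code

state= DDcomplete %>%
  group_by(ST) %>%
  summarize(zero = sum(DDcomplete$dist.km==0, na.rm = TRUE))

is almost correct. You need to remove the DDcomplete$ from within the call to sum. With the prefix, sum() is evaluated on the entire column of the original data frame, so every state gets the same overall total instead of its own count. Within dplyr chains, you can access variables directly, and they are then evaluated per group.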

Also note that by using summarise, you will condense your data frame to 1 row per group with only the grouping column(s) and whatever you computed inside the summarise. If you just want to add a column with the counts, you can use mutate as I did in my answer.
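
To make both points concrete, here is a sketch on a small made-up DDcomplete (the states and distances are invented). It compares referring to the column through DDcomplete$ with referring to it directly, and shows how mutate keeps every row while summarise returns one row per group.

```r
library(dplyr)

# Hypothetical data
DDcomplete <- data.frame(
  ST      = c("CA", "CA", "NY", "NY", "TX"),
  dist.km = c(0, 0, 0, 3.2, 7.5)
)

# DDcomplete$dist.km ignores the grouping: every state gets the overall total
wrong <- DDcomplete %>%
  group_by(ST) %>%
  summarise(zero = sum(DDcomplete$dist.km == 0, na.rm = TRUE))

# Bare column name: evaluated within each group
right <- DDcomplete %>%
  group_by(ST) %>%
  summarise(zero = sum(dist.km == 0, na.rm = TRUE))

# mutate keeps all rows and repeats the group count on each of them
added <- DDcomplete %>%
  group_by(ST) %>%
  mutate(count = sum(dist.km == 0)) %>%
  ungroup()
```

In this sketch wrong$zero is 3 for all three states, right$zero is 2, 1, 0, and added still has the original five rows.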


If you're only interested in positive counts, you could also use dplyr's count function together with filter to first subset the data:

filter(DDcomplete, dist.km == 0) %>% count(ST)
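
The difference between the two approaches shows up for states that have no zero-distance rows at all. A sketch, again on invented data (the values are assumptions):

```r
library(dplyr)

# Hypothetical data: TX has no rows with dist.km == 0
DDcomplete <- data.frame(
  ST      = c("CA", "CA", "NY", "NY", "TX"),
  dist.km = c(0, 0, 0, 3.2, 7.5)
)

# Keeps every state, including those with a count of 0
with_zeros <- DDcomplete %>%
  group_by(ST) %>%
  summarise(count = sum(dist.km == 0))

# Drops states without any matching row; the count column is named n
positive_only <- DDcomplete %>%
  filter(dist.km == 0) %>%
  count(ST)
```

Here with_zeros has a TX row with count 0, while positive_only contains only CA and NY.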

In R, conditionally count IDs who meet an ANY condition on a certain attribute

You may try

df %>%
  group_by(ID) %>%
  summarize(
    treatment = as.numeric(sum(treatment) > 0),
    event     = as.numeric(sum(event) > 0)
  ) %>%
  select(-ID) %>%
  count(treatment, event)

  treatment event     n
      <dbl> <dbl> <int>
1         0     0     1
2         0     1     1
3         1     1     2
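
If treatment and event are 0/1 indicators (an assumption, since the input data is not shown here), the same "did it happen at least once" test can also be written with any(), which some find easier to read. A sketch on made-up data:

```r
library(dplyr)

# Hypothetical long-format data with 0/1 indicator columns
df <- data.frame(
  ID        = c(1, 1, 2, 2, 3, 3, 4),
  treatment = c(0, 0, 0, 0, 1, 0, 1),
  event     = c(0, 0, 0, 1, 0, 1, 1)
)

id_counts <- df %>%
  group_by(ID) %>%
  summarise(
    treatment = as.integer(any(treatment == 1)),  # 1 if the ID was ever treated
    event     = as.integer(any(event == 1))       # 1 if the ID ever had an event
  ) %>%
  count(treatment, event)

id_counts
```

count() only uses the columns it is given, so dropping ID beforehand is optional in this variant.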

