Conditionally Count in dplyr
Try
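library(dplyr)

memberorders %>%
  group_by(MemID) %>%
  summarise(
    sum2   = sum(value[week <= 2]),  # total value in weeks 1-2
    sum4   = sum(value[week <= 4]),  # total value in weeks 1-4
    count2 = sum(week <= 2),         # number of rows in weeks 1-2
    count4 = sum(week <= 4)          # number of rows in weeks 1-4
  )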
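To see what the two idioms do, here is a minimal sketch on made-up data. The `orders` tibble and its values are hypothetical and only stand in for `memberorders`. Subsetting with `value[week <= 2]` gives a conditional sum, while summing the logical vector `week <= 2` gives a conditional count, because `TRUE` is treated as 1.

```r
library(dplyr)

# Hypothetical stand-in for memberorders
orders <- tibble(
  MemID = c(1, 1, 1, 2, 2),
  week  = c(1, 3, 5, 2, 4),
  value = c(10, 20, 30, 5, 15)
)

res_orders <- orders %>%
  group_by(MemID) %>%
  summarise(
    sum2   = sum(value[week <= 2]),  # conditional sum
    sum4   = sum(value[week <= 4]),
    count2 = sum(week <= 2),         # conditional count
    count4 = sum(week <= 4)
  )

res_orders
```

With this toy data, member 1 should end up with `sum4` of 30 from two rows, and member 2 with `sum4` of 20 from two rows.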
R: How can I do a conditional count in dplyr?
Could do:
df %>%
  group_by(state_name, launch_year) %>%
  summarise(
    launches        = n(),
    failed_launches = sum(category == "Failure")
  )
How can I count a number of conditional rows within r dplyr mutate?
Here is a dplyr-only solution:
The trick is to subtract the running count of X so far (cumsum(Product == "X")) from the total count of X (sum(Product == "X")) within each Customer group:
library(dplyr)

df %>%
  arrange(Customer, Date) %>%
  group_by(Customer) %>%
  mutate(nSubsqX1 = sum(Product == "X") - cumsum(Product == "X"))
   Date       Customer Product nSubsqX1
   <date>     <chr>    <chr>      <int>
 1 2020-05-18 A        X              0
 2 2020-02-10 B        X              5
 3 2020-02-12 B        Y              5
 4 2020-03-04 B        Z              5
 5 2020-03-29 B        X              4
 6 2020-04-08 B        X              3
 7 2020-04-30 B        X              2
 8 2020-05-13 B        X              1
 9 2020-05-23 B        Y              1
10 2020-07-02 B        Y              1
11 2020-08-26 B        Y              1
12 2020-12-06 B        X              0
13 2020-01-31 C        X              3
14 2020-09-19 C        X              2
15 2020-10-13 C        X              1
16 2020-11-11 C        X              0
17 2020-12-26 C        Y              0
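The subtraction can be easier to follow with the two pieces split into their own columns. The sketch below uses a small made-up tibble (`toy`, with hypothetical customers and dates, not the data from the question), and the helper column names `total_x` and `seen_x` are introduced only for illustration.

```r
library(dplyr)

# Hypothetical toy data, not the data from the question
toy <- tibble(
  Customer = c("A", "A", "A", "B", "B"),
  Date     = as.Date(c("2020-01-01", "2020-02-01", "2020-03-01",
                       "2020-01-15", "2020-02-15")),
  Product  = c("X", "Y", "X", "Y", "X")
)

res_subsq <- toy %>%
  arrange(Customer, Date) %>%
  group_by(Customer) %>%
  mutate(
    total_x  = sum(Product == "X"),     # all X purchases in the group
    seen_x   = cumsum(Product == "X"),  # X purchases up to and including this row
    nSubsqX1 = total_x - seen_x         # X purchases still to come
  ) %>%
  ungroup()

res_subsq
```

For customer A (X, Y, X) the remaining counts should be 1, 1, 0, and for customer B (Y, X) they should be 1, 0.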
How to conditionally count the number of occurrences using dplyr?
I think the easiest way might be to change the renderTable() function to the following:
output$stratData <- renderTable({
  req(input$stratValues)
  req(input$stratPeriod)
  filter_exp1 <- parse(text = paste0("Period", "==", "'", input$stratPeriod, "'"))
  dat_1 <- reactive({dat() %>% filter(eval(filter_exp1))})
  min <- custom_min(dat_1()[[input$stratValues]])
  max <- custom_max(dat_1()[[input$stratValues]])
  breaks <- if (any(is.infinite(c(min, max)))) c(0, 10) else seq(min, max, length.out = 6)
  tmp <- dat() %>%
    filter(eval(filter_exp1)) %>%
    mutate(Range = cut(!!sym(input$stratValues), breaks = breaks,
                       include.lowest = TRUE, right = TRUE, dig.lab = 5)) %>%
    group_by(Range)
  # Only for Values_2: keep the rows flagged "N" before summarising
  if (input$stratValues == "Values_2") {
    tmp <- tmp %>%
      filter(Flag == "N")
  }
  tmp <- tmp %>%
    summarise(Count = n(), Values = sum(!!sym(input$stratValues))) %>%
    complete(Range, fill = list(Count = 0, Values = 0)) %>%
    ungroup %>%
    mutate(Count_pct = Count / sum(Count) * 100,
           Values_pct = Values / sum(Values) * 100) %>%
    dplyr::select(everything(), Count, Count_pct, Values, Values_pct) %>%
    bind_rows(summarise_all(., ~(if (is.numeric(.)) sum(.) else "Total")))
  tmp
})
In the code above, there is an if() condition that checks whether stratValues is Values_2. If so, it filters the data to keep only the observations where Flag == "N". Then it continues with the rest of the analysis. This will work if both Values and Count should be calculated only on the observations where Flag == "N".
count frequency by year with dplyr (conditional count)
Here is another tidyverse method. Simply speaking, we pivot the data frame from long to wide and then summarize twice. The first summarization collapses each ID to one row per ID, so that only the "A" indicators are added up for each year. The second summarization condenses the result into unique bins identified by toolA (the number of "A"s per ID) and produces a count for each bin.
library(dplyr)
library(tidyr)

df %>%
  mutate(value = +(Tool == "A")) %>%                      # 1 if Tool is "A", else 0
  pivot_wider(names_from = Year, values_fill = 0L) %>%    # one column per year
  group_by(ID) %>%
  summarize(across(-Tool, sum)) %>%                       # "A" count per ID and year
  group_by(toolA = rowSums(across(-ID))) %>%              # total "A"s per ID
  summarize(count = n(), across(-c(ID, count), sum))
Output
# A tibble: 4 x 5
  toolA count `2000` `2001` `2002`
  <dbl> <int>  <int>  <int>  <int>
1     0     1      0      0      0
2     1     2      1      0      1
3     2     1      0      1      1
4     3     1      1      1      1
Conditional count with zero with dplyr
You may first summarise the data to count the number of Tool "A" rows in each ID, and then count those counts.
library(dplyr)

df %>%
  group_by(ID) %>%
  summarise(ToolA = sum(Tool == "A")) %>%
  count(ToolA, name = "count")

#   ToolA count
#   <int> <int>
# 1     0     1
# 2     1     3
# 3     2     1
Group by and conditionally count
If you want to add it as a column you can do:
DDcomplete %>% group_by(ST) %>% mutate(count = sum(dist.km == 0))
Or if you just want the counts per state:
DDcomplete %>% group_by(ST) %>% summarise(count = sum(dist.km == 0))
Actually, you were very close to the solution. Your code
state = DDcomplete %>%
  group_by(ST) %>%
  summarize(zero = sum(DDcomplete$dist.km == 0, na.rm = TRUE))
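is almost correct. Remove the DDcomplete$ from within the call to sum: within dplyr chains you can access variables directly, and DDcomplete$dist.km always refers to the whole column rather than the rows of the current group, so every state would get the same overall total.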
Also note that by using summarise, you will condense your data frame to 1 row per group with only the grouping column(s) and whatever you computed inside the summarise. If you just want to add a column with the counts, you can use mutate as I did in my answer.
If you're only interested in positive counts, you could also use dplyr's count function together with filter to first subset the data (states with no matching rows will not appear in the result):
filter(DDcomplete, dist.km == 0) %>% count(ST)
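The three variants discussed above can be compared side by side. This is only a sketch on a made-up tibble `dd`, a hypothetical stand-in for `DDcomplete` with the same column names; the values are invented.

```r
library(dplyr)

# Hypothetical stand-in for DDcomplete
dd <- tibble(
  ST      = c("NY", "NY", "CA", "CA", "TX"),
  dist.km = c(0, 3, 0, 0, 7)
)

# Bare column name: evaluated within each group
per_group <- dd %>%
  group_by(ST) %>%
  summarise(zero = sum(dist.km == 0, na.rm = TRUE))

# dd$dist.km: the full column, so every group gets the overall total
whole_column <- dd %>%
  group_by(ST) %>%
  summarise(zero = sum(dd$dist.km == 0, na.rm = TRUE))

# filter() then count(): groups with no matching rows disappear
positive_only <- dd %>%
  filter(dist.km == 0) %>%
  count(ST)
```

Here `per_group` should keep TX with a count of 0, `whole_column` should report 3 for every state, and `positive_only` should contain only CA and NY.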
In R, conditionally count IDs who meet an ANY condition on a certain attribute
You may try
df %>%
  group_by(ID) %>%
  summarize(treatment = as.numeric(sum(treatment) > 0),  # 1 if any row has treatment
            event     = as.numeric(sum(event) > 0)) %>%  # 1 if any row has an event
  select(-ID) %>%
  count(treatment, event)
  treatment event     n
      <dbl> <dbl> <int>
1         0     0     1
2         0     1     1
3         1     1     2
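An equivalent way to express the "any row in this ID" condition is `any()`. The sketch below assumes that `treatment` and `event` are coded as 0/1 indicators, which the answer above implies but does not state, and it uses a made-up tibble `toy_ids` rather than the data from the question.

```r
library(dplyr)

# Hypothetical toy data; treatment and event are assumed to be 0/1 indicators
toy_ids <- tibble(
  ID        = c(1, 1, 2, 2, 3),
  treatment = c(0, 1, 0, 0, 1),
  event     = c(0, 1, 0, 1, 1)
)

res_any <- toy_ids %>%
  group_by(ID) %>%
  summarise(
    treatment = as.integer(any(treatment == 1)),  # 1 if any row is treated
    event     = as.integer(any(event == 1))       # 1 if any row has an event
  ) %>%
  count(treatment, event)

res_any
```

In this toy data, ID 2 is the only one with an event but no treatment, and IDs 1 and 3 have both, so the result should have two rows with counts 1 and 2.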