Dplyr Count Number of One Specific Value of Variable

dplyr count number of one specific value of variable

Try the following instead:

library(dplyr)
dat %>% group_by(id) %>%
summarise(cip.completed= sum(code == "a"))

Source: local data frame [3 x 2]
id cip.completed
(dbl) (int)
1 1 1
2 2 2
3 3 0

This works because the logical condition code == a is just a series of zeros and ones, and the sum of this series is the number of occurences.

Note that you would not necessarily use dplyr::count inside summarise anyway, as it is a wrapper for summarise calling either n() or sum() itself. See ?dplyr::count. If you really want to use count, I guess you could do that by first filtering the dataset to only retain all rows in which code==a, and using count would then give you all strictly positive (i.e. non-zero) counts. For instance,

dat %>% filter(code==a) %>% count(id)

Source: local data frame [2 x 2]

id n
(dbl) (int)
1 1 1
2 2 2

Count the number of times a value appears in a column using dplyr

Using the n() function:

x %>%
group_by(Code) %>%
mutate(Code_frequency = n()) %>%
ungroup()

Count the number of times a value changes in a variable for specific values

If we are looking for the adjacent element that matches, then get the lag and do a count, and reshape to 'wide' with pivot_wider

library(dplyr)
library(tidyr)
library(stringr)
df %>%
mutate(P2 = lag(P1)) %>%
filter(P1 %in% c(4, 5)) %>%
count(P1 = str_c("Category_", P1), P2) %>%
pivot_wider(names_from = P1, values_from = n, values_fill = 0)

-output

# A tibble: 3 × 3
P2 Category_4 Category_5
<dbl> <int> <int>
1 1 3 3
2 2 3 2
3 3 0 1

How to count the 'number of values of a particular variable of a tibble' in an interval in R?

Below, I use cut to bin the data and then table to count instances of each bin.

data.frame(cut(q4$prop, breaks = c(0, 0.1, 0.2, 0.3)) %>% table)

produces

#           . Freq
# 1 (0,0.1] 341
# 2 (0.1,0.2] 13
# 3 (0.2,0.3] 2

Summary code that counts the number of values less than specified numbers

We can use sapply

v1 <- c(5, 10, 20, 30, 60)
out <- sapply(v1, function(x) sum(summary$S2S_Mins < x, na.rm = TRUE))
names(out) <- paste0("Less", v1)
stack(out)[2:1]

-ouput

     ind values
1 Less5 0
2 Less10 0
3 Less20 7
4 Less30 19
5 Less60 43

Count the values of a column based on the values of another column in R

Does this solve your problem?

Zip<-c("123245","12345","123245","123456","123456","12345")
Name<-c("Bob","Bob","Bob","Jack","Jack","Mary")
df<-data.frame(Zip,Name)

library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
df %>%
group_by(Zip, Name) %>%
summarise(Count = n())
#> `summarise()` has grouped output by 'Zip'. You can override using the `.groups` argument.
#> # A tibble: 4 × 3
#> # Groups: Zip [3]
#> Zip Name Count
#> <chr> <chr> <int>
#> 1 123245 Bob 2
#> 2 12345 Bob 1
#> 3 12345 Mary 1
#> 4 123456 Jack 2

Created on 2021-12-22 by the reprex package (v2.0.1)

--

Quick speed benchmark:

library(tidyverse)
library(microbenchmark)

Zip<-c("123245","12345","123245","123456","123456","12345")
Name<-c("Bob","Bob","Bob","Jack","Jack","Mary")
df<-data.frame(Zip,Name)

JM <- function(df){
df %>%
group_by(Zip, Name) %>%
summarise(Count = n())
}
JM(df)
#> `summarise()` has grouped output by 'Zip'. You can override using the `.groups` argument.
#> # A tibble: 4 × 3
#> # Groups: Zip [3]
#> Zip Name Count
#> <chr> <chr> <int>
#> 1 123245 Bob 2
#> 2 12345 Bob 1
#> 3 12345 Mary 1
#> 4 123456 Jack 2

TarJae <- function(df){
df %>%
count(Zip, Name, name= "Count")
}

TIC <- function(df){
aggregate(cbind(Count = Zip) ~ Zip + Name, df, length)
}
TIC(df)
#> Zip Name Count
#> 1 123245 Bob 2
#> 2 12345 Bob 1
#> 3 123456 Jack 2
#> 4 12345 Mary 1

res <- microbenchmark(JM(df), TIC(df), TarJae(df))
autoplot(res)
#> Coordinate system already present. Adding new coordinate system, which will replace the existing one.

Sample Image

Created on 2021-12-22 by the reprex package (v2.0.1)

How to count the occurrences of a specific value in a single column in R

Based on the example showed, there is no need for substring matching. We can directly use fixed match with == and get the count with sum of logical vector

n_male <- sum(tbl1_df$sex == 'male')
n_male
#[1] 2

With grep, it can also match substring. There is male in female.

If we want to use grep, specify a word boundary

n_male <- sum(grepl('\\bmale\\b', tbl1_df$sex))

Or with length/grep

n_male <- length(grep('\\bmale\\b', tbl1_df$sex))

Also, in base R, we can just use table

table(tbl1_df$sex)

to get the counts of both

Count certain values of column in group id

You could use

library(dplyr)

df %>%
group_by(id) %>%
mutate(number_A = sum(value == "A", na.rm = TRUE)) %>%
ungroup()

This returns

# A tibble: 23 x 3
id value number_A
<dbl> <chr> <int>
1 1 NA 2
2 1 NA 2
3 1 A 2
4 1 A 2
5 2 NA 3
6 2 NA 3
7 2 A 3
8 2 A 3
9 2 B 3
10 2 A 3
# ... with 13 more rows


Related Topics



Leave a reply



Submit