dplyr count number of one specific value of variable
Try the following instead:
library(dplyr)
dat %>% group_by(id) %>%
summarise(cip.completed= sum(code == "a"))
Source: local data frame [3 x 2]
id cip.completed
(dbl) (int)
1 1 1
2 2 2
3 3 0
This works because the logical condition code == a
is just a series of zeros and ones, and the sum of this series is the number of occurences.
Note that you would not necessarily use dplyr::count
inside summarise
anyway, as it is a wrapper for summarise
calling either n()
or sum()
itself. See ?dplyr::count
. If you really want to use count
, I guess you could do that by first filtering the dataset to only retain all rows in which code==a
, and using count
would then give you all strictly positive (i.e. non-zero) counts. For instance,
dat %>% filter(code==a) %>% count(id)
Source: local data frame [2 x 2]
id n
(dbl) (int)
1 1 1
2 2 2
Count the number of times a value appears in a column using dplyr
Using the n()
function:
x %>%
group_by(Code) %>%
mutate(Code_frequency = n()) %>%
ungroup()
Count the number of times a value changes in a variable for specific values
If we are looking for the adjacent element that matches, then get the lag
and do a count
, and reshape to 'wide' with pivot_wider
library(dplyr)
library(tidyr)
library(stringr)
df %>%
mutate(P2 = lag(P1)) %>%
filter(P1 %in% c(4, 5)) %>%
count(P1 = str_c("Category_", P1), P2) %>%
pivot_wider(names_from = P1, values_from = n, values_fill = 0)
-output
# A tibble: 3 × 3
P2 Category_4 Category_5
<dbl> <int> <int>
1 1 3 3
2 2 3 2
3 3 0 1
How to count the 'number of values of a particular variable of a tibble' in an interval in R?
Below, I use cut
to bin the data and then table
to count instances of each bin.
data.frame(cut(q4$prop, breaks = c(0, 0.1, 0.2, 0.3)) %>% table)
produces
# . Freq
# 1 (0,0.1] 341
# 2 (0.1,0.2] 13
# 3 (0.2,0.3] 2
Summary code that counts the number of values less than specified numbers
We can use sapply
v1 <- c(5, 10, 20, 30, 60)
out <- sapply(v1, function(x) sum(summary$S2S_Mins < x, na.rm = TRUE))
names(out) <- paste0("Less", v1)
stack(out)[2:1]
-ouput
ind values
1 Less5 0
2 Less10 0
3 Less20 7
4 Less30 19
5 Less60 43
Count the values of a column based on the values of another column in R
Does this solve your problem?
Zip<-c("123245","12345","123245","123456","123456","12345")
Name<-c("Bob","Bob","Bob","Jack","Jack","Mary")
df<-data.frame(Zip,Name)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
df %>%
group_by(Zip, Name) %>%
summarise(Count = n())
#> `summarise()` has grouped output by 'Zip'. You can override using the `.groups` argument.
#> # A tibble: 4 × 3
#> # Groups: Zip [3]
#> Zip Name Count
#> <chr> <chr> <int>
#> 1 123245 Bob 2
#> 2 12345 Bob 1
#> 3 12345 Mary 1
#> 4 123456 Jack 2
Created on 2021-12-22 by the reprex package (v2.0.1)
--
Quick speed benchmark:
library(tidyverse)
library(microbenchmark)
Zip<-c("123245","12345","123245","123456","123456","12345")
Name<-c("Bob","Bob","Bob","Jack","Jack","Mary")
df<-data.frame(Zip,Name)
JM <- function(df){
df %>%
group_by(Zip, Name) %>%
summarise(Count = n())
}
JM(df)
#> `summarise()` has grouped output by 'Zip'. You can override using the `.groups` argument.
#> # A tibble: 4 × 3
#> # Groups: Zip [3]
#> Zip Name Count
#> <chr> <chr> <int>
#> 1 123245 Bob 2
#> 2 12345 Bob 1
#> 3 12345 Mary 1
#> 4 123456 Jack 2
TarJae <- function(df){
df %>%
count(Zip, Name, name= "Count")
}
TIC <- function(df){
aggregate(cbind(Count = Zip) ~ Zip + Name, df, length)
}
TIC(df)
#> Zip Name Count
#> 1 123245 Bob 2
#> 2 12345 Bob 1
#> 3 123456 Jack 2
#> 4 12345 Mary 1
res <- microbenchmark(JM(df), TIC(df), TarJae(df))
autoplot(res)
#> Coordinate system already present. Adding new coordinate system, which will replace the existing one.
Created on 2021-12-22 by the reprex package (v2.0.1)
How to count the occurrences of a specific value in a single column in R
Based on the example showed, there is no need for substring matching. We can directly use fixed match with ==
and get the count with sum
of logical vector
n_male <- sum(tbl1_df$sex == 'male')
n_male
#[1] 2
With grep
, it can also match substring. There is male
in fe
male.
If we want to use grep
, specify a word boundary
n_male <- sum(grepl('\\bmale\\b', tbl1_df$sex))
Or with length/grep
n_male <- length(grep('\\bmale\\b', tbl1_df$sex))
Also, in base R
, we can just use table
table(tbl1_df$sex)
to get the counts of both
Count certain values of column in group id
You could use
library(dplyr)
df %>%
group_by(id) %>%
mutate(number_A = sum(value == "A", na.rm = TRUE)) %>%
ungroup()
This returns
# A tibble: 23 x 3
id value number_A
<dbl> <chr> <int>
1 1 NA 2
2 1 NA 2
3 1 A 2
4 1 A 2
5 2 NA 3
6 2 NA 3
7 2 A 3
8 2 A 3
9 2 B 3
10 2 A 3
# ... with 13 more rows
Related Topics
How to Call the 'Function' Function
Change Paper Size and Orientation in an Rmarkdown PDF
Using Strsplit and Subset in Dplyr and Mutate
Calculate Mean by Group Using Dplyr Package
Logistic Regression with Robust Clustered Standard Errors in R
R - How to Add Row Index to a Data Frame, Based on Combination of Factors
Formatting Histogram X-Axis When Working with Dates Using R
Associate a Color Palette with Ggplot2 Theme
Check If R Package Is Installed Then Load Library
How to Pass Pandoc_Args to Yaml Header in Rmarkdown
How Does Branch Prediction Affect Performance in R
Rank Per Row Over Multiple Columns in R
Multiple Condition If-Else Using Dplyr, Custom Function, or Purrr
Ggplot2 Aes_String() Fails to Handle Names Starting with Numbers or Containing Spaces
Divide Each Data Frame Row by Vector in R
Provide Shades Between Dates on X Axis
How to Create a List in R from Two Vectors (One Would Be the Keys, the Other the Values)