Count number of values in row using dplyr
Try rowSums:
> set.seed(1)
> ID <- LETTERS[1:5]
> X1 <- sample(1:5, 5, replace = TRUE)
> X2 <- sample(1:5, 5, replace = TRUE)
> X3 <- sample(1:5, 5, replace = TRUE)
> df <- data.frame(ID,X1,X2,X3)
> df
ID X1 X2 X3
1 A 2 5 2
2 B 2 5 1
3 C 3 4 4
4 D 5 4 2
5 E 2 1 4
> rowSums(df == 2)
[1] 2 1 0 1 1
Alternatively, with dplyr:
> df %>% mutate(numtwos = rowSums(. == 2))
ID X1 X2 X3 numtwos
1 A 2 5 2 2
2 B 2 5 1 1
3 C 3 4 4 0
4 D 5 4 2 1
5 E 2 1 4 1
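Both versions above compare every column with 2, including ID. If only some columns should be checked, one option is to select them with across() first. This is a minimal sketch on made-up data (the values are hypothetical, not the sampled ones above):

```r
library(dplyr)

# Hypothetical data mirroring the layout above: an ID column plus numeric columns
df <- data.frame(
  ID = c("A", "B", "C"),
  X1 = c(2, 2, 3),
  X2 = c(5, 2, 4),
  X3 = c(2, 1, 4)
)

# Compare only X1..X3 with 2, then add up the TRUEs in each row
res <- df %>%
  mutate(numtwos = rowSums(across(X1:X3, ~ .x == 2)))
```

Here ID is left out of the comparison entirely, so a character value in that column can never contribute to the count.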
Count the number of times a value appears in a column using dplyr
Using the n() function:
x %>%
group_by(Code) %>%
mutate(Code_frequency = n()) %>%
ungroup()
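As a possible shortcut, dplyr's add_count() should give the same column as the group_by/mutate/ungroup chain, and count() gives a one-row-per-value summary instead. A small sketch, assuming x has a Code column as above (the values are made up):

```r
library(dplyr)

# Hypothetical data: `x` and `Code` follow the snippet above, the values are made up
x <- data.frame(Code = c("a", "b", "a", "c", "a", "b"))

# Same column as group_by() + mutate(n()) + ungroup(), in one call
with_freq <- x %>% add_count(Code, name = "Code_frequency")

# One row per distinct Code instead of one row per observation
freq_table <- x %>% count(Code, name = "Code_frequency")
```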
Counting the number of observations row-wise using dplyr
Using base R. The first line counts non-missing values across all columns, the second restricts the count to columns selected by name, and the third adds up one check per column, which does not scale well when the number of columns is substantial.
sample$z1 <- rowSums(!is.na(sample))
sample$z2 <- rowSums(!is.na(sample[c("x", "y")]))
sample$z3 <- is.finite(sample$x) + is.finite(sample$y)
> sample
# A tibble: 4 x 5
x y z1 z2 z3
<dbl> <dbl> <dbl> <dbl> <int>
1 1 5 2 2 2
2 2 NA 1 1 1
3 3 2 2 2 2
4 NA NA 0 0 0
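One thing to keep in mind: !is.na() and is.finite() agree on the data above, but they are not the same test. A quick check on a made-up vector shows where they differ:

```r
# Hypothetical vector with the four kinds of value worth distinguishing
v <- c(1, NA, Inf, NaN)

not_missing <- !is.na(v)     # TRUE FALSE  TRUE FALSE
finite      <- is.finite(v)  # TRUE FALSE FALSE FALSE

n_not_missing <- sum(not_missing)  # Inf counts as an observation here
n_finite      <- sum(finite)       # Inf does not count here
```

So z3 would come out lower than z1 and z2 for any row containing Inf or -Inf.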
Count occurrence of string values per row in dataframe in R (dplyr)
You can use across with rowSums -
library(dplyr)
df %>% mutate(d9 = rowSums(across(all_of(cols), `%in%`, bcde)))
# d1 d2 d3 d4 d5 d6 d7 d8 d9
# <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
#1 b a a a a a a a 0
#2 a a a a c a a a 1
#3 a b a a a a a a 1
#4 a a c a a b a a 2
#5 a a a a a a a a 0
#6 a a b a a a a a 1
#7 a a a a a d a a 1
#8 a a a d a a a a 1
This can also be written in base R -
df$d9 <- rowSums(sapply(df[cols], `%in%`, bcde))
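Newer dplyr releases (1.1.0 onward, as far as I recall) warn when extra arguments are passed through across() the way bcde is above. A lambda avoids that. The sketch below uses small stand-in objects, since the original df, cols and bcde are not shown in full:

```r
library(dplyr)

# Hypothetical stand-ins for the objects used above
df   <- data.frame(d1 = c("a", "b", "c"),
                   d2 = c("a", "a", "d"),
                   d3 = c("e", "a", "a"))
cols <- c("d1", "d2", "d3")
bcde <- c("b", "c", "d", "e")

# Lambda form: test each selected column against bcde, then count TRUEs per row
res <- df %>%
  mutate(n_match = rowSums(across(all_of(cols), ~ .x %in% bcde)))
```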
How to use R dplyr's summarize to count the number of rows that match a criteria?
You can use sum on logical vectors - it will automatically convert them into numeric values (TRUE being equal to 1 and FALSE being equal to 0), so you need only do:
test %>%
group_by(location) %>%
summarize(total_score = sum(score),
n_outliers = sum(more_than_300))
#> # A tibble: 2 x 3
#> location total_score n_outliers
#> <chr> <dbl> <int>
#> 1 away 927 2
#> 2 home 552 0
Or, if these are your only 3 columns, an equivalent would be:
test %>%
group_by(location) %>%
summarize(across(everything(), sum))
In fact, you don't need to make the more_than_300 column - it would suffice to do:
test %>%
group_by(location) %>%
summarize(total_score = sum(score),
n_outliers = sum(score > 300))
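If score can contain NA, sum(score > 300) returns NA for that group unless na.rm = TRUE is set. The same logical-to-numeric conversion also means mean() gives the share of matching rows. A sketch on made-up data shaped like test above:

```r
library(dplyr)

# Hypothetical data in the shape used above, with one missing score
test <- data.frame(
  location = c("home", "home", "away", "away", "away"),
  score    = c(120, NA, 350, 410, 90)
)

res <- test %>%
  group_by(location) %>%
  summarize(
    n_outliers     = sum(score > 300, na.rm = TRUE),  # count of TRUEs, ignoring NA
    share_outliers = mean(score > 300, na.rm = TRUE)  # proportion of TRUEs
  )
```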
Count number of NA's in a Row in Specified Columns R
df$na_count <- rowSums(is.na(df[c('first', 'last', 'address', 'phone', 'state')]))
df
first m_initial last address phone state customer na_count
1 Bob L Turner 123 Turner Lane 410-3141 Iowa <NA> 0
2 Will P Williams 456 Williams Rd 491-2359 <NA> Y 1
3 Amanda C Jones 789 Haggerty <NA> <NA> Y 2
4 Lisa <NA> Evans <NA> <NA> <NA> N 3
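If you would rather stay inside a dplyr pipeline, the same count can probably be written with across(). This is only a sketch on a made-up subset of the columns above:

```r
library(dplyr)

# Hypothetical subset of the columns above; the values are made up
df <- data.frame(
  first = c("Bob", "Will", NA),
  last  = c("Turner", NA, NA),
  state = c("Iowa", "Ohio", NA)
)
cols <- c("first", "last", "state")

# is.na() is applied to each selected column, then TRUEs are summed per row
res <- df %>%
  mutate(na_count = rowSums(across(all_of(cols), is.na)))
```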
How can I count a number of conditional rows within r dplyr mutate?
Here is a dplyr-only solution:
The trick is to subtract the running count of X (cumsum(Product=="X")) from the total number of X (sum(Product=="X")) within each Customer group:
library(dplyr)
df %>%
arrange(Customer, Date) %>%
group_by(Customer) %>%
mutate(nSubsqX1 = sum(Product=="X") - cumsum(Product=="X"))
Date Customer Product nSubsqX1
<date> <chr> <chr> <int>
1 2020-05-18 A X 0
2 2020-02-10 B X 5
3 2020-02-12 B Y 5
4 2020-03-04 B Z 5
5 2020-03-29 B X 4
6 2020-04-08 B X 3
7 2020-04-30 B X 2
8 2020-05-13 B X 1
9 2020-05-23 B Y 1
10 2020-07-02 B Y 1
11 2020-08-26 B Y 1
12 2020-12-06 B X 0
13 2020-01-31 C X 3
14 2020-09-19 C X 2
15 2020-10-13 C X 1
16 2020-11-11 C X 0
17 2020-12-26 C Y 0
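To see why the subtraction works, it may help to look at the two pieces separately for a single customer. The vector below is made up for illustration:

```r
# Hypothetical purchase sequence for a single customer, already sorted by date
product <- c("X", "Y", "X", "X", "Y")

is_x    <- product == "X"
total   <- sum(is_x)      # 3 X's overall
running <- cumsum(is_x)   # 1 1 2 3 3: X's seen up to and including each row

subsequent <- total - running   # 2 2 1 0 0: X's still to come after each row
```

The current row is included in the running count, so an X row does not count itself as a subsequent purchase.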