Data wrangling: Aggregating by group sequentially reducing
We can group_by id, create a sequence from the current row_number to the total number of rows in each group, and concatenate the corresponding values with toString.
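library(dplyr)
library(purrr)  # map2_chr() comes from purrr, not tidyr

df %>%
  group_by(id) %>%
  # for each row, join the values from the current row to the last row of its group
  mutate(reqd1 = map2_chr(row_number(), n(), ~ toString(value[.x:.y])))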
#   id    value reqd    reqd1
#   <fct> <fct> <fct>   <chr>
# 1 a     x     x,z,p   x, z, p
# 2 a     z     z,p     z, p
# 3 a     p     p       p
# 4 b     q     q,q     q, q
# 5 b     q     q       q
# 6 c     m     m,n,x,y m, n, x, y
# 7 c     n     n,x,y   n, x, y
# 8 c     x     x,y     x, y
# 9 c     y     y       y
We can also do this using only base R, with ave:
with(df, ave(value, id, FUN = function(x)
  mapply(function(i, j) toString(x[i:j]), seq_along(x), length(x))))
#[1] "x, z, p" "z, p" "p" "q, q" "q" "m, n, x, y" "n, x, y" "x, y" "y"
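To try the base R approach end to end, here is a minimal sketch. The data frame is a hypothetical reconstruction from the printed output above, with value stored as character (an assumption, so that ave can write the joined strings back), and suffix_strings is an illustrative helper name, not something from the original answer.

```r
# Hypothetical reconstruction of df from the printed output above;
# value is kept as character so ave() can store the joined strings.
df <- data.frame(
  id    = c("a", "a", "a", "b", "b", "c", "c", "c", "c"),
  value = c("x", "z", "p", "q", "q", "m", "n", "x", "y"),
  stringsAsFactors = FALSE
)

# For each position i in a group, join the values from i to the end of the group.
suffix_strings <- function(x) {
  vapply(seq_along(x), function(i) toString(x[i:length(x)]), character(1))
}

# ave() applies the helper within each id and returns results in the original row order.
df$reqd1 <- ave(df$value, df$id, FUN = suffix_strings)
print(df)
```

Using vapply with character(1) instead of mapply is a small design choice: it guarantees one string per row and fails loudly if that ever stops being true.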
dplyr grouped cumulative set counting using group_by and rowwise do
You can use the Reduce function with accumulate = TRUE to build the cumulative sets of distinct elements, and then use the lengths function to return the cumulative distinct counts. This avoids the rowwise() operation:
library(dplyr)

testdf %>%
  arrange(desc(order)) %>%
  group_by(id) %>%
  # accumulate = TRUE keeps every intermediate union; lengths() counts each one
  mutate(cc = lengths(Reduce(function(x, y) unique(c(x, y)), content, accumulate = TRUE))) %>%
  arrange(id)
# Source: local data frame [6 x 4]
# Groups: id [2]
#   id     order content   cc
#   <fctr> <dbl> <list>    <int>
# 1 a      7     <chr [3]> 3
# 2 a      5     <chr [2]> 3
# 3 a      3     <chr [2]> 5
# 4 b      9     <chr [2]> 2
# 5 b      4     <chr [3]> 3
# 6 b      1     <chr [2]> 3
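The core of this answer can be seen without dplyr on a single list. The sketch below uses made-up data (the content list is hypothetical, not the original testdf) to show what Reduce with accumulate = TRUE returns and how lengths turns it into running distinct counts.

```r
# Hypothetical list column: each element is the set of items seen at one step.
content <- list(c("A", "B"), c("B", "C"), c("A", "D", "E"))

# accumulate = TRUE keeps every intermediate union instead of only the final one.
cum_sets <- Reduce(function(x, y) unique(c(x, y)), content, accumulate = TRUE)

# lengths() gives the size of each cumulative set, i.e. the running distinct count.
cc <- lengths(cum_sets)
print(cc)
```

In the grouped version, the same two calls run once per id inside mutate, which is why no rowwise() step is needed.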
Equivalent to cumsum for string in R
(df$B <- Reduce(paste, as.character(df$A), accumulate = TRUE))
# [1] "banana" "banana boats" "banana boats are" "banana boats are awesome"
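As a runnable sketch of the same idea, the input below is a hypothetical reconstruction from the printed result above. The second call, with a custom separator and the column name C, is an illustrative variation rather than part of the original answer.

```r
# Hypothetical input, reconstructed from the printed result above.
df <- data.frame(A = c("banana", "boats", "are", "awesome"),
                 stringsAsFactors = FALSE)

# Running concatenation with paste()'s default single-space separator.
df$B <- Reduce(paste, df$A, accumulate = TRUE)

# Same idea with a custom separator, wrapping paste() in a two-argument function.
df$C <- Reduce(function(acc, w) paste(acc, w, sep = "-"), df$A, accumulate = TRUE)
print(df)
```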