Cumulatively Paste (Concatenate) Values Grouped by Another Variable

Data wrangling: Aggregating by group sequentially reducing

We can group_by id, then for each row build the index sequence from the current row_number() to the group size n(), and concatenate the corresponding values with toString.

library(dplyr)
library(purrr)  # map2_chr() comes from purrr, not tidyr

df %>%
  group_by(id) %>%
  # for row i of a group with n rows, paste value[i:n]
  mutate(reqd1 = map2_chr(row_number(), n(), ~ toString(value[.x:.y])))

#   id    value reqd    reqd1
#   <fct> <fct> <fct>   <chr>
# 1 a     x     x,z,p   x, z, p
# 2 a     z     z,p     z, p
# 3 a     p     p       p
# 4 b     q     q,q     q, q
# 5 b     q     q       q
# 6 c     m     m,n,x,y m, n, x, y
# 7 c     n     n,x,y   n, x, y
# 8 c     x     x,y     x, y
# 9 c     y     y       y

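We can also do this in base R alone with ave. Because value is a factor here, convert it to character first so that ave can store the pasted strings:

with(df, ave(as.character(value), id, FUN = function(x)
  mapply(function(i, j) toString(x[i:j]), seq_along(x), length(x))))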

#[1] "x, z, p" "z, p" "p" "q, q" "q" "m, n, x, y" "n, x, y" "x, y" "y"
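
For a reproducible version of the base R approach, here is a minimal sketch. The data frame is reconstructed from the printed output above (the original df is not shown, so this is an assumption), and suffix_paste is a hypothetical helper name introduced only for illustration.

```r
# Sample data reconstructed from the printed output above (an assumption).
# stringsAsFactors = TRUE mirrors the <fct> columns in that output.
df <- data.frame(
  id    = c("a", "a", "a", "b", "b", "c", "c", "c", "c"),
  value = c("x", "z", "p", "q", "q", "m", "n", "x", "y"),
  stringsAsFactors = TRUE
)

# Hypothetical helper: for each position i, paste elements i..length(x).
suffix_paste <- function(x) {
  vapply(seq_along(x), function(i) toString(x[i:length(x)]), character(1))
}

# ave() applies the helper per group and returns results in original row order.
df$reqd1 <- ave(as.character(df$value), df$id, FUN = suffix_paste)
print(df)
```

Using vapply with character(1) instead of mapply is only a design preference: it fails loudly if the helper ever returns something other than a single string per row.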

dplyr grouped cumulative set counting using group_by and rowwise do

You can use Reduce with accumulate = TRUE to build the cumulative sets of distinct elements, then use lengths to return the cumulative distinct counts. This avoids the rowwise() operation:

library(dplyr)
testdf %>%
  arrange(desc(order)) %>%
  group_by(id) %>%
  # running union of the list-column entries, then the size of each union
  mutate(cc = lengths(Reduce(function(x, y) unique(c(x, y)), content, accumulate = TRUE))) %>%
  arrange(id)

# Source: local data frame [6 x 4]
# Groups: id [2]
#
#       id order   content    cc
#   <fctr> <dbl>    <list> <int>
# 1      a     7 <chr [3]>     3
# 2      a     5 <chr [2]>     3
# 3      a     3 <chr [2]>     5
# 4      b     9 <chr [2]>     2
# 5      b     4 <chr [3]>     3
# 6      b     1 <chr [2]>     3
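
To see what the Reduce step does on its own, here is a small base R sketch for a single group. The list values are hypothetical (testdf is not shown); they were chosen only so that the element counts match the first group in the output above.

```r
# Hypothetical list-column values for one group, already sorted by
# descending `order` as in the pipeline above.
content <- list(c("x", "y", "z"), c("x", "y"), c("a", "b"))

# accumulate = TRUE keeps every intermediate union, not just the final one.
running_sets <- Reduce(function(x, y) unique(c(x, y)), content, accumulate = TRUE)

# lengths() returns the size of each running set.
cc <- lengths(running_sets)
print(cc)
```

The second entry adds nothing new, so the count stays at 3 before rising to 5 once "a" and "b" appear.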

Equivalent to cumsum for string in R

(df$B <- Reduce(paste, as.character(df$A), accumulate = TRUE))
# [1] "banana" "banana boats" "banana boats are" "banana boats are awesome"
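
A self-contained sketch of the same idea follows. The sample vector is inferred from the printed result above, so treat it as an assumption. The second call adds right = TRUE, which accumulates from the end of the vector and so yields the "current element to last" strings used in the first section; column C is a name introduced here for illustration.

```r
# Sample data inferred from the printed result above (an assumption).
df <- data.frame(A = c("banana", "boats", "are", "awesome"))

# Left-to-right running paste: row i holds elements 1..i.
df$B <- Reduce(paste, as.character(df$A), accumulate = TRUE)

# right = TRUE accumulates from the end: row i holds elements i..n.
df$C <- Reduce(paste, as.character(df$A), accumulate = TRUE, right = TRUE)
print(df)
```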

