Grouped Moving Average in R

Rolling mean (moving average) by group/id with dplyr

If you are not committed to to dplyr this should work:

get.mav <- function(bp,n=2){
  require(zoo)
  if(is.na(bp[1])) bp[1] <- mean(bp,na.rm=TRUE)
  bp <- na.locf(bp,na.rm=FALSE)
  if(length(bp)<n) return(bp)
  c(bp[1:(n-1)],rollapply(bp,width=n,mean,align="right"))  
}
test <- with(test,test[order(ID,YEAR_VISIT),])

test$BLOOD_PRESSURE_UPDATED <- 
  unlist(aggregate(BLOOD_PRESSURE~ID,test,get.mav,na.action=NULL,n=2)$BLOOD_PRESSURE)
test
#    ID AGE YEAR_VISIT BLOOD_PRESSURE TREATMENT BLOOD_PRESSURE_UPDATED
# 1   1  20       2000             NA         3               134.6667
# 2   1  21       2001            129         2               131.8333
# 3   1  22       2002            145         3               137.0000
# 4   1  22       2002            130         2               137.5000
# 5   2  23       2003             NA        NA               130.0000
# 6   2  30       2010            150         2               140.0000
# 7   2  31       2011            110         3               130.0000
# ...

This works for moving averages > 2 as well.

And here's a data.table solution, which is likely to be much faster if your dataset is large.

library(data.table)
setDT(test)     # converts test to a data.table in place
setkey(test,ID,YEAR_VISIT)
test[,BLOOD_PRESSURE_UPDATED:=as.numeric(get.mav(BLOOD_PRESSURE,2)),by=ID]
test
#    ID AGE YEAR_VISIT BLOOD_PRESSURE TREATMENT BLOOD_PRESSURE_UPDATED
#  1:  1  20       2000             NA         3               134.6667
#  2:  1  21       2001            129         2               131.8333
#  3:  1  22       2002            145         3               137.0000
#  4:  1  22       2002            130         2               137.5000
#  5:  2  23       2003             NA        NA               130.0000
#  6:  2  30       2010            150         2               140.0000
#  7:  2  31       2011            110         3               130.0000
# ...

7 Day Moving Average per group - R

You didn't supply enough data to create a weekly rolling mean within the groups, but in principle it could work like this:

library(tidyverse)
library(zoo)

my_data <- my_data %>% 
  group_by(ID) %>% 
  mutate(roll_mean = rollmean(Count, 2, na.pad = T))

Using dplyr you group_by your ID variable, and then create a single new column with the rolling mean. You can plot this then with standard ggplot2-syntax:

ggplot(my_data, aes(Date, Count, group = 1)) +
  geom_line(colour = "blue") +
  geom_point(colour = "blue") +
  geom_point(aes(y = roll_mean), colour = "red") +
  facet_wrap(~ID)
#> Warning: Removed 3 rows containing missing values (geom_point).

Sample Image

Data

zzz <- "YYYYMM    Date         ID    Count
201401    01/01/2014   A     151
201401    01/01/2014   B     68
201401    01/01/2014   C     487
201401    02/01/2014   A     198
201401    02/01/2014   B     97
201401    02/01/2014   C     403"

my_data <- read_table(zzz)

Grouped moving average in r

Here is a rollapply solution. Note that it appears that you want the average of the prior two or three rows in the same group, i.e. excluding the data on the current row.

library(zoo)

roll <- function(x, n) { 
   if (length(x) <= n) NA 
   else rollapply(x, list(-seq(n)), mean, fill = NA)
}
transform(DF, AVG2 = ave(Score, school, Student, FUN = function(x) roll(x, 2)),
              AVG3 = ave(Score, school, Student, FUN = function(x) roll(x, 3)))

giving:

   school Student Score AVG2     AVG3
1       I       S     5   NA       NA
2       B       S     2   NA       NA
3       B       S     7   NA       NA
4       B       O     3   NA       NA
5       B       O     9   NA       NA
6       I       O     6   NA       NA
7       I       O     3   NA       NA
8       I       S     7   NA       NA
9       I       O     1  4.5       NA
10      B       S     7  4.5       NA
11      I       S     3  6.0       NA
12      I       O     8  2.0 3.333333
13      B       S     3  7.0 5.333333
14      I       O     4  4.5 4.000000
15      B       O     1  6.0       NA
16      I       S     9  5.0 5.000000
17      B       S     4  5.0 5.666667
18      B       O     6  5.0 4.333333
19      I       S     3  6.0 6.333333
20      I       O     8  6.0 4.333333
21      B       S     3  3.5 4.666667
22      I       O     4  6.0 6.666667
23      B       O     1  3.5 5.333333
24      I       S     9  6.0 5.000000
25      B       S     4  3.5 3.333333
26      B       O     6  3.5 2.666667
27      I       J     6   NA       NA

Update: Fixed roll.

Calculating (something similar to) moving averages with grouped data in R?

There are several problems:

By writing flights$ the code is telling it to override the grouping and use the original ungrouped vector. Remove flights$ .
summarize is used when one row per group is desired but here it appears we want a result having the same number of rows as the input so use mutate rather than summarize.
There are unneeded parentheses here and while they are not wrong it makes it harder to read. When expressions are potentially ambiguous or rely on rules the reader may have to look up it is a good idea to use extra parentheses but that is not the situation here.
ungroup at the end so we are not left with a grouped data frame.
dplyr clobbers lag and filter in base R so it will conflict with many other packages. Always exclude these in the library statement. This does not affect the code here since neither of those are used but as a precaution I always do that.
Seems unnecessary to load all of the tidyverse when the code is only using dplyr and its dependencies.

library(dplyr, exclude = c("lag", "filter"))
library(nycflights13)
library(zoo)

delay_rate <- flights %>% 
  group_by(year, month, day) %>%
  mutate(delay_rate = rollsumr(dep_delay, k = 7, fill = NA) / 
    rollsumr(arr_delay, k = 7, fill = NA)) %>%
  ungroup

Moving average with grouped data

It's a little ambiguous, but I think you want this:

test <- cbind(time=rownames(test), test)  # first add a time variable

# then create a list with rolling mean for each id and time
ls1 <- lapply(seq_along(test$time), 
              function(x) cbind(time=x,  # time variable
                                with(test[test$time %in% 1:x, ], 
                                     aggregate(list(VES_2A=VES_2A), 
                                               list(Index=Index), mean))  # rolling mean
                                ))

tot <- transform(t(sapply(ls1, colMeans)), Index="total")  # occasionally add a total column

long <- rbind(do.call(rbind, ls1), tot)  # bind all rows together into long format data frame
wide <- reshape2::dcast(long, time ~ Index)  # reshape to wide w/ e.g. reshape2::dcast()
rm(ls1, tot)  # clean up

Yielding

> wide
   time    1        2        3        5        6    total
1     1 1.00       NA       NA       NA       NA 1.000000
2     2 1.00       NA       NA       NA       NA 1.000000
3     3 1.00       NA       NA       NA       NA 1.000000
4     4 0.75       NA       NA       NA       NA 0.750000
5     5 0.75 3.000000       NA       NA       NA 1.875000
6     6 0.75 2.000000       NA       NA       NA 1.375000
7     7 0.75 1.666667       NA       NA       NA 1.208333
8     8 0.75 1.500000       NA       NA       NA 1.125000
9     9 0.75 1.500000 3.000000       NA       NA 1.750000
10   10 0.75 1.500000 2.500000       NA       NA 1.583333
11   11 0.75 1.500000 2.333333       NA       NA 1.527778
12   12 0.75 1.500000 1.750000       NA       NA 1.333333
13   13 0.75 1.500000 1.750000 2.000000       NA 1.500000
14   14 0.75 1.500000 1.750000 1.000000       NA 1.250000
15   15 0.75 1.500000 1.750000 1.333333       NA 1.333333
16   16 0.75 1.500000 1.750000 1.250000       NA 1.312500
17   17 0.75 1.500000 1.750000 1.250000 1.000000 1.250000
18   18 0.75 1.500000 1.750000 1.250000 1.500000 1.350000
19   19 0.75 1.500000 1.750000 1.250000 1.333333 1.316667
20   20 0.75 1.500000 1.750000 1.250000 1.250000 1.300000

Plot

library(ggplot2)
ggplot(long, aes(time, VES_2A, color=Index)) +
  geom_line()

Sample Image

Tell me what you think, hope that's what you've wanted.

Data

test <- structure(list(VES_2A = c(1L, 1L, 1L, 0L, 3L, 1L, 1L, 1L, 3L, 
                                  2L, 2L, 0L, 2L, 0L, 2L, 1L, 1L, 2L, 1L, 1L), Index = c(1L, 1L, 
                                                                                         1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 5L, 5L, 5L, 5L, 6L, 6L, 
                                                                                         6L, 6L)), class = "data.frame", row.names = c(NA, -20L))

Moving Averages on multiple columns - Grouped Data

We could use rollmean from the zoo package, in combination with data.table .

library(data.table)
library(zoo)
setDT(df)[, c("Qty.mean","To.mean") := lapply(.SD, rollmean, k = 3, fill = NA, align = "right"), 
              .SDcols = c("Qty","To"), by = Section]
 > df
 #   Week Section      Qty       To Qty.mean  To.mean
 #1:    1       a 145.4814 73.49183       NA       NA
 #2:    2       a 348.9198 51.44893       NA       NA
 #3:    3       a 343.7099 50.67283 279.3703 58.53786
 #4:    4       a 349.3518 47.46891 347.3271 49.86356
 #5:    5       a 444.3662 49.28904 379.1426 49.14359
 #6:    1       b 356.1242 52.66450       NA       NA
 #7:    2       b 103.7983 52.10773       NA       NA
 #8:    3       b 193.0202 46.36184 217.6476 50.37802
 #9:    4       b 366.4335 41.59984 221.0840 46.68980
#10:    5       b 305.7005 48.75198 288.3847 45.57122
#11:    1       c 377.4365 72.42394       NA       NA
#12:    2       c 317.9899 61.02790       NA       NA
#13:    3       c 213.0934 76.58633 302.8400 70.01272
#14:    4       c 469.3734 73.25380 333.4856 70.28934
#15:    5       c 216.9263 41.83081 299.7977 63.89031