Calculate monthly average of ts object
Ok, so I should given Google one more search before coming to SO as this post is relevant. The cycle()
function appears to be useful for these sorts of things:
> tapply(dat, cycle(dat), mean)
1 2 3 4 5 6 7 8 9
295.33333 277.33333 298.00000 269.33333 276.66667 282.33333 314.00000 323.00000 313.66667
10 11 12
310.33333 291.33333 296.66667
> aggregate(c(dat), list(month = cycle(dat)), mean)
month x
1 1 295.33333
2 2 277.33333
3 3 298.00000
....
Anything else fundamental I'm missing here?
Monthly summary of a time series in r
This might be a bit clunky, and I'm sure there is a better zoo
function for this, but in the spirit of learning something, you can do this with base R functions working on a ts
object:
window(ddd1, c(1960,1), deltat=1)
...will start at 1960.1
(January) and extract a value every 12 months (deltat=12/12=1), which means you can then sapply
this function like:
sapply(1:12, function(x) mean(window(ddd1, c(1960,x), deltat=1)) )
#[1] 49 50 51 52 47 48 49 50 51 52 53 54
R - average of monthly sums of time series
library(hydroTSM)
#This data is daily streamflows, but is similar to Precipitation
data(OcaEnOnaQts)
x <- OcaEnOnaQts
#In case you want monthly precipitation in "precipitation / 30 days" (what is common) you can use
monthlyfunction(x, FUN=mean, na.rm=TRUE) * 30
#In case you want the precipitation per days in specific month you can use
monthlyfunction(x, FUN=mean, na.rm=TRUE) * as.vector(dwi(x, out.unit = "months") * mean(dwi(x)) / sum(dwi(x)))
#or approximately
monthlyfunction(x, FUN=mean, na.rm=TRUE)*c(31,28.25,31,30,31,30,31,31,30,31,30,31)
#Add: Some ways to come to the mean monthly precipitation
p1980 <- c(rep(0,28), 50, 50, 50) #sum = 150
p1981 <- c(rep(0,28), 60, 60, 60) #sum = 180
p1982 <- c(rep(0,28), 30, 30, 30) #sum = 90
#
mean(c(sum(p1980), sum(p1981), sum(p1982))) # = 140 This is how you want it to be calculated
mean(c(p1980, p1981, p1982))*31 # = 140 This is how I suggested to come to the result
#Some other ways to come to the mean monthly precipitation
mean(c(mean(p1980), mean(p1981), mean(p1982)))*31 # = 140
sum(c(p1980, p1981, p1982))/3 # = 140
How to calculate the average year
The lubridate
package could be useful for you. I would use the functions year()
and month()
in conjunction with aggregate()
:
library(xts)
library(lubridate)
#set up some sample data
dates = seq(as.Date('2000/01/01'), as.Date('2005/01/01'), by="month")
df = data.frame(rand1 = runif(length(dates)), rand2 = runif(length(dates)))
my_xts = xts(df, dates)
#get the mean by year
aggregate(my_xts$rand1, by=year(index(my_xts)), FUN=mean)
This outputs something like:
2000 0.5947939
2001 0.4968154
2002 0.4941752
2003 0.5291211
2004 0.6631564
To find the mean for each month you can do:
#get the mean by month
aggregate(my_xts$rand1, by=month(index(my_xts)), FUN=mean)
which will output something like
1 0.5560279
2 0.6352220
3 0.3308571
4 0.6709439
5 0.6698147
6 0.7483192
7 0.5147294
8 0.3724472
9 0.3266859
10 0.5331233
11 0.5490693
12 0.4642588
Calculate Monthly Average With Multiple Records in a Month
If you want the average per month then just group by your current date field.
If you want the average per month regardless of year then you would have to extract the month part of the current date field and group by that.
But your date field now appears to be having string data type; it would be better to use proper date data type. Then your analysis would be much easier, more flexible, better performing.
R - how to calculate global monthly means of a zoo object
1 You can aggregate with a data.table
library(data.table)
# This turns all Jans to 1 and Decs to 12 for example
mth <- month(as.Date(df$date))
dt2 <- as.data.table(df) # turn df into data table dt
dt2[, mth := mth] # pop month into your data frame
setkey(dt2, "mth") # data tables like a key!
# syntax of data table is dt[i, j, by]
# “Take DT, subset rows using i, then calculate j grouped by by”
result <- dt2[, .(meancru = mean(cru), meanmodel = mean(model)), by = mth]
This gets you:
mth meancru meanmodel
1 1 69.21975 70.69146
2 2 67.83230 69.27852
...
If you are not comfortable with data.table
then there is a good cheat sheet here.
2 using base R
Using the mth code above, then:
df <- cbind(df,mth)
result <- aggregate(df, list(mth), mean)
But format needs some cleanup.
If you need to do more with your data, then you will appreciate the speed of data.tables
, ability to index and add new variables.
Annual, monthly or daily mean for irregular time series
Convert your data to an xts object, then use apply.daily
et al to calculate whatever values you want.
library(xts)
d <- structure(list(dates = c("12/03/2012 11:26", "12/03/2012 11:56",
"12/03/2012 12:26"), temperature = c(9.7533, 9.6673, 9.6673),
depth = c(0.48073, 0.33281, 0.33281), salinity = c(37.607,
37.662, 37.672)), .Names = c("dates", "temperature", "depth",
"salinity"), row.names = c(NA, -3L), class = "data.frame")
x <- xts(d[,-1], as.POSIXct(d[,1], format="%m/%d/%Y %H:%M"))
apply.daily(x, colMeans)
# temperature depth salinity
# 2012-12-03 12:26:00 9.695967 0.3821167 37.647
Daily Average of Time series derived from monthly data R monthdays()
I am not able to implement it here.
I'm not sure why you couldn't. The monthdays
function from the forecast package, when applied to a ts
object, returns the number of days in each month of the series. The object returned is a time-series of the same dimension as the input. So you can simply divide them.
library(forecast)
ts/monthdays(ts)
Jan Feb Mar Apr May Jun Jul
2013 766.7742 838.6429 761.3226 762.9667 754.8710 808.0000
2014 863.3871 986.3929 849.7097 910.6667 883.5161 913.9333
2015 914.6452 980.4286 914.0645 966.6000 922.4194 944.6000
2016 974.4839 1057.2069 1017.8710 1008.2667 979.2903 1036.3667
2017 1038.0645 1150.1071 1020.6129 1059.4667 1044.3548 1028.5333
monthsdays(ts) # Accepts a time-series object
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2013 31 28 31 30 31 30 31 31 30 31 30 31
2014 31 28 31 30 31 30 31 31 30 31 30 31
2015 31 28 31 30 31 30 31 31 30 31 30 31
2016 31 29 31 30 31 30 31 31 30 31 30 31
2017 31 28 31 30 31 30 31 31 30 31 30 31
Compute daily, month and annual average of several data sets
When working with hydro-meteorological data, I usually use xts
and hydroTSM
packages as they have many functions for data aggregation.
You didn't provide any data so I created one for demonstration purpose
library(xts)
library(hydroTSM)
# Generate random data
set.seed(2018)
date = seq(from = as.Date("2016-01-01"), to = as.Date("2018-12-31"),
by = "days")
temperature = runif(length(date), -15, 35)
dat <- data.frame(date, temperature)
# Convert to xts object for xts & hydroTSM functions
dat_xts <- xts(dat[, -1], order.by = dat$date)
# All daily, monthly & annual series in one plot
hydroplot(dat_xts, pfreq = "dma", var.type = "Temperature")
# Weekly average
dat_weekly <- apply.weekly(dat_xts, FUN = mean)
plot(dat_weekly)
# Monthly average
dat_monthly <- daily2monthly(dat_xts, FUN = mean, na.rm = TRUE)
plot.zoo(dat_monthly, xaxt = "n", xlab = "")
axis.Date(1, at = pretty(index(dat_monthly)),
labels = format(pretty(index(dat_monthly)), format = "%b-%Y"),
las = 1, cex.axis = 1.1)
# Seasonal average: need to specify the months
dat_seasonal <- dm2seasonal(dat_xts, season = "DJF", FUN = mean, na.rm = TRUE)
plot(dat_seasonal)
# Annual average
dat_annual <- daily2annual(dat_xts, FUN = mean, na.rm = TRUE)
plot(dat_annual)
Edit: using OP's data
df <- readr::read_csv2("Temp_2014_Hour.csv")
str(df)
# Convert DATE to Date object & put in a new column
df$date <- as.Date(df$DATE, format = "%d/%m/%Y")
dat <- df[, c("date", "VALUE")]
str(dat)
dat_xts <- xts(dat[, -1], order.by = dat$date)
Created on 2018-02-28 by the reprex package (v0.2.0).
Compute monthly averages from daily data
One way, using base
R would be to make sure your dates are of class Date
or similar ( e.g. POSIXct) if you haven't already, and then to extract the months and years (as your data spans more than one year) and aggregate like so:
# Convert to date if not already
df1$X1 <- as.Date(df1$X1)
# Get months
df1$Month <- months(df1$X1)
# Get years
df1$Year <- format(df1$X1,format="%y")
# Aggregate 'X2' on months and year and get mean
aggregate( X2 ~ Month + Year , df1 , mean )
# Month Year X2
#1 December 09 0.0000000
#2 February 10 0.1714286
#3 January 10 1.2074074
There are quite a few ways of doing this if you have a look around.
Related Topics
How to Manually Change the Key Labels in a Legend in Ggplot2
How to Order Bars Within All Facets
Partially Read Really Large CSV.Gz in R Using Vroom
Creating Accompanying Slides for Bookdown Project
Shapes and Linetypes in Ggplot
Show Element Values in Barplot
Reshape Data Long to Wide - Understanding Reshape Parameters
C5.0 Decision Tree - C50 Code Called Exit with Value 1
Why Does Rm Inside a Function Not Delete Objects
Find All Positions of All Matches of One Vector of Values in Second Vector
Loess Regression on Each Group with Dplyr::Group_By()
Display a Summary Line Per Facet Rather Than Overall
Dplyr String as Column Reference
Finding Overlapping Ranges Between Two Interval Data
Can 'Ddply' (Or Similar) Do a Sliding Window
Print to PDF File Using Grid.Table in R - Too Many Rows to Fit on One Page