Get Dates of a Certain Weekday from a Year in R
EDIT: On further reflection, here's a cleaner function for doing the same thing:
getAllMondays <- function(year) {
    days <- as.POSIXlt(paste(year, 1:366, sep="-"), format="%Y-%j")
    Ms <- days[days$wday==1]
    Ms[!is.na(Ms)]  # Needed to remove NA from day 366 in non-leap years
}
getAllMondays(2012)
Here's a function that performs the more general task of finding the first Monday in an arbitrary year and then listing it and all of the other Mondays in that year. It uses seq.POSIXt() and the argument by = "week" (which is also available for seq.Date()).
getAllMondays <- function(year) {
    day1 <- as.POSIXlt(paste(year, "01-01", sep="-"))
    day365 <- as.POSIXlt(paste(year, "12-31", sep="-"))
    # Find the first Monday of year
    week1 <- as.POSIXlt(seq(day1, length.out=7, by="day"))
    monday1 <- week1[week1$wday == 1]
    # Return all Mondays in year
    seq(monday1, day365, by="week")
}
head(getAllMondays(2012))
# [1] "2012-01-02 PST" "2012-01-09 PST" "2012-01-16 PST" "2012-01-23 PST"
# [5] "2012-01-30 PST" "2012-02-06 PST"
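Since by = "week" also works with seq.Date(), the same idea can be sketched for any weekday without time zones getting involved. The helper name getAllWeekdays and its wd argument are hypothetical (not part of the answer above); wd follows the %u convention (1 = Monday, ..., 7 = Sunday).

```r
# Hypothetical helper: all dates in `year` that fall on weekday `wd`
# (1 = Monday, ..., 7 = Sunday, as returned by format(x, "%u")).
getAllWeekdays <- function(year, wd = 1) {
  day1   <- as.Date(paste(year, "01-01", sep = "-"))
  day365 <- as.Date(paste(year, "12-31", sep = "-"))
  # The requested weekday occurs exactly once in the first seven days
  week1 <- seq(day1, length.out = 7, by = "day")
  first <- week1[as.integer(format(week1, "%u")) == wd]
  # Step forward one week at a time until the end of the year
  seq(first, day365, by = "week")
}

head(getAllWeekdays(2012), 3)
# [1] "2012-01-02" "2012-01-09" "2012-01-16"
```

Using %u instead of weekdays() keeps the comparison independent of the session locale.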
Find the day of a week
df = data.frame(date=c("2012-02-01", "2012-02-01", "2012-02-02"))
df$day <- weekdays(as.Date(df$date))
df
##         date       day
## 1 2012-02-01 Wednesday
## 2 2012-02-01 Wednesday
## 3 2012-02-02  Thursday
Edit: Just to show another way...
The wday component of a POSIXlt object is the numeric weekday (0-6 starting on Sunday).
as.POSIXlt(df$date)$wday
## [1] 3 3 4
which you could use to subset a character vector of weekday names
c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday",
"Friday", "Saturday")[as.POSIXlt(df$date)$wday + 1]
## [1] "Wednesday" "Wednesday" "Thursday"
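One reason to prefer the numeric route is that weekdays() returns names in the current locale. A minimal sketch, assuming English labels are wanted regardless of locale and that calendar ordering is useful downstream; the names day_names and idx are illustrative only:

```r
df <- data.frame(date = c("2012-02-01", "2012-02-01", "2012-02-02"))

day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")

# %w yields 0-6 with Sunday as 0, whatever the locale; +1 for R's 1-based indexing
idx <- as.integer(format(as.Date(df$date), "%w")) + 1L

# A factor with fixed levels keeps weekdays in calendar order when tabulating
df$day <- factor(day_names[idx], levels = day_names)
table(df$day)
```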
Subset dates with a given weekday and select next date if weekday is missing
An alternative with findInterval.
Create a sequence of dates ('tmp'), from the focal weekday ('wd') in the week of min 'dates', to max 'dates'.
Select dates corresponding to the focal weekday ('wds').
Select working days from 'dates' ('dates_1_5').
Use findInterval to roll 'wds' to closest available working day in 'dates_1_5'.
f <- function(wd, dates){
  tmp <- seq(as.Date(paste(format(min(dates), "%Y-%W"), wd, sep = "-"),
                     format = "%Y-%W-%u"),
             max(dates), by = 1)
  wds <- tmp[as.integer(format(tmp, "%u")) == wd]
  dates_1_5 <- dates[as.integer(format(dates, "%u")) %in% 1:5]
  dates_1_5[findInterval(wds, dates_1_5, left.open = TRUE) + 1]
}
Some examples:
d <- seq.Date(as.Date("2017-11-16"), as.Date("2017-11-24"), by = 1)
dates <- d[d != as.Date("2017-11-23")]
f(wd = 4, dates)
# [1] "2017-11-16" "2017-11-24"
dates <- d[d != as.Date("2017-11-16")]
f(wd = 4, dates)
# [1] "2017-11-17" "2017-11-23"
dates <- d[!(d %in% as.Date(c("2017-11-16", "2017-11-17", "2017-11-21", "2017-11-23")))]
f(wd = 2, dates)
# [1] "2017-11-20" "2017-11-22"
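To see why left.open = TRUE plus 1 rolls forward, it may help to look at findInterval on its own. This is only an illustration; the vectors available and targets are made up, mirroring the last example above:

```r
available <- as.Date(c("2017-11-20", "2017-11-22", "2017-11-24"))
targets   <- as.Date(c("2017-11-14", "2017-11-21", "2017-11-22"))

# With left.open = TRUE the intervals are (v[i], v[i + 1]], so the result
# counts the available dates strictly before each target
idx <- findInterval(targets, available, left.open = TRUE)
idx
# [1] 0 1 1

# Adding 1 therefore picks the first available date on or after the target
available[idx + 1]
# [1] "2017-11-20" "2017-11-22" "2017-11-22"
```

A target later than the last available date would index past the end of the vector and yield NA.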
Slightly more compact using a data.table rolling join:
library(data.table)
wd <- 2
# using 'dates' from above
d1 <- data.table(dates)
d2 <- data.table(dates = seq(as.Date(paste(format(min(dates), "%Y-%W"), wd, sep = "-"),
                                     format = "%Y-%W-%u"),
                             max(dates), by = 1))
d1[wday(dates) %in% 2:6][d2[wday(dates) == wd + 1],
on = "dates", .(x.dates), roll = -Inf]
...or a non-equi join:
d1[wday(dates) %in% 2:6][d2[wday(dates) == wd + 1],
on = .(dates >= dates), .(x.dates), mult = "first"]
If desired, just wrap in a function as above.
Calculate the number of weekdays between 2 dates in R
Date1 <- as.Date("2011-01-30")
Date2 <- as.Date("2011-02-04")
sum(!weekdays(seq(Date1, Date2, "days")) %in% c("Saturday", "Sunday"))
EDIT: And Zach said, let there be Vectorize :)
Dates1 <- as.Date("2011-01-30") + rep(0, 10)
Dates2 <- as.Date("2011-02-04") + seq(0, 9)
Nweekdays <- Vectorize(function(a, b)
sum(!weekdays(seq(a, b, "days")) %in% c("Saturday", "Sunday")))
Nweekdays(Dates1, Dates2)
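Because weekdays() returns locale-dependent names, the comparison with "Saturday" and "Sunday" only works in an English locale. A possible locale-independent variant, sketched with the hypothetical name n_weekdays, uses the numeric %u weekday and calls mapply directly (which is what Vectorize wraps):

```r
# Hypothetical variant: count Monday-Friday days in the closed range [a, b]
n_weekdays <- function(a, b) {
  days <- seq(a, b, by = "day")
  # %u is 1-7 with Monday as 1, so 1-5 are the working days
  sum(as.integer(format(days, "%u")) <= 5L)
}

Dates1 <- as.Date("2011-01-30") + rep(0, 10)
Dates2 <- as.Date("2011-02-04") + seq(0, 9)
mapply(n_weekdays, Dates1, Dates2)
# [1]  5  5  5  6  7  8  9 10 10 10
```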
How to row bind all cases of a particular weekday in a given year-month into an R dataset
Here is an option:
alldays[dt1[, .(id, zip, admit=0L, year, month, dow)],
on=.(year, month, dow), allow.cartesian=TRUE][
dt1, on=.(id, date=Adate), admit := i.admit][]
output:
         date year month day      dow id   zip admit
1: 2010-07-01 2010     7   1 Thursday  1 54123     0
2: 2010-07-08 2010     7   8 Thursday  1 54123     0
3: 2010-07-15 2010     7  15 Thursday  1 54123     1
4: 2010-07-22 2010     7  22 Thursday  1 54123     0
5: 2010-07-29 2010     7  29 Thursday  1 54123     0
6: 2011-03-07 2011     3   7   Monday  2 54789     0
7: 2011-03-14 2011     3  14   Monday  2 54789     1
8: 2011-03-21 2011     3  21   Monday  2 54789     0
9: 2011-03-28 2011     3  28   Monday  2 54789     0
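The snippet relies on two tables, alldays and dt1, that are not shown here. The following is a guess at inputs consistent with the output above, not the asker's actual data: dt1 holds one admission per id with derived year, month and dow columns, and alldays is a calendar table with one row per day.

```r
library(data.table)

# Hypothetical inputs, reconstructed from the output shown above
dt1 <- data.table(id = 1:2, zip = c(54123L, 54789L),
                  Adate = as.Date(c("2010-07-15", "2011-03-14")),
                  admit = 1L)
dt1[, `:=`(year = year(Adate), month = month(Adate), dow = weekdays(Adate))]

# Calendar table: one row per day, with the same derived columns
alldays <- data.table(date = seq(as.Date("2010-01-01"), as.Date("2011-12-31"), by = "day"))
alldays[, `:=`(year = year(date), month = month(date),
               day = mday(date), dow = weekdays(date))]

# Step 1: expand each case to every matching weekday in its year-month (admit = 0)
# Step 2: update join that sets admit back to 1 on the actual admission date
res <- alldays[dt1[, .(id, zip, admit = 0L, year, month, dow)],
               on = .(year, month, dow), allow.cartesian = TRUE][
               dt1, on = .(id, date = Adate), admit := i.admit][]
nrow(res)
```

Both tables derive dow with weekdays() in the same session, so the join on dow works in any locale even though the printed names would differ.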
Transform year/week to date object
Before converting a year-week to a date, you have to specify a day of the week; more importantly, you have to establish which of the different week-numbering conventions is being used.
Base R's strptime() function knows 3 definitions of week of the year (but supports only 2 of them on input) and 2 definitions of weekday number, see ?strptime:
Week of the year
- US convention: Week of the year as decimal number (00–53) using Sunday as the first day of the week (and typically with the first Sunday of the year as day 1 of week 1): %U
- UK convention: Week of the year as decimal number (00–53) using Monday as the first day of the week (and typically with the first Monday of the year as day 1 of week 1): %W
- ISO 8601 definition: Week of the year as decimal number (01–53) as defined in ISO 8601. If the week (starting on Monday) containing 1 January has four or more days in the new year, then it is considered week 1. Otherwise, it is the last week of the previous year, and the next week is week 1: %V, which is accepted but ignored on input.
Note that there is also a week-based year (%G and %g), which is to be used with %V as it may differ from the calendar year (%Y and %y).
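All three conventions are available on output, so the differences can be inspected with format(). A small illustration with two dates picked for this purpose: 1 January 2018 (a Monday) and 1 January 2016 (a Friday, which ISO 8601 assigns to the last week of 2015):

```r
d <- as.Date(c("2018-01-01", "2016-01-01"))

wk <- data.frame(date = d,
                 US   = format(d, "%U"),      # weeks start on Sunday
                 UK   = format(d, "%W"),      # weeks start on Monday
                 ISO  = format(d, "%G-W%V"))  # ISO week with its week-based year
wk
#         date US UK      ISO
# 1 2018-01-01 00 01 2018-W01
# 2 2016-01-01 00 00 2015-W53
```

Days before the first Sunday (%U) or first Monday (%W) of the year land in week 00, whereas ISO 8601 attaches them to a week of the previous week-based year.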
Numeric weekday
- Weekday as a decimal number (1–7, Monday is 1): %u
- Weekday as decimal number (0–6, Sunday is 0): %w
- Interestingly, there is no format for the case where Sunday is counted as day 1 of the week.
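The two weekday numberings differ only for Sunday. If a numbering with Sunday as day 1 is needed, one simple workaround (a sketch, not a built-in format) is to shift %w by one:

```r
d <- as.Date("2015-03-01") + 0:1   # a Sunday and the following Monday

format(d, "%u")
# [1] "7" "1"
format(d, "%w")
# [1] "0" "1"

# Sunday counted as day 1 of the week: shift the 0-6 numbering to 1-7
as.integer(format(d, "%w")) + 1L
# [1] 1 2
```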
Converting year-week-day with the different conventions
If we append day 1 to the string and use the different formats, we get:
as.Date("2015101", "%Y%U%u")
# [1] "2015-03-09"
as.Date("2015101", "%Y%U%w")
# [1] "2015-03-09"
as.Date("2015101", "%Y%W%u")
# [1] "2015-03-09"
as.Date("2015101", "%Y%W%w")
# [1] "2015-03-09"
as.Date("2015101", "%G%V%u")
# [1] NA
For weekday formats %u and %w we get the same result because day 1 is Monday in both conventions (but watch out when dealing with Sundays).
For 2015, the US and UK definitions of week of the year coincide, but this is not true for all years, e.g., not for 2001, 2007, and 2018:
as.Date("2018101", "%Y%U%u")
#[1] "2018-03-12"
as.Date("2018101", "%Y%W%u")
#[1] "2018-03-05"
The ISO 8601 format specifiers aren't supported on input. Therefore, I created the ISOweek package some years ago:
ISOweek::ISOweek2date("2015-W10-1")
#[1] "2015-03-02"
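If installing a package is not an option, the conversion can also be sketched in base R. This is not how ISOweek is implemented; it is a hypothetical helper (iso_week_to_date) that relies on the fact that 4 January always falls in ISO week 1:

```r
# Hypothetical helper, not the ISOweek implementation
iso_week_to_date <- function(year, week, weekday = 1L) {
  jan4 <- as.Date(paste(year, "01-04", sep = "-"))
  # 4 January always lies in ISO week 1; step back to that week's Monday
  week1_monday <- jan4 - (as.integer(format(jan4, "%u")) - 1L)
  # Advance by whole weeks, then by the day within the week (1 = Monday)
  week1_monday + (week - 1L) * 7L + (weekday - 1L)
}

iso_week_to_date(2015, 10, 1)
# [1] "2015-03-02"
```

The sketch does no validation, so a week number such as 53 in a year with only 52 ISO weeks silently spills over into the next year.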
Edit: Using Thursday to associate a week with a month
As mentioned above you need to specify a day of the week to get a full calendar date. This is also required if the dates need to be aggregated by month later on.
If no weekday is specified and the dates are supposed to be aggregated by month later on, you may take the Thursday of each week as the reference day (following a suggestion by djhurio). This ensures that the whole week is assigned to the month to which the majority of its days belong.
For example, taking Sunday as reference day would return
ISOweek::ISOweek2date("2015-W09-7")
# [1] "2015-03-01"
which consequently would associate the whole week to the month of March although only one day of the week belongs to March while the other 6 days belong to February. Taking Thursday as reference day will return a date in February:
ISOweek::ISOweek2date("2015-W09-4")
# [1] "2015-02-26"
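The same rule can be applied to ordinary dates when aggregating by month. A minimal base R sketch, with the hypothetical helper name week_month, maps each date to the month of the Thursday in its Monday-based week:

```r
# Hypothetical helper: month ("YYYY-MM") owning the Monday-based week of `d`
week_month <- function(d) {
  d <- as.Date(d)
  monday   <- d - (as.integer(format(d, "%u")) - 1L)  # Monday of d's week
  thursday <- monday + 3L                             # reference day
  format(thursday, "%Y-%m")
}

week_month(c("2015-02-23", "2015-03-01"))
# [1] "2015-02" "2015-02"
```

Both the Monday and the Sunday of that week map to February, matching the ISOweek example above.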