Check if a date is within an interval in R
Everybody has their favourite tool for this; mine happens to be data.table because of what it refers to as its dt[i, j, by] logic.
library(data.table)
dt <- data.table(date = as.IDate(pt))
dt[, YR := 0.0 ] # I am using a numeric for year here...
dt[ date >= as.IDate("2002-09-01") & date <= as.IDate("2003-08-31"), YR := 1 ]
dt[ date >= as.IDate("2003-09-01") & date <= as.IDate("2004-08-31"), YR := 2 ]
dt[ date >= as.IDate("2004-09-01") & date <= as.IDate("2005-08-31"), YR := 3 ]
I create a data.table object, converting your times to dates for later comparison. I then set up a new column, defaulting to zero.
We then execute three conditional statements: for each of the three intervals (which I just create by hand using the endpoints), we set the YR value to 1, 2 or 3. Dates that fall outside all three intervals keep the default of 0.
This does have the desired effect as we can see from
R> print(dt, topn=5, nrows=10)
date YR
1: 2003-06-11 1
2: 2004-08-11 2
3: 2004-06-03 2
4: 2004-01-20 2
5: 2005-02-25 3
---
96: 2002-08-07 0
97: 2004-02-04 2
98: 2006-04-10 0
99: 2005-03-21 3
100: 2003-12-01 2
R> table(dt[, YR])
0 1 2 3
26 31 31 12
R>
One could have done this also simply by computing date differences and truncating down, but it is also nice to be a little explicit at times.
Edit: A more generic form just uses arithmetic on the dates:
R> dt[, YR2 := trunc(as.numeric(difftime(as.Date(date),
+ as.Date("2001-09-01"),
+ units="days"))/365.25)]
R> table(dt[, YR2])
0 1 2 3 4 5 6 7 9
7 31 31 12 9 5 1 2 1
R>
This does the job in one line.
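The one-liner above divides by 365.25, so a date sitting exactly on a boundary can land in the neighbouring year (2002-09-01 is 365 days after 2001-09-01, which truncates to 0 rather than 1). As a minimal sketch of an exact alternative, not part of the original answer, findInterval() can look up each date against explicit breakpoints. The sample dates below are hypothetical stand-ins for the question's pt vector, and YR3 is a made-up column name.

```r
library(data.table)

# Hypothetical sample dates standing in for the question's `pt` vector
dt <- data.table(date = as.IDate(c("2002-08-07", "2003-06-11", "2004-08-11",
                                   "2005-02-25", "2006-04-10")))

# Start of each interval, plus the day after the last interval ends
breaks <- as.IDate(c("2002-09-01", "2003-09-01", "2004-09-01", "2005-09-01"))

# findInterval() gives 0 before the first break, k from the k-th break onwards
dt[, YR3 := findInterval(as.numeric(date), as.numeric(breaks))]

# Dates on or after the final break are outside every interval: reset them to 0
dt[YR3 == length(breaks), YR3 := 0L]

print(dt)
```

With these sample dates, the first and last rows fall outside every interval and keep 0, matching the default used in the explicit version.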
How to Check if a Date is Within a List of Intervals in R?
First, note that you can test whether any element from a vector of "negative" dates falls within the "positive" interval like so:
any(dates.neg %within% interval(dates.pos[1], dates.pos[1] + days(2)))
# [1] FALSE
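To make the check above concrete, here is a small self-contained sketch. The contents of dates.pos and dates.neg are made up for illustration (the question's actual vectors are not shown here), and in_first and in_second are hypothetical names.

```r
library(lubridate)

# Hypothetical dates standing in for the question's vectors
dates.pos <- ymd(c("2018-02-07", "2018-02-20"))
dates.neg <- ymd(c("2018-02-21", "2018-03-01"))

# Two-day window starting at the first "positive" date
win_first <- interval(dates.pos[1], dates.pos[1] + days(2))
in_first <- any(dates.neg %within% win_first)

# Same window, starting at the second "positive" date
win_second <- interval(dates.pos[2], dates.pos[2] + days(2))
in_second <- any(dates.neg %within% win_second)

print(c(first = in_first, second = in_second))
```

Here no "negative" date falls in the first window, while 2018-02-21 falls in the second, so only in_second is TRUE.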
This suggests the following approach using map2 or, more usefully, map2_lgl:
df.TOTAL <- df.POS %>%
left_join(df.NEG, by = 'ID') %>%
mutate(TIME = interval(DATE, DATE + days(2)),
RESULT = map2_lgl(data, TIME, ~any(.x$DATE %within% .y)))
# # A tibble: 6 x 5
# ID DATE data TIME RESULT
# <chr> <date> <list> <S4: Interval> <lgl>
# 1 ID_1 2018-02-07 <tibble [5 x 1]> 2018-02-07 UTC--2018-02-09 UTC FALSE
# 2 ID_1 2018-02-12 <tibble [5 x 1]> 2018-02-12 UTC--2018-02-14 UTC FALSE
# 3 ID_1 2018-02-13 <tibble [5 x 1]> 2018-02-13 UTC--2018-02-15 UTC FALSE
# 4 ID_1 2018-02-20 <tibble [5 x 1]> 2018-02-20 UTC--2018-02-22 UTC TRUE
# 5 ID_1 2018-02-21 <tibble [5 x 1]> 2018-02-21 UTC--2018-02-23 UTC TRUE
# 6 ID_1 2018-03-18 <tibble [5 x 1]> 2018-03-18 UTC--2018-03-20 UTC FALSE
Thanks to @ubutun for improving the answer.
How to check if several dates lie within an interval using data.table and lubridate?
An alternative:
DT[, new := rowSums(sapply(.SD, between, initial, final)) > 0,
.SDcols = c("d1", "d2", "d3")]
DT
# d1 d2 d3 initial final new
# <Date> <Date> <Date> <Date> <Date> <lgcl>
# 1: 2019-01-01 2019-03-01 2020-01-01 2020-01-01 2020-03-01 TRUE
# 2: 2019-02-02 2022-02-02 2021-02-02 2022-05-05 2023-01-01 FALSE
Make sure you're using data.table::between. If it conflicts with dplyr::between, the latter may complain (since, at least in older dplyr versions, it requires that its lower/upper bounds are length 1).
This answer is both vectorized and as efficient as one can be with an arbitrary number of columns. That is, it will call between once per column (vectorized), and rowSums only once regardless of the number of columns. (Also, rowSums(.) is generally faster than apply(., 1, any) or similar canonical R methods.)
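For anyone who wants to run this, the sketch below rebuilds DT from the printed output above (the construction itself is an assumption, since the original table definition is not shown) and calls data.table::between explicitly to sidestep any masking by dplyr. date_cols and hits are hypothetical names.

```r
library(data.table)

# Table reconstructed from the printed output above
DT <- data.table(
  d1      = as.Date(c("2019-01-01", "2019-02-02")),
  d2      = as.Date(c("2019-03-01", "2022-02-02")),
  d3      = as.Date(c("2020-01-01", "2021-02-02")),
  initial = as.Date(c("2020-01-01", "2022-05-05")),
  final   = as.Date(c("2020-03-01", "2023-01-01"))
)

date_cols <- c("d1", "d2", "d3")

# One between() call per column: a logical matrix, rows = records, cols = d1..d3
hits <- DT[, sapply(.SD, data.table::between, initial, final), .SDcols = date_cols]
print(hits)

# A row is flagged when at least one of its date columns is inside the interval
DT[, new := rowSums(hits) > 0]
print(DT)
```

Only d3 in the first row lies inside its interval (on the lower bound, which between includes by default), so new is TRUE for row 1 and FALSE for row 2.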
Checking if a date falls within an interval in a dataset in long format
You can do the following:
library(dplyr)
tests %>%
left_join(meds) %>%
group_by(id) %>%
mutate(received_med_within = between(med_date, test_date[1], test_date[2])) %>%
tidyr::replace_na(list(received_med_within = FALSE)) %>%
dplyr::select(-4)
# A tibble: 6 x 4
# Groups: id [3]
# id test_date test_result received_med_within
# <dbl> <date> <dbl> <lgl>
# 1 1 2000-01-01 1 TRUE
# 2 1 2000-01-07 1 TRUE
# 3 2 2000-02-14 0 FALSE
# 4 2 2000-03-19 1 FALSE
# 5 3 2000-05-14 0 FALSE
# 6 3 2000-09-30 0 FALSE
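The pipeline above relies on each id having exactly two test dates (test_date[1] and test_date[2]). A minimal runnable sketch of the same idea follows, written with explicit comparisons against the earliest and latest test date per id. The tests table is rebuilt from the printed output, while the contents of meds are hypothetical, since the original meds data is not shown.

```r
library(dplyr)

# Rebuilt from the printed output above
tests <- tibble(
  id          = c(1, 1, 2, 2, 3, 3),
  test_date   = as.Date(c("2000-01-01", "2000-01-07", "2000-02-14",
                          "2000-03-19", "2000-05-14", "2000-09-30")),
  test_result = c(1, 1, 0, 1, 0, 0)
)

# Hypothetical medication dates: id 1 inside its window, id 2 outside, id 3 none
meds <- tibble(
  id       = c(1, 2),
  med_date = as.Date(c("2000-01-03", "2000-04-01"))
)

out <- tests %>%
  left_join(meds, by = "id") %>%
  group_by(id) %>%
  # Compare against the first and last test date of each id; NA when no med
  mutate(received_med_within =
           med_date >= min(test_date) & med_date <= max(test_date)) %>%
  ungroup() %>%
  # Treat ids without any medication record as FALSE
  mutate(received_med_within = coalesce(received_med_within, FALSE)) %>%
  select(-med_date)

print(out)
```

Using min() and max() instead of positional indexing is a design choice of this sketch: it keeps working if an id has more than two tests or if the rows are not sorted by date.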
How to find if a date is within a given time interval
Assuming your date columns are parsed with lubridate and the interval is stored in its own column, for example:
input <- df %>%
mutate(start_date=ymd(start_date)) %>%
mutate(end_date=ymd(end_date)) %>%
mutate(a_date=ymd(a_date)) %>%
mutate(b_date=ymd(b_date)) %>%
mutate(c_date=ymd(c_date)) %>%
mutate(Intrvl=interval(start_date, end_date))
you could use lubridate's %within% operator:
result <- input %>%
mutate(AinIntrvl=if_else(a_date %within% Intrvl,"a","")) %>%
mutate(BinIntrvl=if_else(b_date %within% Intrvl,"b","")) %>%
mutate(CinIntrvl=if_else(c_date %within% Intrvl,"c","")) %>%
mutate(Within_Intrvl=paste(AinIntrvl,BinIntrvl,CinIntrvl,sep="_")) %>%
select(-start_date,-end_date,-Intrvl,-a_date,-b_date,-c_date )
You can format the Within_Intrvl column as you like, as well as decide how you want to deal with NAs.
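As a self-contained sketch of the %within% step, the example below uses a made-up two-row df with only two date columns (the question's data is not shown, so all values and the a_in / b_in names are hypothetical) and keeps the results as plain logicals instead of pasted labels.

```r
library(dplyr)
library(lubridate)

# Hypothetical input: character dates, one interval per row
df <- tibble(
  start_date = c("2020-01-01", "2020-06-01"),
  end_date   = c("2020-01-31", "2020-06-30"),
  a_date     = c("2020-01-15", "2020-07-04"),
  b_date     = c("2020-02-01", "2020-06-10")
)

input <- df %>%
  mutate(across(ends_with("_date"), ymd)) %>%       # parse every *_date column
  mutate(Intrvl = interval(start_date, end_date))   # one interval per row

result <- input %>%
  mutate(a_in = a_date %within% Intrvl,             # row-wise membership test
         b_in = b_date %within% Intrvl) %>%
  select(a_in, b_in)

print(result)
```

In this sample, a_date is inside the interval only in the first row and b_date only in the second.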