Compare two dates in R
The following solution solved my problem: instead of using the Date data type, I used the POSIXct data type.
Here is the example code for reading the tab-separated text file, after which the subsetting worked in every iteration of my for loop:
data = read.table("data.txt", header = TRUE, sep = "\t", dec = ".",
colClasses =c("numeric","numeric","character","POSIXct","numeric","numeric"));
startDate = as.POSIXct("2012-07-01");
endDate = as.POSIXct("2012-07-20");
all_dates = seq(startDate, endDate, 86400); #86400 is num of seconds in a day
#the following code I'm trying to run inside a loop...
for (j in 1:length(all_dates)) {
filterdate = all_dates[j];
my_subset = data[data$DateTimeUTC == filterdate,]
#now I want to do some processing on my_subset...
}
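A minimal sketch of the same per-day matching, run on a small made-up data frame rather than the original data.txt. The column values, the explicit tz = "UTC", and the use of by = "day" in place of 86400 are assumptions made for illustration; only rows whose timestamp falls exactly on midnight will match a given day.

```r
# Hypothetical data: three timestamps at midnight UTC
data <- data.frame(
  DateTimeUTC = as.POSIXct(c("2012-07-01", "2012-07-02", "2012-07-02"), tz = "UTC"),
  value = c(1, 2, 3)
)

# One POSIXct value per day; by = "day" plays the role of 86400 seconds
all_dates <- seq(as.POSIXct("2012-07-01", tz = "UTC"),
                 as.POSIXct("2012-07-03", tz = "UTC"),
                 by = "day")

# Count the rows that match each day exactly (POSIXct compared with POSIXct)
counts <- vapply(seq_along(all_dates), function(j) {
  sum(data$DateTimeUTC == all_dates[j])
}, integer(1))
counts
```

If the timestamps carry a time of day, an exact == comparison will miss them, so comparing as.Date(data$DateTimeUTC) against a Date sequence may be the safer choice.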
subset dataset based on date comparison R
You can just use regular comparison
dat[dat$Col3 <= dat$CutoffDate, ]
# Col1 Col2 Col3 CutoffDate
# 3 12001 Yes 2008-08-10 2008-08-10
# 4 12001 Yes 2008-08-04 2008-08-10
This assumes Col3 and CutoffDate are both of class "Date".
Or, perhaps preferably:
with(dat, dat[Col3 <= CutoffDate, ])
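As a rough sketch of the assumption above, the snippet below builds a small made-up dat whose date columns start out as character strings, converts them with as.Date(), and then applies the same row-wise comparison. The values are hypothetical and only loosely modelled on the output shown.

```r
# Hypothetical data with dates stored as character strings
dat <- data.frame(
  Col3 = c("2008-08-15", "2008-08-10", "2008-08-04"),
  CutoffDate = c("2008-08-10", "2008-08-10", "2008-08-10"),
  stringsAsFactors = FALSE
)

# Convert both columns so that <= compares dates, not strings
dat$Col3 <- as.Date(dat$Col3)
dat$CutoffDate <- as.Date(dat$CutoffDate)

# Keep the rows where Col3 is on or before the cutoff in the same row
res <- dat[dat$Col3 <= dat$CutoffDate, ]
res
```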
Date comparison with System Date in R
To get a more specific answer, make a reproducible example.
Convert the date column from character to Date objects, e.g., with
library(lubridate)
your_df$end_date <- mdy(your_df$end_date)
Then you don't even need a column for today's date; just use Sys.Date() in the filter condition:
library(dplyr)
filter(your_df, end_date < Sys.Date())
# will return a data frame with those rows that have a date before today.
Or if you prefer:
your_df[your_df$end_date < Sys.Date(),]
# produces the same rows
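For anyone without lubridate or dplyr installed, here is a base-R sketch of the same idea. It assumes the dates are month/day/year strings (which is what mdy() expects); the data frame and its id column are made up, and the dates are generated relative to today so the example stays valid.

```r
# Hypothetical data: end dates as "mm/dd/YYYY" strings, one past and one future
your_df <- data.frame(
  id = 1:2,
  end_date = format(Sys.Date() + c(-10, 10), "%m/%d/%Y"),
  stringsAsFactors = FALSE
)

# Base-R counterpart of mdy(): parse the strings into Date objects
your_df$end_date <- as.Date(your_df$end_date, format = "%m/%d/%Y")

# Keep only the rows whose end date is before today
past <- your_df[your_df$end_date < Sys.Date(), ]
past
```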
R compare date from one column with dates in many columns
One option with rowSums is to select the 'Date' columns, compare them with the 'ReferenceDate' column, check whether the rowSums output is not equal to 0, convert the resulting logical to a numeric index (by adding 1), and use that index to pick 'No' or 'Yes':
nm1 <- grep('^DateCol', names(df1), value = TRUE)
Or, if the column names do not follow the 'DateCol' pattern, maybe:
nm1 <- setdiff(names(df1), c("ID", "ReferenceDate"))
df1$flag <- c("No", "Yes")[(rowSums(df1[nm1] > df1$ReferenceDate) != 0) + 1]
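The steps above can be tried on a small made-up data frame. Note that this sketch is a variant, not the exact code from the answer: it compares each date column with ReferenceDate one column at a time via vapply() and then applies the same rowSums indexing trick. The column values are hypothetical.

```r
# Hypothetical data: one reference date and two date columns per row
df1 <- data.frame(
  ID = 1:3,
  ReferenceDate = as.Date(c("2020-01-10", "2020-01-10", "2020-01-10")),
  DateCol1 = as.Date(c("2020-01-01", "2020-01-15", "2020-01-05")),
  DateCol2 = as.Date(c("2020-01-09", "2020-01-02", "2020-01-10"))
)

nm1 <- grep("^DateCol", names(df1), value = TRUE)

# Logical matrix: TRUE where a date column is later than the reference date
later <- vapply(df1[nm1], function(col) col > df1$ReferenceDate,
                logical(nrow(df1)))

# FALSE + 1 = 1 picks "No", TRUE + 1 = 2 picks "Yes"
df1$flag <- c("No", "Yes")[(rowSums(later) != 0) + 1]
df1
```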
R dates comparison using loop
You definitely want to use vectorised functions for this. Check out the dplyr package:
library(dplyr)
df %>%
  mutate(death_check = case_when(Death.date < as.Date("2021-10-28") ~ "good"))
As you can see, I also added quotes ("") around the date; this is necessary. If your df$Death.date is not actually in Date format, you can convert it here as well.
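To illustrate why the quotes matter, here is a small base-R sketch. It uses ifelse() as a stand-in for case_when() so that it runs without dplyr; the data frame is made up, and NA is used for non-matching rows to mirror what case_when() returns when no condition matches.

```r
# Without quotes, R reads 2021-10-28 as a subtraction, not a date
unquoted <- 2021 - 10 - 28
unquoted

# Hypothetical data: one date before the cutoff, one after
df <- data.frame(Death.date = as.Date(c("2021-10-01", "2021-11-15")))

# Vectorised comparison against a properly quoted, converted date
df$death_check <- ifelse(df$Death.date < as.Date("2021-10-28"),
                         "good", NA_character_)
df
```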
R: Extract data based on date, if date lesser than
It looks like you are not casting the comparison values as dates. Also the dates you used for comparison don't exclude any of the dates in the dataframe you provided so I'd expect the mean to be selected every time.
date <- as.Date(c('2013-05-01', '2013-05-02', '2013-05-03'))
x <- c(1, 2, 3)
y <- c(2, 2, 2)
mean <- (x + y)/2
df <- data.frame(date = date, x = x, y = y)
newdata <- ifelse((df$date < as.Date('2013-05-02') | df$date > as.Date('2014-04-09')), mean, x)
newdata
I changed the dates in the condition to be more selective and I got 1.5 2.0 3.0. It selects the first value from mean and the others from x, which agrees with the condition I used in the ifelse().
Date comparison R
Instead of ifelse, use fifelse:
library(data.table)
library(lubridate)  # provides ymd(), used below
dt[, date_aux := fifelse(date_hist>max(date_hist),ymd(as.Date(max(date_hist),
format = "%Y-%m-%d")),ymd(as.Date(date_hist,format = "%Y-%m-%d")))]
str(dt)
#Classes ‘data.table’ and 'data.frame': 24 obs. of 2 variables:
# $ date_hist: Date, format: "2018-01-01" "2018-02-01" "2018-03-01" "2018-04-01" ...
# $ date_aux : Date, format: "2018-01-01" "2018-02-01" "2018-03-01" "2018-04-01" ...
With ifelse, the dates are converted to their storage mode, i.e. numeric.
If we check the source code, it is the assignment in the last few lines that creates the issue:
...
ans <- test
len <- length(ans)
ypos <- which(test)
npos <- which(!test)
if (length(ypos) > 0L)
ans[ypos] <- rep(yes, length.out = len)[ypos]
if (length(npos) > 0L)
ans[npos] <- rep(no, length.out = len)[npos]
ans
...
With a simple example
v1 <- Sys.Date() - 1:5
v2 <- Sys.Date() + 1:5
ans <- v1 > Sys.Date() - 2 # logical vector
ypos <- which(ans)
npos <- which(!ans)
ans[ypos] <- rep(v2, length.out = length(ans))[ypos]  # assigning Dates
ans
#[1] 18335 0 0 0 0
Assigning Date values into the logical vector does not carry the Date class over, so the result is coerced to plain numeric.
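For readers who want to stay in base R, the following sketch shows the class being dropped by ifelse() and one way to restore it afterwards with as.Date() and an explicit origin. The dates and the cap value are made up for illustration.

```r
# Hypothetical dates and a cap date
d <- as.Date(c("2018-01-01", "2018-02-01", "2018-03-01"))
cap <- as.Date("2018-02-01")

# ifelse() returns the underlying day counts, not Dates
raw <- ifelse(d > cap, cap, d)
class(raw)

# Restore the Date class; day counts are relative to 1970-01-01
fixed <- as.Date(raw, origin = "1970-01-01")
fixed
```

This is only a workaround; fifelse() keeps the class directly and avoids the extra conversion step.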
Compare dates in a dataframe column with a single date
There are vectorised functions available in R to do this instead of using sapply. In this case, you can use pmax:
df$date <- as.Date(df$date)
compare_date=as.Date("2018-08-02")
df$date <- pmax(df$date, compare_date)
df
# date
#1 2018-08-02
#2 2018-08-02
#3 2018-08-02
#4 2018-08-03
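As a small companion sketch, pmin() is the mirror image of pmax(): where pmax() raises every date to at least compare_date, pmin() caps every date at compare_date. The snippet rebuilds the same four dates inline so it runs on its own; the variable names floor_dates and cap_dates are made up.

```r
# Same four dates as in the example, created inline
df <- data.frame(date = c("2018-07-31", "2018-08-01", "2018-08-02", "2018-08-03"))
df$date <- as.Date(df$date)
compare_date <- as.Date("2018-08-02")

# Element-wise maximum: no date earlier than compare_date
floor_dates <- pmax(df$date, compare_date)

# Element-wise minimum: no date later than compare_date
cap_dates <- pmin(df$date, compare_date)

floor_dates
cap_dates
```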
data
df <- structure(list(date = c("2018-07-31", "2018-08-01", "2018-08-02",
"2018-08-03")), class = "data.frame", row.names = c(NA, -4L))