Date Comparison in R

Compare two dates in R

The following solved my problem: instead of the Date data type, I used the POSIXct data type.
Here is example code for reading the tab-separated text file; after reading the data this way, the subsetting worked in every iteration of my for loop:

data = read.table("data.txt", header = TRUE, sep = "\t", dec = ".", 
colClasses =c("numeric","numeric","character","POSIXct","numeric","numeric"));
startDate = as.POSIXct("2012-07-01");
endDate = as.POSIXct("2012-07-20");
all_dates = seq(startDate, endDate, 86400); #86400 is num of seconds in a day

# the following code I'm trying to run inside a loop...
for (j in seq_along(all_dates)) {
  filterdate = all_dates[j];
  my_subset = data[data$DateTimeUTC == filterdate, ]
  # now I want to do some processing on my_subset...
}
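For anyone who wants to try this without the original data.txt, here is a minimal self-contained sketch of the same loop. The data frame and its value column are made up for illustration, and I fix the time zone to UTC (an assumption on my part) so the equality test does not depend on the local time zone.

```r
# Hypothetical stand-in for the contents of data.txt
data <- data.frame(
  DateTimeUTC = as.POSIXct(c("2012-07-01", "2012-07-01", "2012-07-02"), tz = "UTC"),
  value = c(1, 2, 3)
)

# One POSIXct value per day, all at midnight UTC
all_dates <- seq(as.POSIXct("2012-07-01", tz = "UTC"),
                 as.POSIXct("2012-07-03", tz = "UTC"),
                 by = "day")

# Count the rows that match each day exactly
counts <- vapply(seq_along(all_dates), function(j) {
  my_subset <- data[data$DateTimeUTC == all_dates[j], ]
  nrow(my_subset)
}, integer(1))

counts
# [1] 2 1 0
```

Note that == only matches rows whose timestamp is exactly midnight. If the column also carries times of day, comparing as.Date(data$DateTimeUTC) against a Date is probably closer to what you want.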

subset dataset based on date comparison R

You can just use regular comparison

dat[dat$Col3 <= dat$CutoffDate, ]
# Col1 Col2 Col3 CutoffDate
# 3 12001 Yes 2008-08-10 2008-08-10
# 4 12001 Yes 2008-08-04 2008-08-10

Assuming Col3 and CutoffDate are class "Date".

or maybe preferably,

with(dat, dat[Col3 <= CutoffDate, ])
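A small reproducible sketch of the above, with a made-up dat (the values are my own and only loosely modelled on the output shown), in case the columns start out as character and need converting first:

```r
# Hypothetical data; dates stored as character, as they often are after import
dat <- data.frame(
  Col1 = c(12001, 12001, 12001),
  Col2 = c("Yes", "Yes", "Yes"),
  Col3 = c("2008-08-15", "2008-08-10", "2008-08-04"),
  CutoffDate = rep("2008-08-10", 3),
  stringsAsFactors = FALSE
)

# Convert both columns to class "Date" before comparing
dat$Col3 <- as.Date(dat$Col3)
dat$CutoffDate <- as.Date(dat$CutoffDate)

# Keep rows on or before the cutoff
kept <- with(dat, dat[Col3 <= CutoffDate, ])
kept
```

Only the rows dated 2008-08-10 and 2008-08-04 should remain.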

Date comparison with System Date in R

To get a more specific answer, make a reproducible example

Convert the date column from character to Date objects, e.g., with

library(lubridate)
your_df$end_date <- mdy(your_df$end_date)

Then you don't even need a column for today's date; just use it as a filter condition:

library(dplyr)
filter(your_df, end_date < Sys.Date())
# will return a data frame with those rows that have a date before today.

Or if you prefer:

your_df[your_df$end_date < Sys.Date(),]
# produces the same rows
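If you would rather not load lubridate, the same idea can be sketched in base R. The data frame below is invented, and I am assuming the dates come in as month/day/year strings, which is what mdy() expects:

```r
# Hypothetical data: build the strings relative to today so the example
# gives the same result whenever it is run
today <- Sys.Date()
your_df <- data.frame(
  id = 1:3,
  end_date = format(today + c(-10, 0, 10), "%m/%d/%Y"),
  stringsAsFactors = FALSE
)

# Base-R counterpart of mdy(): parse with an explicit format
your_df$end_date <- as.Date(your_df$end_date, format = "%m/%d/%Y")

# Rows whose end_date is strictly before today
expired <- your_df[your_df$end_date < today, ]
expired
```

Only the first row (ten days ago) is returned; the row dated today is excluded because the comparison is strict.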

R compare date from one column with dates in many columns

An option with rowSums: select the 'Date' columns, compare them with the 'ReferenceDate' column, check whether the rowSums output is not equal to 0, convert the logical to a numeric index (add 1), and use that index to pick 'No' or 'Yes'.

nm1 <- grep('^DateCol', names(df1), value = TRUE)

Or, if the column names do not follow the 'DateCol' pattern, maybe

nm1 <- setdiff(names(df1), c("ID", "ReferenceDate"))
df1$flag <- c("No", "Yes")[(rowSums(df1[nm1] > df1$ReferenceDate) != 0) + 1]
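To see the steps on actual values, here is a sketch with an invented df1. I spell out the per-column comparison with sapply instead of comparing the data frame directly; that is my own variation, chosen to make the intermediate logical matrix visible.

```r
# Hypothetical data with one reference date and two date columns
df1 <- data.frame(
  ID = 1:3,
  ReferenceDate = as.Date(c("2020-01-10", "2020-01-10", "2020-01-10")),
  DateCol1 = as.Date(c("2020-01-01", "2020-01-15", "2020-01-05")),
  DateCol2 = as.Date(c("2020-01-09", "2020-01-02", "2020-01-10"))
)

nm1 <- grep("^DateCol", names(df1), value = TRUE)

# Logical matrix: TRUE where a date column is later than the reference date
later <- sapply(df1[nm1], function(col) col > df1$ReferenceDate)

# FALSE + 1 picks "No", TRUE + 1 picks "Yes"
df1$flag <- c("No", "Yes")[(rowSums(later) != 0) + 1]
df1$flag
# [1] "No"  "Yes" "No"
```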

R dates comparison using loop

You definitely want to use vectorised functions for this; check out the dplyr package:

library(dplyr)
df %>%
  mutate(death_check = case_when(Death.date < as.Date("2021-10-28") ~ "good"))

As you can see, I put quotes ("") around the date as well; this is necessary. If your df$Death.date is not actually in Date format, you can convert it here as well.
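To illustrate why the quotes matter, and what the result looks like, here is a base-R sketch of the same check. The data frame is made up, and ifelse stands in for case_when so that the example runs without dplyr.

```r
# Without quotes, R treats the date as arithmetic: 2021 minus 10 minus 28
2021 - 10 - 28
# [1] 1983

# Hypothetical data; Death.date arrives as character
df <- data.frame(
  Death.date = c("2021-01-15", "2021-12-01", "2020-06-30"),
  stringsAsFactors = FALSE
)

# Convert to Date first, then compare against a quoted, converted date
df$Death.date <- as.Date(df$Death.date)
df$death_check <- ifelse(df$Death.date < as.Date("2021-10-28"),
                         "good", NA_character_)
df$death_check
# [1] "good" NA     "good"
```

Rows that do not meet the condition get NA here, which matches what case_when does when no condition applies.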

R: Extract data based on date, if date lesser than

It looks like you are not casting the comparison values as dates. Also the dates you used for comparison don't exclude any of the dates in the dataframe you provided so I'd expect the mean to be selected every time.

date <- as.Date(c('2013-05-01', '2013-05-02', '2013-05-03'))
x <- c(1, 2, 3)
y <- c(2, 2, 2)
mean <- (x + y)/2
df <- data.frame(date = date, x = x, y = y)
newdata <- ifelse((df$date < as.Date('2013-05-02') | df$date > as.Date('2014-04-09')), mean, x)

newdata

I changed the dates in the condition to be more selective and I got 1.5 2.0 3.0. It selects the first value from mean and the others from x which agrees with the condition I used in the ifelse().

Date comparison R

Instead of ifelse use fifelse

library(data.table)
library(lubridate)  # for ymd()
dt[, date_aux := fifelse(date_hist > max(date_hist),
                         ymd(as.Date(max(date_hist), format = "%Y-%m-%d")),
                         ymd(as.Date(date_hist, format = "%Y-%m-%d")))]

str(dt)
#Classes ‘data.table’ and 'data.frame': 24 obs. of 2 variables:
# $ date_hist: Date, format: "2018-01-01" "2018-02-01" "2018-03-01" "2018-04-01" ...
# $ date_aux : Date, format: "2018-01-01" "2018-02-01" "2018-03-01" "2018-04-01" ...

With ifelse, the dates are converted to their storage mode, i.e. numeric.

If we check the source code, it is the assignment in the last few lines that creates the issue:

...
ans <- test
len <- length(ans)
ypos <- which(test)
npos <- which(!test)
if (length(ypos) > 0L)
    ans[ypos] <- rep(yes, length.out = len)[ypos]
if (length(npos) > 0L)
    ans[npos] <- rep(no, length.out = len)[npos]
ans
...

With a simple example

v1 <- Sys.Date() -  1:5
v2 <- Sys.Date() + 1:5
ans <- v1 > Sys.Date() - 2 # logical vector
ypos <- which(ans)
npos <- which(!ans)
ans[ypos] <- rep(v2, length.out = length(ans))[ypos] # assign Dates into the logical vector
ans
#[1] 18335 0 0 0 0

Assigning Date values into the logical vector drops the Date class, so the dates end up as their underlying numeric values.
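If you want to stay in base R, one way around this (a sketch of my own, not part of the fifelse answer) is to start from a Date vector and overwrite the selected positions, so the class is never lost. Fixed dates are used here instead of Sys.Date() to keep the result reproducible.

```r
v1 <- as.Date("2020-01-10") - 1:5
v2 <- as.Date("2020-01-10") + 1:5
test <- v1 > as.Date("2020-01-08")   # TRUE only for the first element

# ifelse starts from the logical vector, so the Date class is dropped
stripped <- ifelse(test, v2, v1)
class(stripped)
# [1] "numeric"

# Starting from a Date vector and replacing by index keeps the class
kept <- v1
kept[test] <- v2[test]
class(kept)
# [1] "Date"
```

The numeric result of ifelse could also be converted back with as.Date and an origin of "1970-01-01", but the indexed assignment avoids the round trip.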

Compare dates in a dataframe column with a single date

There are vectorised functions available in R to do this instead of using sapply. In this case, you can use pmax -

df$date <- as.Date(df$date)
compare_date <- as.Date("2018-08-02")
df$date <- pmax(df$date, compare_date)
df

# date
#1 2018-08-02
#2 2018-08-02
#3 2018-08-02
#4 2018-08-03

data

df <- structure(list(date = c("2018-07-31", "2018-08-01", "2018-08-02", 
"2018-08-03")), class = "data.frame", row.names = c(NA, -4L))
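For completeness, a short sketch showing that the plain comparison operators are vectorised over a Date column in the same way as pmax. It rebuilds the same four dates under a different name (df2, my own choice) so that it runs on its own.

```r
# Same four dates as in the data above, under a hypothetical name
df2 <- data.frame(
  date = as.Date(c("2018-07-31", "2018-08-01", "2018-08-02", "2018-08-03"))
)
compare_date <- as.Date("2018-08-02")

# Element-wise comparison against a single date: one logical per row
is_before <- df2$date < compare_date
is_before
# [1]  TRUE  TRUE FALSE FALSE

# pmax raises every earlier date to compare_date and keeps the Date class
floored <- pmax(df2$date, compare_date)
floored
```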

