How to Iterate Over List of Dates Without Coercion to Numeric

How to iterate over list of Dates without coercion to numeric?

There are two issues here. One is whether the input gets coerced from Date to numeric. The other is whether the output gets coerced to numeric.

Input

For loops coerce Date inputs to numeric because, as @DWin and @JoshuaUlrich point out, for loops take plain vectors, and Dates are technically not plain vectors: they are numeric values carrying a class attribute, which the loop drops.

> for(d in dates) print(class(d))
[1] "numeric"
[1] "numeric"

On the other hand, lapply and its simplifier offspring sapply have no such restrictions.

> sapply( dates, function(day) class(day) )
[1] "Date" "Date"

Output

However! The output of class() above is a character vector. If you actually try to return a Date object, sapply is not what you want.

lapply does not coerce to a vector, but sapply does:

> lapply( dates, identity )
[[1]]
[1] "2013-01-01"

[[2]]
[1] "2013-01-02"

> sapply( dates, identity )
[1] 15706 15707

That's because sapply's simplification step flattens the output into a plain vector, which drops the Date class.
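If a Date vector is needed back, one option (a minimal sketch, not the only way) is to keep lapply and then combine the resulting list with do.call(c, ...). The dates values and the seven-day shift here are made up for illustration.

```r
# Illustrative input; any Date vector would do.
dates <- as.Date(c("2013-01-01", "2013-01-02"))

# lapply keeps each element as a Date.
shifted_list <- lapply(dates, function(day) day + 7)

# c() has a Date method, so combining the list elements with
# do.call(c, ...) keeps the class instead of flattening to numbers.
shifted <- do.call(c, shifted_list)

print(class(shifted))
print(shifted)
```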

Summary

So: If you have a Date object and want to return a non-Date object, you can use lapply or sapply. If you have a non-Date object, and want to return a Date object, you can use a for loop or lapply. If you have a Date object and want to return a Date object, use lapply.

Resources for learning more

If you want to dig deeper into vectors, you can start with John Cook's notes, move on to the R Inferno, and then continue with SDA.

How to avoid R converting dates to numeric automatically?

The for loop coerces its sequence to a plain vector unless it is already a vector, a list, or one of a few other types. A Date object is not a plain vector (it carries a class attribute), so the loop strips it down to the underlying numbers. You need as.list to protect it from that coercion:

for (d in as.list(date)) print(d)
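To see the difference side by side, here is a small sketch that records the class of the loop variable in both cases. The dates vector is an illustrative stand-in.

```r
# Illustrative input.
dates <- as.Date(c("2013-01-01", "2013-01-02"))

# Looping over the Date vector directly: the class is dropped.
bare_classes <- character(0)
for (d in dates) bare_classes <- c(bare_classes, class(d))

# Looping over as.list(dates): each element is still a Date.
list_classes <- character(0)
for (d in as.list(dates)) list_classes <- c(list_classes, class(d))

print(bare_classes)
print(list_classes)
```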

Print a date range in a loop with correctly formatted dates

Try:

for (i in as.list(seq(as.Date('2020-04-02'), as.Date('2020-04-30'), by = 'day'))) {
  print(i)
}

I don't know why this is necessary, but if you run

for (i in Sys.Date()) {browser();print(i);}
# Called from: top level
# Browse[1]>
# debug at #1: print(i)
# Browse[1]> i
# [1] 18709

you'll see that i is being converted to numeric in the for (.) portion. The as.list helps preserve that class.

Looping over a Date or POSIXct object results in a numeric iterator

?"for" says that seq (the part after in) is "[A]n expression evaluating to a vector (including a list and an expression) or to a pairlist or 'NULL'".

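So your Date vector is being coerced to numeric because Date objects aren't strictly vectors (is.vector() returns FALSE for anything that has attributes other than names, and a Date has a class attribute):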

is.vector(Sys.Date())
# [1] FALSE
is.vector(as.numeric(Sys.Date()))
# [1] TRUE

The same is true for POSIXct vectors:

is.vector(Sys.time())
# [1] FALSE
is.vector(as.numeric(Sys.time()))
# [1] TRUE
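A short sketch of what sits underneath a Date may make this more concrete. The specific date is arbitrary; unclass() is used here only to expose the stored number.

```r
d <- as.Date("2020-01-01")

# The only thing that makes this a Date is its class attribute.
print(attributes(d))
print(typeof(d))

# Stripping the class leaves a plain numeric vector:
# the number of days since 1970-01-01.
plain <- unclass(d)
print(is.vector(d))
print(is.vector(plain))
print(plain)
```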

Why am I getting an error while looping over dates?

Here are two ways:

  1. Convert the number back to a date inside the loop.
for (i in Weekly_Close_Price$PRICE_DATE) {
  print(as.Date(i, origin = '1970-01-01'))
}

  2. Loop over the index.
for (i in seq_along(Weekly_Close_Price$PRICE_DATE)) {
  print(Weekly_Close_Price$PRICE_DATE[i])
}
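Both approaches should give the same dates. The sketch below checks that on a small, made-up stand-in for Weekly_Close_Price (the real data frame is not shown in the question, so its contents here are an assumption).

```r
# Hypothetical stand-in for the data frame in the question.
Weekly_Close_Price <- data.frame(
  PRICE_DATE = as.Date("2021-03-05") + 7 * 0:2
)

# Way 1: the loop variable is a number, so convert it back.
by_value <- character(0)
for (i in Weekly_Close_Price$PRICE_DATE) {
  by_value <- c(by_value, format(as.Date(i, origin = "1970-01-01")))
}

# Way 2: loop over positions and subset, which keeps the Date class.
by_index <- character(0)
for (i in seq_along(Weekly_Close_Price$PRICE_DATE)) {
  by_index <- c(by_index, format(Weekly_Close_Price$PRICE_DATE[i]))
}

print(by_value)
print(identical(by_value, by_index))
```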

Munging dates in R

The "does not preserve Date class" misfeature is an artefact of R itself, and of how some base R functions are implemented. See e.g.

R> dates <- Sys.Date() + 0:2
R> for (d in dates) cat(d, "\n")
17532
17533
17534
R>

Essentially, the S3 class attribute gets dropped when you do certain vector operations:

R> as.vector(dates)
[1] 17532 17533 17534
R>
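When the class has already been dropped this way, the numbers can be turned back into dates by supplying the origin. A minimal sketch, using a fixed start date instead of Sys.Date() so the result is reproducible:

```r
# Fixed dates for reproducibility (the example above uses Sys.Date()).
dates <- as.Date("2018-01-01") + 0:2

# as.vector() strips the class attribute, leaving day counts.
raw <- as.vector(dates)
print(raw)

# Day counts are relative to 1970-01-01, so that is the origin to use.
restored <- as.Date(raw, origin = "1970-01-01")
print(restored)
```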

So my recommendation is to pick a good container type you like and stick with it to do the operations there. I like data.table a lot for this. A quick example:

R> suppressMessages(library(data.table))
R> dt <- data.table(date=Sys.Date()+0:2, other=Sys.Date() + cumsum(runif(3)*100))
R> dt[, diff:=other-date][]
         date      other           diff
1: 2018-01-01 2018-03-30  88.88445 days
2: 2018-01-02 2018-06-09 158.23913 days
3: 2018-01-03 2018-07-30 208.62187 days
R> dt[, month:=month(other)][]
         date      other           diff month
1: 2018-01-01 2018-03-30  88.88445 days     3
2: 2018-01-02 2018-06-09 158.23913 days     6
3: 2018-01-03 2018-07-30 208.62187 days     7
R>

Not only does the Date type persist (as evidenced by the difference operation returning a difftime object), but you also get lots of helper functions (like month()) here. Grouping by date is also natural.

Transforming a list of dates into a dataframe in R

The output from the function is a vector of Dates, so we can just wrap it in data.frame:

df1 <-  data.frame(date = TradingDates(2010:2011))
str(df1)
#'data.frame': 504 obs. of 1 variable:
#$ date: Date, format: "2010-01-04" "2010-01-05" "2010-01-06" "2010-01-07" ...

as.Date in for loop performing unexpectedly

A single atomic vector can only be of a single class

When you use [<- to replace a single value of df.for, R can't hold both the values you have not changed (character strings that look like dates) and a Date class value (a number that is formatted and displayed like a character string) in the same vector. Therefore it coerces everything to character.

You could get around this by making df.for a list, e.g.:

df.for <- as.list(df.1)
for (i in seq_along(df.1)){
df.for[[i]] <- as.Date(df.1[i], format="%d-%b-%y")
}

Or by coercing the results back to a Date at the end of the loop (via numeric), e.g.:

df.for <- df.1
for (i in seq_along(df.1)){
df.for[i] <- as.Date(df.1[i], format="%d-%b-%y")
}

as.Date(as.numeric(df.for),origin = '1970-01-01')
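If the loop is not needed for other reasons, it may be simpler to skip it: as.Date is vectorized over character input. The sketch below assumes df.1 is a character vector in the "%d-%b-%y" format (the values are invented) and that the session locale uses English month abbreviations.

```r
# Hypothetical input in the same format as the question's df.1.
df.1 <- c("01-Jan-13", "15-Feb-13")

# as.Date handles the whole vector at once, so nothing is ever
# assigned element-by-element into a character vector.
parsed <- as.Date(df.1, format = "%d-%b-%y")

print(class(parsed))
print(parsed)
```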

Joining multiple time sequences together

Try this:

do.call("c", mapply(seq, from, to, by = 60, SIMPLIFY = FALSE))
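As a self-contained illustration, here is the same call with made-up from and to vectors. These are assumed to be POSIXct start and end times of equal length, with by = 60 meaning 60 seconds.

```r
# Hypothetical start and end times; each range spans two minutes.
from <- as.POSIXct(c("2020-01-01 00:00:00", "2020-01-01 01:00:00"), tz = "UTC")
to <- from + 120

# SIMPLIFY = FALSE keeps one POSIXct sequence per (from, to) pair;
# do.call("c", ...) then joins them without losing the class.
pieces <- mapply(seq, from, to, by = 60, SIMPLIFY = FALSE)
joined <- do.call("c", pieces)

print(length(joined))
print(class(joined))
```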

Passing Sequence of Dates to dataframe column

It is better to initialize the 'DF' as

DF_1 <- data.frame(days)
str(DF_1)
'data.frame': 2006 obs. of 1 variable:
$ days: Date, format: "2016-01-01" "2016-01-02" "2016-01-03" "2016-01-04" ...

Or, if we still want to use a for loop, initialize the column with Date class instead of logical (matrix creates NA rows, which are logical):

DF_1 <- data.frame(col1 = as.Date(rep(NA, length(days))))

Now, if we do the loop

for (i in 1:length(days)) {
print(days[i])
DF_1[i,1] <- days[i]
}

checking the class

str(DF_1)
'data.frame': 2006 obs. of 1 variable:
$ col1: Date, format: "2016-01-01" "2016-01-02" "2016-01-03" "2016-01-04" ...
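The same idea works for a bare vector as well as a data frame column. A minimal sketch, with a short made-up days sequence and an arbitrary one-day shift in the loop body:

```r
# Short illustrative sequence (the original days has 2006 entries).
days <- seq(as.Date("2016-01-01"), by = "day", length.out = 5)

# Preallocate with Date class so assignments inside the loop
# do not fall back to plain numbers.
out <- as.Date(rep(NA, length(days)))
for (i in seq_along(days)) {
  out[i] <- days[i] + 1
}

print(class(out))
print(out)
```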

The issue is the coercion of Date to its integer storage values. We can see the same behavior with unlist

unlist(as.list(head(days)))
[1] 16801 16802 16803 16804 16805 16806

or with unclass

unclass(head(days))
[1] 16801 16802 16803 16804 16805 16806

which can be corrected with c in do.call if the input is a list

do.call(c, as.list(head(days)))
[1] "2016-01-01" "2016-01-02" "2016-01-03" "2016-01-04" "2016-01-05" "2016-01-06"

or convert the integer back to Date class afterwards by specifying the origin in as.Date

as.Date(unlist(as.list(head(days))), origin = '1970-01-01')
[1] "2016-01-01" "2016-01-02" "2016-01-03" "2016-01-04" "2016-01-05" "2016-01-06"

