How to iterate over list of Dates without coercion to numeric?
There are two issues here. One is whether the input gets coerced from Date to numeric. The other is whether the output gets coerced to numeric.
Input
For loops coerce Date inputs to numeric because, as @DWin and @JoshuaUlrich point out, for loops take vectors, and Dates are technically not vectors.
> for(d in dates) print(class(d))
[1] "numeric"
[1] "numeric"
On the other hand, lapply and its simplifier offspring sapply have no such restrictions.
> sapply( dates, function(day) class(day) )
[1] "Date" "Date"
Output
However! The output of class() above is a character. If you try actually returning a Date object, sapply is not what you want. lapply does not coerce to a vector, but sapply does:
> lapply( dates, identity )
[[1]]
[1] "2013-01-01"
[[2]]
[1] "2013-01-02"
> sapply( dates, identity )
[1] 15706 15707
That's because sapply's simplification function coerces output to a vector.
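If you need a Date vector back rather than a list, one option is to combine the lapply result with c(). This is a minimal sketch; the dates vector is assumed to hold the two dates shown in the output above, and next_days is a name made up for illustration.

```r
# Assumed input: the two dates from the examples above
dates <- as.Date(c("2013-01-01", "2013-01-02"))

# lapply keeps every element a Date; do.call(c, ...) dispatches to the
# Date method of c(), so the result is a Date vector, not a numeric one
next_days <- do.call(c, lapply(dates, function(day) day + 1))

class(next_days)
print(next_days)
```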
Summary
So: If you have a Date object and want to return a non-Date object, you can use lapply or sapply. If you have a non-Date object and want to return a Date object, you can use a for loop or lapply. If you have a Date object and want to return a Date object, use lapply.
Resources for learning more
If you want to dig deeper into vectors, you can start with John Cook's notes, continue with the R Inferno, and finish with SDA.
How to avoid R converting dates to numeric automatically?
The for loop coerces its sequence to a plain vector unless it is already a vector, a list, or one of a few other types. date is not a vector in this sense: it is a numeric vector carrying a class attribute, and that attribute is lost in the loop. So you need as.list to protect it from coercion:
for (d in as.list(date)) print(d)
Print a date range in a loop with correctly formatted dates
Try:
for (i in as.list(seq(as.Date('2020-04-02'), as.Date('2020-04-30'), by = 'day'))) {
print(i)
}
I don't know why this is necessary, but if you run
for (i in Sys.Date()) { browser(); print(i) }
# Called from: top level
# Browse[1]>
# debug at #1: print(i)
# Browse[1]> i
# [1] 18709
you'll see that i is being converted to numeric in the for (.) portion. The as.list helps preserve that class.
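The difference can be checked without the debugger. This is a small sketch with a made-up three-day vector d; seen_plain and seen_list are illustrative names that collect the class of the loop variable in each variant.

```r
# Hypothetical three-day Date vector
d <- as.Date("2020-04-02") + 0:2

# Looping over the Date vector directly: the loop variable is a bare number
seen_plain <- character(0)
for (i in d) seen_plain <- c(seen_plain, class(i))

# Looping over as.list(d): each list element is still a Date
seen_list <- character(0)
for (i in as.list(d)) seen_list <- c(seen_list, class(i))

print(seen_plain)  # "numeric" "numeric" "numeric"
print(seen_list)   # "Date" "Date" "Date"
```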
Looping over a Date or POSIXct object results in a numeric iterator
?"for" says that seq (the part after in) is "[A]n expression evaluating to a vector (including a list and an expression) or to a pairlist or 'NULL'".
So your Date vector is being coerced to numeric because Date objects aren't strictly vectors:
is.vector(Sys.Date())
# [1] FALSE
is.vector(as.numeric(Sys.Date()))
# [1] TRUE
The same is true for POSIXct vectors:
is.vector(Sys.time())
# [1] FALSE
is.vector(as.numeric(Sys.time()))
# [1] TRUE
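For POSIXct, one way to keep the class is to loop over positions and index into the vector. A minimal sketch, assuming a made-up two-element times vector in UTC; direct_class and indexed_class are illustrative names.

```r
# Hypothetical POSIXct vector: midnight UTC and one hour later
times <- as.POSIXct("2020-01-01 00:00:00", tz = "UTC") + c(0, 3600)

# Iterating directly yields bare numbers (seconds since the epoch)
direct_class <- character(0)
for (t in times) direct_class <- c(direct_class, class(t))

# Iterating over positions and subsetting keeps the POSIXct class
indexed_class <- list()
for (k in seq_along(times)) indexed_class[[k]] <- class(times[k])

print(direct_class)   # "numeric" "numeric"
print(indexed_class)  # each element is c("POSIXct", "POSIXt")
```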
Why i am getting error while I am looping dates?
Here are two ways:
- Convert the number back to a date inside the loop.
for (i in Weekly_Close_Price$PRICE_DATE){
print(as.Date(i, origin = '1970-01-01'))
}
- Loop over the index.
for (i in seq_along(Weekly_Close_Price$PRICE_DATE)) {
print(Weekly_Close_Price$PRICE_DATE[i])
}
Munging dates in R
The "does not preserve Date class" misfeature is an artefact of R itself and of how some base R functions are implemented. See e.g.
R> dates <- Sys.Date() + 0:2
R> for (d in dates) cat(d, "\n")
17532
17533
17534
R>
Essentially, the S3 class attribute gets dropped when you do certain vector operations:
R> as.vector(dates)
[1] 17532 17533 17534
R>
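To see that only the class attribute is at stake, here is a small sketch. The fixed start date is an assumption chosen to match the numbers printed above, and raw and restored are illustrative names.

```r
# Assumed start date, matching the 17532 shown above
dates <- as.Date("2018-01-01") + 0:2

# A Date is a double with a class attribute on top
typeof(dates)          # "double"
raw <- unclass(dates)  # the bare day counts since 1970-01-01

# Putting the class back recovers the same dates
restored <- as.Date(raw, origin = "1970-01-01")
all(restored == dates)
```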
So my recommendation is to pick a good container type you like and stick with it to do the operations there. I like data.table a lot for this. A quick example:
R> suppressMessages(library(data.table))
R> dt <- data.table(date=Sys.Date()+0:2, other=Sys.Date() + cumsum(runif(3)*100))
R> dt[, diff:=other-date][]
         date      other           diff
1: 2018-01-01 2018-03-30  88.88445 days
2: 2018-01-02 2018-06-09 158.23913 days
3: 2018-01-03 2018-07-30 208.62187 days
R> dt[, month:=month(other)][]
         date      other           diff month
1: 2018-01-01 2018-03-30  88.88445 days     3
2: 2018-01-02 2018-06-09 158.23913 days     6
3: 2018-01-03 2018-07-30 208.62187 days     7
R>
Not only does the Date type persist (as evidenced by the difference operation returning a difftime object), but you also get lots of helper functions (like month()) here. Grouping by date is also natural.
Transforming a list of dates into a dataframe in R
The output from the function is a vector of Dates, so we can just wrap it with data.frame:
df1 <- data.frame(date = TradingDates(2010:2011))
str(df1)
#'data.frame': 504 obs. of 1 variable:
#$ date: Date, format: "2010-01-04" "2010-01-05" "2010-01-06" "2010-01-07" ...
as.Date in for loop performing unexpectedly
A single atomic vector can only be of a single class.
When you use [<- to replace a single value of df.for, R can't hold both the values you have not changed (character strings that look like dates) and a Date class value (a number that is formatted and displayed like a character string) in the same vector. It therefore coerces everything to character.
You could get around this by making df.for a list, e.g.
df.for <- as.list(df.1)
for (i in seq_along(df.1)){
df.for[[i]] <- as.Date(df.1[i], format="%d-%b-%y")
}
Or by coercing the results back to a Date at the end of the loop (via numeric), e.g.
df.for <- df.1
for (i in seq_along(df.1)){
df.for[i] <- as.Date(df.1[i], format="%d-%b-%y")
}
as.Date(as.numeric(df.for),origin = '1970-01-01')
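Since as.Date is vectorized, the loop can usually be skipped altogether. The sketch below assumes a made-up character vector of ISO-formatted strings (not the "%d-%b-%y" data from the question, whose month abbreviations depend on the locale); broken and converted are illustrative names. It first reproduces the coercion described above, then converts in one call.

```r
# Hypothetical input: ISO-formatted date strings
df.1 <- c("2012-03-01", "2012-03-02")

# Assigning a Date into a character vector stores its day count as a string
broken <- df.1
broken[1] <- as.Date(df.1[1], format = "%Y-%m-%d")
print(broken)  # first element is now a number-like string, not a date

# Converting the whole vector at once avoids the mixed-type problem
converted <- as.Date(df.1, format = "%Y-%m-%d")
class(converted)
```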
Joining multiple time sequences together
Try this:
do.call("c", mapply(seq, from, to, by = 60, SIMPLIFY = FALSE))
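A self-contained sketch of the same idea, assuming from and to are POSIXct vectors of equal length (the values here are made up): mapply builds one 60-second sequence per pair, SIMPLIFY = FALSE keeps them as a list, and do.call("c", ...) joins them without dropping the class.

```r
# Hypothetical start and end times: two intervals of two minutes each
from <- as.POSIXct(c("2020-01-01 00:00:00", "2020-01-01 01:00:00"), tz = "UTC")
to   <- from + 120

# One sequence per (from, to) pair, stepping by 60 seconds
pieces <- mapply(seq, from, to, by = 60, SIMPLIFY = FALSE)

# Joining with c() keeps the POSIXct class; unlist() would return bare numbers
joined <- do.call("c", pieces)

length(joined)
class(joined)
```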
Passing Sequence of Dates to dataframe column
It is better to initialize the data frame directly from the Date vector:
DF_1 <- data.frame(days)
str(DF_1)
'data.frame': 2006 obs. of 1 variable:
$ days: Date, format: "2016-01-01" "2016-01-02" "2016-01-03" "2016-01-04" ...
Or, if we still want to use a for loop, initialize the column with Date class instead of logical (matrix creates NA rows, which are logical):
DF_1 <- data.frame(col1 = as.Date(rep(NA, length(days))))
Now, if we do the loop
for (i in 1:length(days)) {
print(days[i])
DF_1[i,1] <- days[i]
}
Checking the class:
str(DF_1)
'data.frame': 2006 obs. of 1 variable:
$ col1: Date, format: "2016-01-01" "2016-01-02" "2016-01-03" "2016-01-04" ...
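The same preallocation idea works for a bare vector. A minimal sketch, assuming a short made-up days sequence rather than the 2006-element one above; out is an illustrative name.

```r
# Hypothetical five-day sequence
days <- seq(as.Date("2016-01-01"), by = "day", length.out = 5)

# Preallocate with Date-class NAs so element assignment keeps the class
out <- rep(as.Date(NA), length(days))
for (i in seq_along(days)) {
  out[i] <- days[i]
}

class(out)
print(out)
```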
The issue is the coercion of Date to its underlying numeric storage values. The same behavior shows up with unlist
unlist(as.list(head(days)))
[1] 16801 16802 16803 16804 16805 16806
or with unclass
unclass(head(days))
[1] 16801 16802 16803 16804 16805 16806
which can be corrected with c in do.call if the input is a list
do.call(c, as.list(head(days)))
[1] "2016-01-01" "2016-01-02" "2016-01-03" "2016-01-04" "2016-01-05" "2016-01-06"
or by converting the numbers back to Date class afterwards, specifying the origin in as.Date
as.Date(unlist(as.list(head(days))), origin = '1970-01-01')
[1] "2016-01-01" "2016-01-02" "2016-01-03" "2016-01-04" "2016-01-05" "2016-01-06"