Is R superstitious regarding POSIXct data type
The problem is that there was no such time as "2016-03-13 02:56:16" in a time zone that follows US daylight saving rules. 13 Mar 2016 was the day Daylight Saving Time started: at 2 AM that day, the clock jumped straight to 3 AM.
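A minimal sketch of the gap, assuming the data was recorded in "America/New_York" (the original does not name the zone, so that choice is an assumption). It compares the last second before the jump with the first instant after it, and shows that the same wall-clock string is valid in UTC. What as.POSIXct() returns for the nonexistent time itself depends on the platform, so the sketch does not rely on it.

```r
# Assumed zone: any zone following US DST rules would do; the source names none.
tz <- "America/New_York"

before <- as.POSIXct("2016-03-13 01:59:59", tz = tz)
after  <- as.POSIXct("2016-03-13 03:00:00", tz = tz)

# Elapsed time between the two wall-clock readings, in seconds.
gap <- as.numeric(difftime(after, before, units = "secs"))
print(gap)

# The same string parses to a valid instant when interpreted as UTC.
utc_time <- as.POSIXct("2016-03-13 02:56:16", tz = "UTC")
print(utc_time)
```

The wall clock appears to advance by an hour and one second, yet only a single second elapses, which is why no 02:xx time exists on that date in this zone.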
Possible bug in `difftime` - calculating difference in date time in R
You gained an hour because of the Daylight Saving Time changeover (Sunday 10/30/2005 02:00:00).
You can change this by calling as.POSIXct(..., tz = 'UTC'), or with whatever time zone the data is supposed to be in; UTC makes things unambiguous and avoids DST changes.
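As a rough illustration, assuming the original timestamps were parsed in "America/New_York" (an assumption, since the question's zone is not shown here), the same pair of midnights is 25 hours apart in local time and 24 hours apart in UTC:

```r
# Assumed local zone; the source does not state which zone was in effect.
tz <- "America/New_York"

# Local time: the day of the changeover contains an extra hour.
start_local <- as.POSIXct("2005-10-30 00:00:00", tz = tz)
end_local   <- as.POSIXct("2005-10-31 00:00:00", tz = tz)
hours_local <- as.numeric(difftime(end_local, start_local, units = "hours"))

# UTC: no DST, so midnight to midnight is always 24 hours.
start_utc <- as.POSIXct("2005-10-30 00:00:00", tz = "UTC")
end_utc   <- as.POSIXct("2005-10-31 00:00:00", tz = "UTC")
hours_utc <- as.numeric(difftime(end_utc, start_utc, units = "hours"))

print(c(local = hours_local, utc = hours_utc))
```

Passing units explicitly also keeps difftime() from picking a unit automatically, which makes the two results directly comparable.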
If you want to change the default time zone for all as.POSIXct() calls, see How to change the default time zone in R?, which suggests:
- [as an R command] Sys.setenv(TZ='GMT')
or
- [R setup file] edit TZ="UTC" into Renviron.site
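A small sketch of the session-level option, under the assumption that changing TZ mid-session takes effect on your platform (it usually does, but this is worth checking). It saves the previous setting, switches to UTC, parses a timestamp without an explicit tz argument, and then restores the old value:

```r
# Remember the current setting so it can be restored afterwards.
old_tz <- Sys.getenv("TZ", unset = NA)

Sys.setenv(TZ = "UTC")

# No tz argument: the session default (now UTC) is used.
x <- as.POSIXct("2016-03-13 02:56:16")
print(format(x, "%Y-%m-%d %H:%M:%S %Z"))

# Reference value parsed with an explicit zone, for comparison.
ref <- as.POSIXct("2016-03-13 02:56:16", tz = "UTC")
same_instant <- as.numeric(x) == as.numeric(ref)

# Restore the previous environment.
if (is.na(old_tz)) Sys.unsetenv("TZ") else Sys.setenv(TZ = old_tz)
```

Setting tz on each call is more explicit and does not affect other code in the session; the environment variable is mainly convenient when every timestamp in a script should share one zone.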
Need an efficient way for bucketing date difference
Changing from a loop to a functional like lapply() or map() won’t make your code significantly faster. Those functions still loop under the hood; they just take care of the boilerplate needed to store the result.
The way to improve performance by orders of magnitude here is to rewrite FunTenor() to work with a vector argument rather than a scalar one. Here’s one way to do that:
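tenor <- function(x) {
  # x is a vector of lubridate Period objects; total length in whole months
  months <- year(x) * 12 + month(x)
  # cut() intervals are right-closed by default, matching the <= tests in FunTenor()
  ifelse(months == 0,
         as.character(cut(day(x),
                          breaks = c(-Inf, 1, 7, 14, Inf),
                          labels = c("1D", "7D", "14D", "1M"))),
         as.character(cut(months,
                          breaks = c(-Inf, 2, 3, 6, 12, 36, Inf),
                          labels = c("2M", "3M", "6M", "1Y", "3Y", "5Y")))
  )
}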
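To see why cut() can stand in for the chain of comparisons, here is a base R sketch covering only the months branch. bucket_scalar() is a hypothetical helper written for this illustration, not part of the original answer; it mirrors the scalar if/else logic so the two approaches can be compared on plain numbers.

```r
# Hypothetical scalar helper mirroring the if/else chain for months > 0.
bucket_scalar <- function(m) {
  if (m <= 2) "2M"
  else if (m <= 3) "3M"
  else if (m <= 6) "6M"
  else if (m <= 12) "1Y"
  else if (m <= 36) "3Y"
  else "5Y"
}

months <- c(1, 2, 3, 5, 12, 13, 40)

# Vectorised: one call handles the whole vector; intervals are (lower, upper].
vectorised <- as.character(cut(months,
                               breaks = c(-Inf, 2, 3, 6, 12, 36, Inf),
                               labels = c("2M", "3M", "6M", "1Y", "3Y", "5Y")))

# Element-by-element: one function call per value.
looped <- vapply(months, bucket_scalar, character(1))

print(vectorised)
print(identical(vectorised, looped))
```

Because each interval includes its upper bound, a value of exactly 12 lands in "1Y" and 13 in "3Y", the same as the <= tests.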
And here’s a benchmark with 10 000 periods to show the difference:
library(microbenchmark)
library(lubridate)
library(purrr)
Dates <- data.frame(
  VAL_DATE = c("2015-07-27", "2015-09-15", "2016-06-24", "2016-06-23", "2015-09-17", "2015-06-22"),
  MAT_DATE = c("2016-07-27", "2016-09-15", "2016-08-08", "2017-06-23", "2016-09-16", "2017-06-22")
)
dtDiff <- as.period(interval(ymd(Dates$VAL_DATE), ymd(Dates$MAT_DATE)))
FunTenor <- function(x) {
  if (x@year * 12 + x@month == 0) {
    if (x@day <= 1) "1D"
    else if (x@day <= 7) "7D"
    else if (x@day <= 14) "14D"
    else "1M"
  } else if ((x@year * 12 + x@month) <= 2) "2M"
  else if ((x@year * 12 + x@month) <= 3) "3M"
  else if ((x@year * 12 + x@month) <= 6) "6M"
  else if ((x@year * 12 + x@month) <= 12) "1Y"
  else if ((x@year * 12 + x@month) <= 36) "3Y"
  else "5Y"
}
set.seed(42)
x <- dtDiff[sample(length(dtDiff), 10000, replace = TRUE)]
print(microbenchmark(map_chr(x, FunTenor), tenor(x), times = 2), digits = 2)
#> Unit: milliseconds
#>                  expr    min     lq   mean median     uq    max neval cld
#>  map_chr(x, FunTenor) 4641.5 4641.5 4662.8 4662.8 4684.1 4684.1     2   b
#>              tenor(x)    4.4    4.4    6.5    6.5    8.5    8.5     2  a
Created on 2019-07-17 by the reprex package (v0.3.0.9000)