Is R Superstitious Regarding POSIXct Data Type

Is R superstitious regarding POSIXct data type

The problem is that the time "2016-03-13 02:56:16" never existed in time zones that follow US Daylight Saving Time rules. 13 Mar 2016 was the day Daylight Saving Time started: at 2 AM that day, the clock jumped straight to 3 AM.
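A quick way to see the gap is to measure the time between the last second before the jump and the first instant after it. This is only a sketch: the zone name America/New_York is an assumption, since the original question's time zone is not shown here.

```r
# Assumption: the data was recorded in a US zone such as America/New_York.
tz <- "America/New_York"

before <- as.POSIXct("2016-03-13 01:59:59", tz = tz)  # still standard time
after  <- as.POSIXct("2016-03-13 03:00:00", tz = tz)  # already daylight time

# Wall-clock labels are an hour apart, but only one second actually elapsed.
gap <- difftime(after, before, units = "secs")
print(gap)

# The zone abbreviation changes across the jump.
print(c(format(before, "%Z"), format(after, "%Z")))
```

Any timestamp labelled between 02:00:00 and 02:59:59 on that date falls inside the skipped hour, so it does not correspond to a real instant in that zone.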

Possible bug in `difftime` - calculating difference in date time in R

You gained an hour because of the Daylight Saving Time changeover (Sunday 10/30/2005 02:00:00), when the clocks were set back.

You can change this by calling as.POSIXct(..., tz = 'UTC'), substituting whatever time zone the data is supposed to be in; UTC makes things unambiguous and avoids DST changes.
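To illustrate the extra hour, here is a small sketch that compares the same two wall-clock timestamps in a DST-observing zone and in UTC. The zone America/New_York and the midnight-to-midnight timestamps are assumptions chosen for illustration, not the values from the original question.

```r
# Two wall-clock timestamps spanning the 2005 fall-back changeover.
start <- "2005-10-30 00:00:00"
end   <- "2005-10-31 00:00:00"

# Assumption: a US zone that observed DST in 2005.
local_diff <- difftime(as.POSIXct(end,   tz = "America/New_York"),
                       as.POSIXct(start, tz = "America/New_York"),
                       units = "hours")

# The same labels interpreted as UTC, which has no DST.
utc_diff <- difftime(as.POSIXct(end,   tz = "UTC"),
                     as.POSIXct(start, tz = "UTC"),
                     units = "hours")

print(local_diff)  # 25 hours: the repeated hour is counted
print(utc_diff)    # 24 hours
```

Neither result is wrong; they answer different questions. Parsing as UTC is appropriate only if the data really is in UTC or if you want to ignore elapsed-time effects of DST.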

If you want to change the default time zone for all as.POSIXct() calls, see "How to change the default time zone in R?", which suggests:

  • [as an R command] Sys.setenv(TZ='GMT') or
  • [R setup file] add TZ="UTC" to Renviron.site
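As a rough sketch of the first option, the snippet below sets TZ for the current session, parses a timestamp without an explicit tz argument, and then restores the previous setting. The restore step is an addition for tidiness and is not part of the linked suggestion.

```r
# Remember the current setting so it can be restored afterwards.
old_tz <- Sys.getenv("TZ", unset = NA)

# Make UTC the session default time zone.
Sys.setenv(TZ = "UTC")

# No tz argument: as.POSIXct() falls back to the session time zone.
x <- as.POSIXct("2016-03-13 02:56:16")
parsed <- format(x, "%Y-%m-%d %H:%M:%S %Z")
print(parsed)

# Restore the previous setting (or unset TZ if it was not set before).
if (is.na(old_tz)) Sys.unsetenv("TZ") else Sys.setenv(TZ = old_tz)
```

With UTC as the default, the timestamp that falls into the skipped hour under US DST rules parses without trouble, because UTC has no such gap.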

Need an efficient way for bucketing date difference

Changing from a loop to iterating with a functional like lapply() or map() won’t make your code significantly faster. Those functions still do a loop under the hood; they just take care of some of the boilerplate code that you need to store the result.
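A toy sketch of that point, using a hypothetical scalar helper bucket_one() that is not from the question: the explicit loop and vapply() both call the helper once per element, while the vectorized form makes a single call on the whole vector.

```r
# Hypothetical scalar helper: handles exactly one value per call.
bucket_one <- function(d) if (d <= 7) "short" else "long"

days <- c(1, 5, 9, 30)

# Explicit loop: allocate the result, then fill it element by element.
out_loop <- character(length(days))
for (i in seq_along(days)) {
  out_loop[i] <- bucket_one(days[i])
}

# Functional: same per-element calls, but the bookkeeping is handled for you.
out_vapply <- vapply(days, bucket_one, character(1))

# Vectorized: one call that operates on the whole vector at once.
out_vec <- ifelse(days <= 7, "short", "long")

print(identical(out_loop, out_vapply))
print(identical(out_loop, out_vec))
```

All three give the same answer; only the last one avoids calling an R function once per element, which is where the time goes for large inputs.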

The way to improve performance by orders of magnitude here is to rewrite FunTenor() to work with a vector argument rather than a scalar one. Here’s one way to do that:

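# Vectorized tenor bucketing; x is a lubridate Period vector.
# year(), month() and day() come from lubridate.
tenor <- function(x) {
  months <- year(x) * 12 + month(x)

  # cut() intervals are right-closed by default, matching the `<=` tests.
  ifelse(months == 0,
         as.character(cut(day(x),
                          breaks = c(-Inf, 1, 7, 14, Inf),
                          labels = c("1D", "7D", "14D", "1M"))),
         as.character(cut(months,
                          breaks = c(-Inf, 2, 3, 6, 12, 36, Inf),
                          labels = c("2M", "3M", "6M", "1Y", "3Y", "5Y")))
  )
}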
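The reason cut() can stand in for a chain of `<=` tests is that its intervals are right-closed by default, so a value equal to a break falls into the lower bucket. The base-R sketch below checks this for the day buckets on a few hand-picked values; the helper day_bucket_scalar() is a hypothetical name introduced only for the comparison.

```r
# Hypothetical scalar version of the day buckets, written as an if/else chain.
day_bucket_scalar <- function(d) {
  if (d <= 1) "1D" else if (d <= 7) "7D" else if (d <= 14) "14D" else "1M"
}

# Values chosen to sit on and around each break.
days <- c(0, 1, 2, 7, 8, 14, 15)

# Vectorized version: one cut() call over the whole vector.
by_cut <- as.character(cut(days,
                           breaks = c(-Inf, 1, 7, 14, Inf),
                           labels = c("1D", "7D", "14D", "1M")))

# Scalar version applied element by element, for comparison only.
by_chain <- vapply(days, day_bucket_scalar, character(1))

print(by_cut)
print(identical(by_cut, by_chain))
```

The same reasoning applies to the month buckets, with the breaks 2, 3, 6, 12 and 36.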

And here’s a benchmark with 10 000 periods to show the difference:

library(microbenchmark)
library(lubridate)
library(purrr)

Dates <- data.frame(
  VAL_DATE = c("2015-07-27", "2015-09-15", "2016-06-24",
               "2016-06-23", "2015-09-17", "2015-06-22"),
  MAT_DATE = c("2016-07-27", "2016-09-15", "2016-08-08",
               "2017-06-23", "2016-09-16", "2017-06-22")
)

dtDiff <- as.period(interval(ymd(Dates$VAL_DATE), ymd(Dates$MAT_DATE)))

FunTenor <- function(x) {
  if (x@year * 12 + x@month == 0) {
    if (x@day <= 1) "1D" else if (x@day <= 7) "7D" else if (x@day <= 14) "14D" else "1M"
  } else if ((x@year * 12 + x@month) <= 2) {
    "2M"
  } else if ((x@year * 12 + x@month) <= 3) {
    "3M"
  } else if ((x@year * 12 + x@month) <= 6) {
    "6M"
  } else if ((x@year * 12 + x@month) <= 12) {
    "1Y"
  } else if ((x@year * 12 + x@month) <= 36) {
    "3Y"
  } else {
    "5Y"
  }
}

set.seed(42)

x <- dtDiff[sample(length(dtDiff), 10000, replace = TRUE)]

print(microbenchmark(map_chr(x, FunTenor), tenor(x), times = 2), digits = 2)
#> Unit: milliseconds
#>                  expr    min     lq   mean median     uq    max neval cld
#>  map_chr(x, FunTenor) 4641.5 4641.5 4662.8 4662.8 4684.1 4684.1     2   b
#>              tenor(x)    4.4    4.4    6.5    6.5    8.5    8.5     2  a

Created on 2019-07-17 by the reprex package (v0.3.0.9000)


