Unexpected date when converting POSIXct date-time to Date - timezone issue?
Using the setup in the Note at the end we can use any of these:
# same date as print(x) shows
as.Date(as.character(x))
## [1] "2020-03-24"
# use the time zone stored in x (or system time zone if that is "")
as.Date(x, tz = attr(x, "tzone"))
## [1] "2020-03-24"
# use system time zone
as.Date(x, tz = "")
## [1] "2020-03-24"
# use system time zone
as.Date(x, tz = Sys.timezone())
## [1] "2020-03-24"
# use indicated time zone
as.Date(x, tz = "Asia/Calcutta")
## [1] "2020-03-24"
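For contrast, here is a minimal sketch of what goes wrong when the conversion is done in UTC. It assumes the same x as in the Note, and passes tz explicitly so the result does not depend on the session's TZ setting:

```r
# Same instant as in the Note
x <- structure(1584988320, class = c("POSIXct", "POSIXt"), tzone = "")

# Clock time of that instant in UTC: still the previous evening
utc_clock <- format(x, "%Y-%m-%d %H:%M:%S", tz = "UTC")
utc_clock
## [1] "2020-03-23 18:32:00"

# So truncating to a date in UTC gives the earlier day
utc_date <- as.Date(x, tz = "UTC")
utc_date
## [1] "2020-03-23"
```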
Note
We have assumed this setup.
Sys.setenv(TZ = "Asia/Calcutta")
x <- structure(1584988320, class = c("POSIXct", "POSIXt"), tzone = "")
R.version.string
## [1] "R version 4.0.2 Patched (2020-06-24 r78745)"
as.POSIXct gives an unexpected timezone
This is because as.POSIXct.Date doesn't pass ... to .POSIXct, so any tz argument you supply is silently dropped (in the version shown here):
> as.POSIXct.Date
function (x, ...)
.POSIXct(unclass(x) * 86400)
<environment: namespace:base>
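A small sketch of the consequence, using a date chosen only for illustration: the Date is mapped to midnight UTC, and one possible way to get midnight in a particular zone instead is to go through the character representation. The zone name is an assumption for the example.

```r
d <- as.Date("2020-03-24")

# Date -> POSIXct counts whole days since 1970-01-01, i.e. midnight UTC
utc_midnight <- as.POSIXct(d)
as.numeric(utc_midnight) == as.numeric(d) * 86400
## [1] TRUE

# Midnight in a chosen zone: parse the date string with an explicit tz
local_midnight <- as.POSIXct(format(d), tz = "Asia/Calcutta")
format(local_midnight, "%Y-%m-%d %H:%M:%S")
## [1] "2020-03-24 00:00:00"
```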
R as.POSIXct timezone issue
If you use as.POSIXct without specifying the timezone, the result is displayed in your local timezone. Character input is parsed as local time, but Date input and numeric input (seconds since an origin) are taken as UTC and then shown in local time, which is where the unexpected shift comes from.
Try as.POSIXct(..., tz=<enter your timezone>)
If you do not know your timezone, Sys.timezone() will tell you what it is.
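As a rough illustration of why the tz argument matters, the sketch below parses the same string in two zones. The string and the zone "Asia/Hong_Kong" are arbitrary choices for the example, not taken from the question:

```r
s <- "2013-01-01 07:00:00"

# The same clock time read in two different zones is two different instants
in_utc <- as.POSIXct(s, tz = "UTC")
in_hk  <- as.POSIXct(s, tz = "Asia/Hong_Kong")

# Hong Kong is 8 hours ahead of UTC, so 07:00 there happens 8 hours earlier
gap_hours <- as.numeric(difftime(in_utc, in_hk, units = "hours"))
gap_hours
## [1] 8
```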
Date conversion from POSIXct to Date in R
The problem here is timezones - you can see you're in "HKT". Try:
as.Date(as.POSIXct("2013-01-01 07:00", 'GMT'))
[1] "2013-01-01"
From ?as.Date:
["POSIXct" is] converted to days by ignoring the time after midnight in the representation of the time in specified timezone, default UTC
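To make the quoted behaviour concrete, here is a sketch of the failing case next to the fix. It assumes "HKT" corresponds to the zone name "Asia/Hong_Kong", and sets tz explicitly in every call so it runs the same on any machine:

```r
# 07:00 in Hong Kong is 23:00 the previous day in UTC
p <- as.POSIXct("2013-01-01 07:00", tz = "Asia/Hong_Kong")

# Truncating in UTC lands on the previous day
date_utc <- as.Date(p, tz = "UTC")
date_utc
## [1] "2012-12-31"

# Truncating in the object's own zone keeps the expected day
date_hk <- as.Date(p, tz = "Asia/Hong_Kong")
date_hk
## [1] "2013-01-01"
```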
Trouble dealing with POSIXct timezones and truncating the time out of POSIXct objects
If you don't specify a timezone then R will use your system's timezone, since a POSIXct object is always displayed in some timezone. The difference between CEST and CET is that one is summer time and one is not. That means if you define a date during the part of the year defined as summer time then R will use the summer time version of the timezone. If you want to set dates that don't use summer time versions then define them as GMT from the beginning.
formatString = "%Y-%m-%d %H:%M:%OS"
x = as.POSIXct(strptime("2013-11-23 23:10:38.000000", formatString), tz="GMT")
y = as.POSIXct(strptime("2015-07-17 01:43:38.000000", formatString), tz="GMT")
If you want to truncate out the time, be careful with as.Date on a POSIXct object: it returns a Date object (which isn't the same as a POSIXct object), and its tz argument defaults to UTC, so the date can shift unless you pass the timezone you mean. If you want to truncate POSIXct objects with base R and keep them as POSIXct then you'll have to wrap either round or trunc in as.POSIXct, but I would recommend checking out the lubridate package for dealing with dates and times (specifically POSIXct objects).
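A minimal base R sketch of the trunc approach, reusing the first timestamp from above. The variable name x_day is introduced here only for illustration:

```r
x <- as.POSIXct("2013-11-23 23:10:38", tz = "GMT")

# trunc() returns a POSIXlt; wrap it in as.POSIXct() to get POSIXct back
x_day <- as.POSIXct(trunc(x, units = "days"))

format(x_day, "%Y-%m-%d %H:%M:%S")
## [1] "2013-11-23 00:00:00"

# The timezone of the original object is carried over
attr(x_day, "tzone")
## [1] "GMT"
```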
If you want to keep CET but never use CEST you can use a location that doesn't observe daylight savings. According to http://www.timeanddate.com/time/zones/cet your only options are Algeria and Tunisia. According to https://en.wikipedia.org/wiki/List_of_tz_database_time_zones the valid tz would be "Africa/Algiers". Therefore you could do
formatString = "%Y-%m-%d %H:%M:%OS"
x = as.POSIXct(strptime("2013-11-23 23:10:38.000000", formatString), tz="Africa/Algiers")
y = as.POSIXct(strptime("2015-07-17 01:43:38.000000", formatString), tz="Africa/Algiers")
and both x and y would be in CET.
One more thing about setting timezones. The name "CET" refers to a timezone that has daylight saving rules, so R still shows summer dates in CEST. That's why setting attr(y, "tzone") <- "CET" didn't have the desired result. If you did attr(y, "tzone") <- "Africa/Algiers" then it would have worked as you expected. Do be careful with conversions though: changing the timezone keeps the same instant and shifts the displayed clock time to match the new timezone. The package lubridate has the function force_tz which changes the timezone without changing the clock time, for cases where the initial timezone setting was wrong but the time was right.
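The difference between the two kinds of change can be sketched in base R. Re-parsing the formatted clock time, as done for y_forced below, is only a rough stand-in for what force_tz does, not its implementation:

```r
fmt <- "%Y-%m-%d %H:%M:%S"
y <- as.POSIXct("2015-07-17 01:43:38", tz = "GMT")

# Changing the tzone attribute: same instant, different displayed clock time
y_shown <- y
attr(y_shown, "tzone") <- "Africa/Algiers"
format(y_shown, fmt)
## [1] "2015-07-17 02:43:38"
as.numeric(y_shown) == as.numeric(y)
## [1] TRUE

# Re-parsing the clock time in the new zone: same clock time, different instant
y_forced <- as.POSIXct(format(y, fmt), tz = "Africa/Algiers")
format(y_forced, fmt)
## [1] "2015-07-17 01:43:38"
as.numeric(y) - as.numeric(y_forced)
## [1] 3600
```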
Converting character to POSIXct in R loses time zone
As we all know, time is a relative thing. Storing time as UTC/GMT or relative to UTC/GMT will make sure that daylight savings etc only come into play when you want them to, as per: Does UTC observe daylight saving time?
So, if:
x <- c("1956-05-25 14:30:00 CST","1956-06-05 16:30:00 CST", "1956-07-04 15:30:00 CST",
"1956-07-08 08:00:00 CST", "1956-08-19 12:00:00 CST","1956-12-23 00:50:00 CST")
You can find out that CST is 6 hours behind UTC/GMT (as opposed to CDT, which is daylight saving time and is 5 hours behind).
Therefore:
out <- as.POSIXct(x, tz = "Etc/GMT+6")
will represent CST without any daylight savings shift to CDT.
That way when or if you convert to local central timezones, the proper CST time will be returned without changing the actual data for daylight savings. (I.e. when R prints CDT, it is only shifting the display of the time forward an hour, but the underlying numerical data is not changed. The last case displays as expected when standard time kicks back in):
attr(out,"tzone") <- "America/Chicago"
out
#[1] "1956-05-25 15:30:00 CDT" "1956-06-05 17:30:00 CDT" "1956-07-04 16:30:00 CDT"
#[4] "1956-07-08 09:00:00 CDT" "1956-08-19 13:00:00 CDT" "1956-12-23 00:50:00 CST"
I.e. for case 1, 15:30 CDT == 14:30 CST as originally specified, and when daylight savings stops, for case 6, 00:50 CST == 00:50 CST as originally specified.
Comparing this final out to the other answer, you can see there is an actual numerical time difference of one hour for all the daylight savings cases:
out - strptime(x, format="%Y-%m-%d %H:%M:%S", tz="America/Chicago")
#Time differences in secs
#[1] 3600 3600 3600 3600 3600 0
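One detail that is easy to trip over is the sign in the fixed-offset zone names: "Etc/GMT+6" means 6 hours behind UTC, not ahead. A quick sketch to check the offset, using the first timestamp from x (the %z output assumes a platform with the usual tz database):

```r
p <- as.POSIXct("1956-05-25 14:30:00", tz = "Etc/GMT+6")

# Offset from UTC for this fixed zone
offset <- format(p, "%z")
offset
## [1] "-0600"

# The same instant shown on the Chicago clock, which was on daylight time
chicago_clock <- format(p, "%H:%M", tz = "America/Chicago")
chicago_clock
## [1] "15:30"
```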
NA result in converting string to POSIXct date time in R
This might be related to daylight saving time in your local timezone, since as.POSIXct uses the local timezone by default. Try setting the timezone to UTC.
as.POSIXct("20210328 02:00:00", format = "%Y%m%d %H:%M:%S", tz = 'UTC')
#[1] "2021-03-28 02:00:00 UTC"
You can also use lubridate::ymd_hms which uses UTC timezone by default.
lubridate::ymd_hms("20210328 02:00:00")
R: strptime() and is.na () unexpected results
The problem is likely that all the times that return NA do not exist in whatever timezone you're using, due to daylight saving time.
Check with the data source to determine the timezone the data were recorded in, then set the tz argument to that value in your call to strptime.
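One possible way to find the affected strings is to parse them in the suspected zone and check whether they survive a round trip. This is only a sketch: "Europe/Berlin" is an assumed zone for illustration, and since the result for a nonexistent time is platform specific (NA on some systems, a shifted time on others), the check covers both outcomes.

```r
fmt <- "%Y%m%d %H:%M:%S"
s <- c("20210328 01:00:00", "20210328 02:00:00", "20210328 03:00:00")

# Parse in the zone the data were (supposedly) recorded in
local <- as.POSIXct(s, format = fmt, tz = "Europe/Berlin")

# A time in the daylight saving gap either fails to parse
# or comes back as a different clock time
in_gap <- is.na(local) | format(local, fmt) != s
in_gap
## [1] FALSE  TRUE FALSE

# Parsing as UTC never hits a gap, because UTC has no daylight saving time
utc <- as.POSIXct(s, format = fmt, tz = "UTC")
anyNA(utc)
## [1] FALSE
```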