Unexpected Date When Converting POSIXct Date-Time to Date - Timezone Issue

Using the setup in the Note at the end we can use any of these:

# same date as print(x) shows
as.Date(as.character(x))
## [1] "2020-03-24"

# use the time zone stored in x (or system time zone if that is "")
as.Date(x, tz = attr(x, "tzone"))
## [1] "2020-03-24"

# use system time zone
as.Date(x, tz = "")
## [1] "2020-03-24"

# use system time zone
as.Date(x, tz = Sys.timezone())
## [1] "2020-03-24"

# use indicated time zone
as.Date(x, tz = "Asia/Calcutta")
## [1] "2020-03-24"
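For contrast, here is a minimal sketch of what happens when no tz is given. It uses the same instant as x in the Note, but stores the zone explicitly instead of relying on Sys.setenv (that change is an assumption made to keep the example self-contained). as.Date on a POSIXct defaults to tz = "UTC", so the date can differ from what print(x) shows.

```r
# Same instant as in the Note, with the zone stored on the object
x <- as.POSIXct(1584988320, origin = "1970-01-01", tz = "Asia/Calcutta")

# The instant expressed in UTC falls on the previous calendar day
format(x, "%Y-%m-%d %H:%M", tz = "UTC")
## [1] "2020-03-23 18:32"

# Default tz = "UTC": the "unexpected" date
as.Date(x)
## [1] "2020-03-23"

# Explicit tz: the date that print(x) shows
as.Date(x, tz = "Asia/Calcutta")
## [1] "2020-03-24"
```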

Note

We have assumed this setup.

Sys.setenv(TZ = "Asia/Calcutta")
x <- structure(1584988320, class = c("POSIXct", "POSIXt"), tzone = "")

R.version.string
## [1] "R version 4.0.2 Patched (2020-06-24 r78745)"

as.POSIXct gives an unexpected timezone

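This is because, in the R version shown here, as.POSIXct.Date doesn't pass ... to .POSIXct. Any tz argument is silently dropped, so the result is midnight UTC with no tzone attribute and it prints in the local time zone.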

> as.POSIXct.Date
function (x, ...)
.POSIXct(unclass(x) * 86400)
<environment: namespace:base>
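One possible workaround, sketched here under the assumption that you want midnight in a specific zone rather than midnight UTC, is to go through a character string so that tz is applied when the string is parsed. The date and zone below are illustrative only.

```r
d <- as.Date("2020-03-24")

# as.POSIXct on a Date yields midnight UTC of that day
utc_midnight <- as.POSIXct(d)
as.numeric(utc_midnight)
## [1] 1585008000

# Going through a string lets tz decide which midnight is meant
local_midnight <- as.POSIXct(format(d), tz = "Asia/Calcutta")
as.numeric(local_midnight)
## [1] 1584988200
```

The two results differ by 5.5 hours, the UTC offset of Asia/Calcutta.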

R as.POSIXct timezone issue

If you use as.POSIXct without specifying the timezone, the result depends on the input: character strings are interpreted in your local timezone, while Date and numeric inputs are treated as UTC and then displayed in your local timezone. These defaults are often not what you want.

Try as.POSIXct(..., tz=<enter your timezone>)

If you do not know your timezone, Sys.timezone() will tell you what it is (the location argument is no longer needed).
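As a small illustration of why the tz argument matters, the sketch below parses the same string in two zones. The string and the zone names are assumptions chosen for the example; the point is only that the same clock time maps to different instants.

```r
s <- "2013-01-01 07:00:00"

utc <- as.POSIXct(s, tz = "UTC")
hkt <- as.POSIXct(s, tz = "Asia/Hong_Kong")

# 07:00 in Hong Kong happens 8 hours before 07:00 UTC
gap_hours <- as.numeric(difftime(utc, hkt, units = "hours"))
gap_hours
## [1] 8
```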

Date conversion from POSIXct to Date in R

The problem here is timezones - you can see you're in "HKT". Try:

as.Date(as.POSIXct("2013-01-01 07:00", tz = "GMT"))
[1] "2013-01-01"

From ?as.Date:

["POSIXct" is] converted to days by ignoring the time after midnight
in the representation of the time in specified timezone, default UTC
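To make the quoted default concrete, here is a hedged sketch that assumes "HKT" corresponds to the tz database name "Asia/Hong_Kong". 07:00 in Hong Kong is still 23:00 of the previous day in UTC, so the default conversion lands on the earlier date.

```r
p <- as.POSIXct("2013-01-01 07:00", tz = "Asia/Hong_Kong")

# Default tz = "UTC": previous calendar day
as.Date(p)
## [1] "2012-12-31"

# Ask for the date in the zone the time was recorded in
as.Date(p, tz = "Asia/Hong_Kong")
## [1] "2013-01-01"
```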

Trouble dealing with POSIXct timezones and truncating the time out of POSIXct objects

If you don't specify a timezone then R will use your system's time zone setting, since a POSIXct object needs a timezone in order to be displayed. The difference between CEST and CET is that CEST is summer time (UTC+2) and CET is standard time (UTC+1). That means if you define a date-time that falls in the summer-time part of the year, R will display it with the summer-time version of the timezone. If you want date-times that never use summer time, define them as GMT from the beginning.

formatString = "%Y-%m-%d %H:%M:%OS"
x = as.POSIXct(strptime("2013-11-23 23:10:38.000000", formatString), tz="GMT")
y = as.POSIXct(strptime("2015-07-17 01:43:38.000000", formatString), tz="GMT")
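The CET/CEST switch can be seen directly with the %Z format code. This sketch assumes "Europe/Paris" as a representative Central European zone (the original question does not name one) and reuses the two timestamps above.

```r
fmt <- "%Y-%m-%d %H:%M:%S"

winter <- as.POSIXct("2013-11-23 23:10:38", format = fmt, tz = "Europe/Paris")
summer <- as.POSIXct("2015-07-17 01:43:38", format = fmt, tz = "Europe/Paris")

format(winter, "%Z")
## [1] "CET"
format(summer, "%Z")
## [1] "CEST"

# Defined as GMT, the same clock time never picks up a summer-time label
summer_gmt <- as.POSIXct("2015-07-17 01:43:38", format = fmt, tz = "GMT")
format(summer_gmt, "%Z")
## [1] "GMT"
```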

If you want to truncate out the time, be careful with as.Date on a POSIXct object: it returns a Date (which isn't the same as a POSIXct object), and by default it uses UTC rather than the object's own timezone, so pass tz explicitly. If you want to truncate POSIXct objects with base R and keep the POSIXct class, you'll have to wrap either round or trunc in as.POSIXct, but I would recommend checking out the lubridate package for dealing with dates and times (specifically POSIXct objects).
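A minimal base R sketch of the truncation described above, using one of the timestamps from this answer. trunc returns a POSIXlt, which is why it is wrapped in as.POSIXct.

```r
x <- as.POSIXct("2013-11-23 23:10:38", tz = "GMT")

# Truncate to the start of the day and convert back to POSIXct
day_start <- as.POSIXct(trunc(x, units = "days"))
format(day_start, "%Y-%m-%d %H:%M:%S")
## [1] "2013-11-23 00:00:00"

# If a plain Date is enough, name the timezone explicitly
as.Date(x, tz = "GMT")
## [1] "2013-11-23"
```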

If you want to keep CET but never use CEST you can use a location that doesn't observe daylight savings. According to http://www.timeanddate.com/time/zones/cet your only options are Algeria and Tunisia. According to https://en.wikipedia.org/wiki/List_of_tz_database_time_zones the valid tz would be "Africa/Algiers". Therefore you could do

formatString = "%Y-%m-%d %H:%M:%OS"
x = as.POSIXct(strptime("2013-11-23 23:10:38.000000", formatString), tz="Africa/Algiers")
y = as.POSIXct(strptime("2015-07-17 01:43:38.000000", formatString), tz="Africa/Algiers")

and both x and y would be in CET.

One more thing about setting timezones. If you tell R you want a generic timezone such as "CET", it still applies that zone's daylight saving rules. That's why setting attr(y, "tzone") <- "CET" didn't have the desired result; attr(y, "tzone") <- "Africa/Algiers" would have worked as you expected. Do be careful with conversions, though: changing the tzone attribute keeps the underlying instant and changes the displayed clock time to account for the new timezone. The package lubridate has the function force_tz, which does the opposite: it keeps the clock time and changes the timezone, for cases where the initial timezone setting was wrong but the clock time was right.
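The difference between relabelling the zone and reinterpreting the clock time can be sketched in base R. The last step is only a rough stand-in for what lubridate::force_tz does (an assumption for illustration, not lubridate's implementation).

```r
y <- as.POSIXct("2015-07-17 01:43:38", tz = "GMT")

# Changing the attribute: same instant, different displayed clock time
z <- y
attr(z, "tzone") <- "Africa/Algiers"
as.numeric(z) == as.numeric(y)
## [1] TRUE
format(z, "%H:%M:%S %Z")
## [1] "02:43:38 CET"

# Keeping the clock time and changing the zone: a different instant
w <- as.POSIXct(format(y, "%Y-%m-%d %H:%M:%S"), tz = "Africa/Algiers")
format(w, "%H:%M:%S %Z")
## [1] "01:43:38 CET"
as.numeric(difftime(y, w, units = "hours"))
## [1] 1
```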

Converting character to POSIXct in R loses time zone

As we all know, time is a relative thing. Storing time as UTC/GMT or relative to UTC/GMT will make sure that daylight savings etc only come into play when you want them to, as per: Does UTC observe daylight saving time?

So, if:

x <- c("1956-05-25 14:30:00 CST","1956-06-05 16:30:00 CST", "1956-07-04 15:30:00 CST",
"1956-07-08 08:00:00 CST", "1956-08-19 12:00:00 CST","1956-12-23 00:50:00 CST")

You can find out that CST is 6 hours behind UTC/GMT (as opposed to CDT, which is daylight saving time and is 5 hours behind).

Therefore:

out <- as.POSIXct(x, tz = "Etc/GMT+6")

will represent CST without any daylight savings shift to CDT.
That way, if you later convert to the local Central timezone, the proper time is returned without changing the actual data for daylight saving. (That is, when R prints CDT, it is only shifting the displayed time forward an hour; the underlying numerical data is not changed. The last case displays as expected when standard time kicks back in.):

attr(out,"tzone") <- "America/Chicago"
out
#[1] "1956-05-25 15:30:00 CDT" "1956-06-05 17:30:00 CDT" "1956-07-04 16:30:00 CDT"
#[4] "1956-07-08 09:00:00 CDT" "1956-08-19 13:00:00 CDT" "1956-12-23 00:50:00 CST"

That is, for case 1, 15:30 CDT == 14:30 CST, as originally specified; and once daylight saving stops, for case 6, 00:50 CST == 00:50 CST, as originally specified.

Comparing this final out to the other answer, you can see there is an actual numerical time difference of one hour for all the daylight savings cases:

out - strptime(x, format="%Y-%m-%d %H:%M:%S", tz="America/Chicago")
#Time differences in secs
#[1] 3600 3600 3600 3600 3600 0
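One detail that is easy to trip over is the sign in the "Etc/GMT+6" name: these POSIX-style names are inverted, so "Etc/GMT+6" means 6 hours behind UTC, which matches CST. A short check using one of the timestamps above:

```r
p <- as.POSIXct("1956-07-04 15:30:00", tz = "Etc/GMT+6")

# 15:30 at UTC-6 is 21:30 UTC
format(p, "%H:%M", tz = "UTC")
## [1] "21:30"

# The same instant shown on a Chicago clock during daylight saving time
format(p, "%H:%M", tz = "America/Chicago")
## [1] "16:30"
```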

NA result in converting string to POSIXct date time in R

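This might be related to daylight saving time in your local timezone, since as.POSIXct uses the local timezone by default: a clock time that is skipped when the clocks go forward does not exist there and can parse as NA. Try using UTC as the timezone.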

as.POSIXct("20210328 02:00:00", format = "%Y%m%d %H:%M:%S", tz = 'UTC')
#[1] "2021-03-28 02:00:00 UTC"

You can also use lubridate::ymd_hms which uses UTC timezone by default.

lubridate::ymd_hms("20210328 02:00:00")

R: strptime() and is.na() unexpected results

The problem is likely that all the times that return NA do not exist in whatever timezone you're using, due to daylight saving time.

Check with the data source to determine the timezone the data were recorded in, then set the tz argument to that value in your call to strptime.
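A hedged sketch of how to check this: parse the same string in a zone where that clock time is skipped and in UTC, then compare. "Europe/Berlin" is an assumed example zone in which 02:00 to 02:59 on 2021-03-28 does not exist; whether the first result is NA can depend on the platform and R version, so no output is shown for it.

```r
stamp <- "20210328 02:30:00"
fmt <- "%Y%m%d %H:%M:%S"

# Clock time inside the spring-forward gap for this zone
skipped <- as.POSIXct(strptime(stamp, fmt, tz = "Europe/Berlin"))
is.na(skipped)

# UTC has no daylight saving time, so every clock time exists
utc <- as.POSIXct(strptime(stamp, fmt, tz = "UTC"))
is.na(utc)
## [1] FALSE
```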


