as.Date(as.POSIXct()) Gives the Wrong Date

as.Date(as.POSIXct()) gives the wrong date?

The safe way to do this is to pass the date-time value through format(). This adds a step, but as.Date() will accept the character result if its fields are separated by "-" or "/":

as.Date( format( as.POSIXct('2019-03-11 23:59:59'), "%Y-%m-%d") )
[1] "2019-03-11"

as.Date( as.POSIXct('2019-03-11 23:59:59') ) # I'm in a locale where the problem might exist
[1] "2019-03-12"

The documentation for time zones is confusing to me too. In some cases (including this one, as it turned out) "EST" may be ambiguous and may actually refer to a time zone in Australia. Try "EST5EDT" or "America/New_York" if you happen to be in North America.

In this case it could also relate to differences in how your unstated OS handles the 'tz' argument, since I get "2012-08-06". (I'm in the US PDT time zone at the moment, although I'm not sure that should matter.) Changing which function gets the tz argument may clarify (or not):

> as.Date(as.POSIXct('2012-08-06 19:35:23', tz='EST'))
[1] "2012-08-07"
> as.Date(as.POSIXct('2012-08-06 17:35:23', tz='EST'))
[1] "2012-08-06"

> as.Date(as.POSIXct('2012-08-06 21:35:23'), tz='EST')
[1] "2012-08-06"
> as.Date(as.POSIXct('2012-08-06 22:35:23'), tz='EST')
[1] "2012-08-07"

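If you omit tz from as.Date(), the conversion is done in UTC (the documented default for POSIXct input, see ?as.Date). If you omit it from as.POSIXct(), the string is interpreted in the session's current time zone, which is why the second pair of results depends on where the code is run.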
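
As a minimal sketch of the point above (assuming the "America/New_York" zone is available in your tz database), the same instant can fall on different calendar dates depending on the zone used for the conversion. Passing tz explicitly to both functions removes the dependence on the session's settings:

```r
# Parse a wall-clock time in an explicitly named zone (EDT is UTC-4 in August).
x <- as.POSIXct("2012-08-06 21:35:23", tz = "America/New_York")

# The same instant is already past midnight in UTC ...
d_utc <- as.Date(x, tz = "UTC")

# ... but it is still 6 August in the zone the time was recorded in.
d_local <- as.Date(x, tz = "America/New_York")

# format() uses the zone stored on x, so it agrees with d_local.
d_fmt <- as.Date(format(x, "%Y-%m-%d"))

print(d_utc)    # "2012-08-07"
print(d_local)  # "2012-08-06"
print(d_fmt)    # "2012-08-06"
```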

These are the unambiguous names of the Ozzie TZ's (at least on my Mac):

tzfile <- "/usr/share/zoneinfo/zone.tab"
tzones <- read.delim(tzfile, row.names = NULL, header = FALSE,
col.names = c("country", "coords", "name", "comments"),
as.is = TRUE, fill = TRUE, comment.char = "#")
grep("^Aus", tzones$name, value=TRUE)
[1] "Australia/Lord_Howe" "Australia/Hobart"
[3] "Australia/Currie" "Australia/Melbourne"
[5] "Australia/Sydney" "Australia/Broken_Hill"
[7] "Australia/Brisbane" "Australia/Lindeman"
[9] "Australia/Adelaide" "Australia/Darwin"
[11] "Australia/Perth" "Australia/Eucla"
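
If reading zone.tab directly is not convenient (its path varies between systems), base R's OlsonNames() returns the zone names known to the R session. A small sketch, not tied to any particular OS; the exact list you get will depend on the installed tz database:

```r
# List the time zone names R knows about and keep the Australian ones.
all_zones <- OlsonNames()
aus_zones <- grep("^Australia/", all_zones, value = TRUE)
print(aus_zones)

# Check a candidate name before passing it as tz=.
sydney_known <- "Australia/Sydney" %in% all_zones
print(sydney_known)
```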

Trouble with date format using the function as.POSIXct in R

You can include the characters in your format string:

d <- "2014-02-05T08:45:01.326Z"
timestamp <- strptime(d, tz = "UTC", "%Y-%m-%dT%H:%M:%OSZ")

Note that here %OS is used instead of %S because you have fractional seconds.
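
To check that the fractional part survived the parse, a short sketch along these lines may help. It converts the strptime() result to POSIXct and reads the fraction back from the underlying number of seconds (the variable names are only illustrative):

```r
d <- "2014-02-05T08:45:01.326Z"

# %OS accepts fractional seconds on input; the literal T and Z are matched as-is.
ts <- as.POSIXct(strptime(d, "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC"))

# Whole-second part of the parsed value.
whole <- format(ts, "%Y-%m-%d %H:%M:%S")

# Fractional part, recovered from the underlying number of seconds.
frac <- as.numeric(ts) %% 1

print(whole)           # "2014-02-05 08:45:01"
print(round(frac, 3))  # 0.326
```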

`as.POSIXct` get error with ` %Y-%m-%d %H:%M:%S ` format

The easy ones first:

  • optional = FALSE is the default: therefore #1 == #2 and #4 == #5
  • #6 needs no explanation: you need the argument origin = as the error states
  • #3 returns different results because of the time zone (the tz= argument). Therefore, it shows 8 hours before.

Now, the problem is #4 and #5 (which are the same as I stated before):

as.POSIXct(dates,"%Y-%m-%d %H:%M:%S",tz="Asia/Shanghai",origin="1970-01-01")
#> [1] NA NA NA NA NA NA NA NA

To understand how this works you need to look at the function as.POSIXct, which, when called with a numeric x (like in this case), calls the method: as.POSIXct.numeric.

as.POSIXct.numeric

#> function (x, tz = "", origin, ...)
#> {
#> if (missing(origin)) {
#> if (!length(x))
#> return(.POSIXct(numeric(), tz))
#> if (!any(is.finite(x)))
#> return(.POSIXct(x, tz))
#> stop("'origin' must be supplied")
#> }
#> .POSIXct(as.POSIXct(origin, tz = "GMT", ...) + x, tz)
#> }
#> <bytecode: 0x55df7f23b390>
#> <environment: namespace:base>

Focus on this line:

#> .POSIXct(as.POSIXct(origin, tz = "GMT", ...) + x, tz)

In particular:

as.POSIXct(origin, tz = "GMT", ...) + x

As you can see, the function converts origin to a date-time and then adds the numeric input you supplied. Every additional argument you provide falls into the ... argument.

The function tries to convert 1970-01-01 to datetime using the format you provided: %Y-%m-%d %H:%M:%S.
Since the origin 1970-01-01 has format %Y-%m-%d, the function can't convert the origin from string to POSIX, thus returning NA. (That's where NAs are generated!)

When you convert a numeric to POSIXct, the format you pass as an argument doesn't apply to the output (which is always POSIXct) nor to the input, but rather to the origin. Thus, origin and format need to match.

To solve your problem, you need to use origin with the format %Y-%m-%d %H:%M:%S.
Like this:

as.POSIXct(dates,"%Y-%m-%d %H:%M:%S",tz="Asia/Shanghai",origin="1970-01-01 00:00:00")
#> [1] "2021-07-19 01:38:57 CST" "2021-07-19 01:38:58 CST" "2021-07-19 01:38:59 CST" "2021-07-19 01:39:00 CST"
#> [5] "2021-07-19 01:39:01 CST" "2021-07-19 01:39:02 CST" "2021-07-19 01:39:03 CST" "2021-07-19 01:39:04 CST"

Or you need to use this format: %Y-%m-%d
Like this:

as.POSIXct(dates,"%Y-%m-%d",tz="Asia/Shanghai",origin="1970-01-01")
#> [1] "2021-07-19 01:38:57 CST" "2021-07-19 01:38:58 CST" "2021-07-19 01:38:59 CST" "2021-07-19 01:39:00 CST"
#> [5] "2021-07-19 01:39:01 CST" "2021-07-19 01:39:02 CST" "2021-07-19 01:39:03 CST" "2021-07-19 01:39:04 CST"

The results are then equal to #1 and #2.
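
Since the format is only ever applied to origin, a simpler route is to leave it out altogether. A minimal sketch, where the values in dates are hypothetical (seconds since the epoch, chosen to match the output shown above):

```r
# Hypothetical input: seconds since 1970-01-01 00:00:00 UTC.
dates <- 1626629937 + 0:3

# No format argument: it would only be applied to 'origin', not to 'dates'.
out <- as.POSIXct(dates, origin = "1970-01-01", tz = "Asia/Shanghai")

stamps <- format(out, "%Y-%m-%d %H:%M:%S")
print(stamps)
# "2021-07-19 01:38:57" "2021-07-19 01:38:58" "2021-07-19 01:38:59" "2021-07-19 01:39:00"
```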

strptime, as.POSIXct and as.Date return unexpected NA

I think it is exactly as you guessed: strptime fails to parse your date-time string because of your locale. Your string contains both an abbreviated weekday name (%a) and an abbreviated month name (%b). These conversion specifications are described in ?strptime:

Details

%a: Abbreviated weekday name in the current locale on this
platform

%b: Abbreviated month name in the current locale on this platform.

"Note that abbreviated names are platform-specific (although the
standards specify that in the C locale they must be the first three
letters of the capitalized English name:"

"Knowing what the abbreviations are is essential if you wish to use
%a, %b or %h as part of an input format: see the examples for
how to check."

See also

[...] locales to query or set a locale.

The issue of locales is relevant also for as.POSIXct, as.POSIXlt and as.Date.

From ?as.POSIXct:

Details

If format is specified, remember that some of the format
specifications are locale-specific, and you may need to set the
LC_TIME category appropriately via Sys.setlocale. This most often
affects the use of %b, %B (month names) and %p (AM/PM).

From ?as.Date:

Details

Locale-specific conversions to and from character strings are used
where appropriate and available. This affects the names of the days
and months.


Thus, if weekdays and month names in the string differ from those in the current locale, strptime, as.POSIXct and as.Date fail to parse the string correctly and NA is returned.

However, you may solve this issue by changing the locales:

# First save your current locale
loc <- Sys.getlocale("LC_TIME")

# Set correct locale for the strings to be parsed
# (in this particular case: English)
# so that weekdays (e.g "Thu") and abbreviated month (e.g "Nov") are recognized
Sys.setlocale("LC_TIME", "en_GB.UTF-8")
# or
Sys.setlocale("LC_TIME", "C")

# Then proceed as you intended
x <- "Thu Nov 8 15:41:45 2012"
strptime(x, "%a %b %d %H:%M:%S %Y")
# [1] "2012-11-08 15:41:45"

# Then set back to your old locale
Sys.setlocale("LC_TIME", loc)

With my personal locale I can reproduce your error:

Sys.setlocale("LC_TIME", loc)
# [1] "fr_FR.UTF-8"

strptime(x, "%a %b %d %H:%M:%S %Y")
# [1] NA
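
The save, set, and restore steps can also be wrapped in a small function so the old locale is put back even if parsing fails. This is only a sketch: parse_in_c_locale is a hypothetical helper name, and it assumes the "C" locale is acceptable for the strings being parsed (English names):

```r
# Hypothetical helper: parse with strptime() under the "C" locale,
# then restore whatever LC_TIME setting was active before the call.
parse_in_c_locale <- function(x, format, tz = "UTC") {
  old <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  Sys.setlocale("LC_TIME", "C")
  strptime(x, format, tz = tz)
}

parsed <- parse_in_c_locale("Thu Nov 8 15:41:45 2012", "%a %b %d %H:%M:%S %Y")
print(format(parsed, "%Y-%m-%d %H:%M:%S"))  # "2012-11-08 15:41:45"
```

Because the restore happens in on.exit(), the rest of the session keeps its original LC_TIME setting.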

Unexpected date when converting POSIXct date-time to Date - timezone issue?

Using the setup in the Note at the end we can use any of these:

# same date as print(x) shows
as.Date(as.character(x))
## [1] "2020-03-24"

# use the time zone stored in x (or system time zone if that is "")
as.Date(x, tz = attr(x, "tzone"))
## [1] "2020-03-24"

# use system time zone
as.Date(x, tz = "")
## [1] "2020-03-24"

# use system time zone
as.Date(x, tz = Sys.timezone())
## [1] "2020-03-24"

# use indicated time zone
as.Date(x, tz = "Asia/Calcutta")
## [1] "2020-03-24"

Note

We have assumed this setup.

Sys.setenv(TZ = "Asia/Calcutta")
x <- structure(1584988320, class = c("POSIXct", "POSIXt"), tzone = "")

R.version.string
## [1] "R version 4.0.2 Patched (2020-06-24 r78745)"
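
For contrast, here is a sketch of why the question arises at all: the instant in x falls on different calendar dates in UTC and in India, so the time zone used for the conversion decides the result. "Asia/Kolkata" is the current name of the zone the Note calls "Asia/Calcutta".

```r
# The instant from the Note, with no time zone attached.
x <- structure(1584988320, class = c("POSIXct", "POSIXt"), tzone = "")

# In UTC this instant is 2020-03-23 18:32:00 ...
d_utc <- as.Date(x, tz = "UTC")

# ... while in India (UTC+5:30) it is 2020-03-24 00:02:00.
d_ist <- as.Date(x, tz = "Asia/Kolkata")

print(d_utc)  # "2020-03-23"
print(d_ist)  # "2020-03-24"
```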

`as.Date()` returns `NA` instead of Date object for string 2012-01

Dates need to have a year, month, and day component. Assuming you were OK with the first of the month representing a given month, you could use:

x <- "2012-01"
d <- paste0(x, "-01")
as.Date(d, format="%Y-%m-%d")
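
The same idea works on a whole vector, and format() gets you back to the year-month label when needed. A brief sketch with made-up input values:

```r
# Hypothetical year-month strings.
x <- c("2012-01", "2012-12", "2013-02")

# Append a day so that every element is a complete date.
d <- as.Date(paste0(x, "-01"), format = "%Y-%m-%d")
print(d)  # "2012-01-01" "2012-12-01" "2013-02-01"

# Drop the day again for display or grouping.
ym <- format(d, "%Y-%m")
print(identical(ym, x))  # TRUE
```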

can't use mutate with a DATE

The error is telling you the reason:
if_else() requires the TRUE and FALSE results to have the same type, but here one branch supplies a date and the other does not. That's why it works when both branches are character.

First idea that worked for me (if you don't find it useful, please let me know):

base_contactos_3 %>%
  mutate(positivo = if_else(hpv_post_res == "POS", as.character(hpv_post), as.character(0)))

This gives the following result.
Because the TRUE and FALSE choices mix a date and a character value, the resulting column can hold only one type. (The time component is always 00:00:00, so only the date part appears.)

# A tibble: 6 x 4
  id_mujer hpv_post            hpv_post_res positivo
     <dbl> <dttm>              <chr>        <chr>
1     8528 2012-06-12 00:00:00 NEG          0
2     8528 2016-03-17 00:00:00 NEG          0
3    11711 2015-09-30 00:00:00 POS          2015-09-30
4    11711 2015-09-30 00:00:00 POS          2015-09-30
5    11818 2012-12-07 00:00:00 NEG          0
6    11818 2018-05-04 00:00:00 NEG          0
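
If the aim is to keep positivo as a real date rather than text, one option is to use a missing date for the non-matching rows instead of 0. The following is a base R sketch, not the approach above: it swaps if_else() for plain subsetting, and the data frame is made up to mirror the column names in the tibble.

```r
# Hypothetical data with the same column names as the tibble above.
base_contactos_3 <- data.frame(
  id_mujer     = c(8528, 11711, 11818),
  hpv_post     = as.POSIXct(c("2012-06-12", "2015-09-30", "2012-12-07"), tz = "UTC"),
  hpv_post_res = c("NEG", "POS", "NEG")
)

# Start from the date of every test, then blank out the non-positive rows.
positivo <- as.Date(base_contactos_3$hpv_post, tz = "UTC")
positivo[base_contactos_3$hpv_post_res != "POS"] <- NA
base_contactos_3$positivo <- positivo

print(base_contactos_3)
```

The column stays of class Date, so later date arithmetic or filtering works without converting back from character.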

