as.Date(as.POSIXct()) gives the wrong date?
The safe way to do this is to pass the date value through format. This does create an additional step, but as.Date will accept the character result if it is formatted with a "-" or "/":
as.Date( format( as.POSIXct('2019-03-11 23:59:59'), "%Y-%m-%d") )
[1] "2019-03-11"
as.Date( as.POSIXct('2019-03-11 23:59:59') ) # I'm in a locale where the problem might exist
[1] "2019-03-12"
The documentation for time zones is confusing to me too. In some cases (including this one, as it turned out) EST may be ambiguous and may actually refer to a time zone in Australia. Try "EST5EDT" or "America/New_York" if you happen to be in North America.
In this case it could also relate to differences in how your unstated OS handles the 'tz' argument, since I get "2012-08-06". (I'm in the US PDT time zone at the moment, although I'm not sure that should matter.) Changing which function gets the tz argument may clarify (or not):
> as.Date(as.POSIXct('2012-08-06 19:35:23', tz='EST'))
[1] "2012-08-07"
> as.Date(as.POSIXct('2012-08-06 17:35:23', tz='EST'))
[1] "2012-08-06"
> as.Date(as.POSIXct('2012-08-06 21:35:23'), tz='EST')
[1] "2012-08-06"
> as.Date(as.POSIXct('2012-08-06 22:35:23'), tz='EST')
[1] "2012-08-07"
If you omit tz from as.Date, UTC is assumed for the conversion. If you omit it from as.POSIXct, the string is interpreted in your current time zone.
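A minimal sketch of the effect, assuming your system's time zone database knows "America/New_York" (the instant below is chosen for illustration and is not taken from the question). The tz argument is passed explicitly everywhere so the result does not depend on defaults or on the machine's own time zone:

```r
# One instant, stored with an explicit New York time zone
x <- as.POSIXct("2012-08-06 21:35:23", tz = "America/New_York")

# Converting in UTC: 21:35 EDT is already 01:35 on the next day in UTC
d_utc <- as.Date(x, tz = "UTC")

# Converting in the zone the value was created in keeps the local date
d_local <- as.Date(x, tz = "America/New_York")

# format() uses the time zone stored in x, so it also gives the local date
d_fmt <- as.Date(format(x, "%Y-%m-%d"))

print(d_utc)    # [1] "2012-08-07"
print(d_local)  # [1] "2012-08-06"
print(d_fmt)    # [1] "2012-08-06"
```

Passing tz to both functions is more typing, but it makes the intended calendar date explicit.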
These are the unambiguous names of the Ozzie TZ's (at least on my Mac):
tzfile <- "/usr/share/zoneinfo/zone.tab"
tzones <- read.delim(tzfile, row.names = NULL, header = FALSE,
col.names = c("country", "coords", "name", "comments"),
as.is = TRUE, fill = TRUE, comment.char = "#")
grep("^Aus", tzones$name, value=TRUE)
[1] "Australia/Lord_Howe" "Australia/Hobart"
[3] "Australia/Currie" "Australia/Melbourne"
[5] "Australia/Sydney" "Australia/Broken_Hill"
[7] "Australia/Brisbane" "Australia/Lindeman"
[9] "Australia/Adelaide" "Australia/Darwin"
[11] "Australia/Perth" "Australia/Eucla"
Trouble with date format using the function as.POSIXct in R
You can include the characters in your format string:
d <- "2014-02-05T08:45:01.326Z"
timestamp <- strptime(d, tz = "UTC", "%Y-%m-%dT%H:%M:%OSZ")
Note that here %OS is used instead of %S because you have fractional seconds.
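To see why the choice matters, here is a small sketch comparing the two specifiers on the same string. The variable names ok and bad are only for illustration:

```r
d <- "2014-02-05T08:45:01.326Z"

# %OS accepts fractional seconds on input
ok <- strptime(d, "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")

# %S reads whole seconds only, so the literal "Z" in the format
# is then compared against ".326Z" and the parse fails
bad <- strptime(d, "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

is.na(ok)   # FALSE
is.na(bad)  # TRUE
ok$sec      # seconds including the fractional part (about 1.326)
```

Printing the result hides the fraction by default. It is still stored in the sec component.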
`as.POSIXct` get error with ` %Y-%m-%d %H:%M:%S ` format
The easy ones first:
- optional = FALSE is the default: therefore #1 == #2 and #4 == #5
- #6 needs no explanation: you need the argument origin = as the error states
- #3 returns different results because of the time zone (the tz= argument). Therefore, it shows 8 hours before.
Now, the problem is #4 and #5 (which are the same as I stated before):
as.POSIXct(dates,"%Y-%m-%d %H:%M:%S",tz="Asia/Shanghai",origin="1970-01-01")
#> [1] NA NA NA NA NA NA NA NA
To understand how this works you need to look at the function as.POSIXct, which, when called with a numeric x (like in this case), calls the method as.POSIXct.numeric.
as.POSIXct.numeric
#> function (x, tz = "", origin, ...)
#> {
#> if (missing(origin)) {
#> if (!length(x))
#> return(.POSIXct(numeric(), tz))
#> if (!any(is.finite(x)))
#> return(.POSIXct(x, tz))
#> stop("'origin' must be supplied")
#> }
#> .POSIXct(as.POSIXct(origin, tz = "GMT", ...) + x, tz)
#> }
#> <bytecode: 0x55df7f23b390>
#> <environment: namespace:base>
Focus on this line:
#> .POSIXct(as.POSIXct(origin, tz = "GMT", ...) + x, tz)
In particular:
as.POSIXct(origin, tz = "GMT", ...) + x
As you see, the function converts origin to a date-time and then adds the numeric input you supplied. Every additional argument you provided falls into the ... argument.
The function tries to convert 1970-01-01 to a date-time using the format you provided: %Y-%m-%d %H:%M:%S.
Since the origin 1970-01-01 has format %Y-%m-%d, the function can't convert the origin from string to POSIX, thus returning NA. (That's where the NAs are generated!)
When you convert a numeric to POSIX, the format you pass as an argument doesn't apply to the output (since it will always be a POSIX object) nor to the input, but rather to the origin. Thus, origin and format need to match.
To solve your problem, you need to use an origin with the format %Y-%m-%d %H:%M:%S.
Like this:
as.POSIXct(dates,"%Y-%m-%d %H:%M:%S",tz="Asia/Shanghai",origin="1970-01-01 00:00:00")
#> [1] "2021-07-19 01:38:57 CST" "2021-07-19 01:38:58 CST" "2021-07-19 01:38:59 CST" "2021-07-19 01:39:00 CST"
#> [5] "2021-07-19 01:39:01 CST" "2021-07-19 01:39:02 CST" "2021-07-19 01:39:03 CST" "2021-07-19 01:39:04 CST"
Or you need to use this format: %Y-%m-%d
Like this:
as.POSIXct(dates,"%Y-%m-%d",tz="Asia/Shanghai",origin="1970-01-01")
#> [1] "2021-07-19 01:38:57 CST" "2021-07-19 01:38:58 CST" "2021-07-19 01:38:59 CST" "2021-07-19 01:39:00 CST"
#> [5] "2021-07-19 01:39:01 CST" "2021-07-19 01:39:02 CST" "2021-07-19 01:39:03 CST" "2021-07-19 01:39:04 CST"
The results are then equal to #1 and #2.
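Since the format only ever applies to the origin, a simpler variant is to leave it out altogether. The sketch below assumes hypothetical epoch values (the question's actual dates vector is not shown here); they were picked to match the first two timestamps in the output above:

```r
# Hypothetical epoch seconds (not from the question); they correspond to
# the first two timestamps printed above
dates <- c(1626629937, 1626629938)

# No format argument: origin is parsed with the default formats,
# and tz only controls how the result is displayed
res <- as.POSIXct(dates, origin = "1970-01-01", tz = "Asia/Shanghai")

format(res, "%Y-%m-%d %H:%M:%S")
# [1] "2021-07-19 01:38:57" "2021-07-19 01:38:58"
```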
strptime, as.POSIXct and as.Date return unexpected NA
I think it is exactly as you guessed: strptime fails to parse your date-time string because of your locale. Your string contains both an abbreviated weekday (%a) and an abbreviated month name (%b). These time specifications are described in ?strptime:
Details
%a: Abbreviated weekday name in the current locale on this platform.
%b: Abbreviated month name in the current locale on this platform.
"Note that abbreviated names are platform-specific (although the standards specify that in the C locale they must be the first three letters of the capitalized English name:"
"Knowing what the abbreviations are is essential if you wish to use %a, %b or %h as part of an input format: see the examples for how to check."
See also
[...] locales to query or set a locale.
The issue of locales is also relevant for as.POSIXct, as.POSIXlt and as.Date.
From ?as.POSIXct:
Details
If format is specified, remember that some of the format specifications are locale-specific, and you may need to set the LC_TIME category appropriately via Sys.setlocale. This most often affects the use of %b, %B (month names) and %p (AM/PM).
From ?as.Date:
Details
Locale-specific conversions to and from character strings are used
where appropriate and available. This affects the names of the days
and months.
Thus, if the weekday and month names in the string differ from those in the current locale, strptime, as.POSIXct and as.Date fail to parse the string correctly and NA is returned.
However, you can solve this issue by changing the locale:
# First save your current locale
loc <- Sys.getlocale("LC_TIME")
# Set correct locale for the strings to be parsed
# (in this particular case: English)
# so that weekdays (e.g "Thu") and abbreviated month (e.g "Nov") are recognized
Sys.setlocale("LC_TIME", "en_GB.UTF-8")
# or
Sys.setlocale("LC_TIME", "C")
#Then proceed as you intended
x <- "Thu Nov 8 15:41:45 2012"
strptime(x, "%a %b %d %H:%M:%S %Y")
# [1] "2012-11-08 15:41:45"
# Then set back to your old locale
Sys.setlocale("LC_TIME", loc)
With my personal locale I can reproduce your error:
Sys.setlocale("LC_TIME", loc)
# [1] "fr_FR.UTF-8"
strptime(var,"%a %b %d %H:%M:%S %Y")
# [1] NA
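If you parse such strings in more than one place, the save/set/restore steps can be wrapped in a function. This is only a sketch: parse_english_datetime is a hypothetical helper name, not part of base R, and it assumes the "C" locale (which uses English names) suits your input:

```r
# Hypothetical helper (not part of base R): parse English date-time strings
# in the "C" locale and restore the previous LC_TIME setting afterwards
parse_english_datetime <- function(x, format, tz = "UTC") {
  old <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)  # runs even on error
  Sys.setlocale("LC_TIME", "C")
  as.POSIXct(strptime(x, format, tz = tz))
}

x <- "Thu Nov 8 15:41:45 2012"
parsed <- parse_english_datetime(x, "%a %b %d %H:%M:%S %Y")
format(parsed, "%Y-%m-%d %H:%M:%S")
# [1] "2012-11-08 15:41:45"
```

Using on.exit means the original locale is restored even if parsing stops with an error.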
Unexpected date when converting POSIXct date-time to Date - timezone issue?
Using the setup in the Note at the end we can use any of these:
# same date as print(x) shows
as.Date(as.character(x))
## [1] "2020-03-24"
# use the time zone stored in x (or system time zone if that is "")
as.Date(x, tz = attr(x, "tzone"))
## [1] "2020-03-24"
# use system time zone
as.Date(x, tz = "")
## [1] "2020-03-24"
# use system time zone
as.Date(x, tz = Sys.timezone())
## [1] "2020-03-24"
# use indicated time zone
as.Date(x, tz = "Asia/Calcutta")
## [1] "2020-03-24"
Note
We have assumed this setup.
Sys.setenv(TZ = "Asia/Calcutta")
x <- structure(1584988320, class = c("POSIXct", "POSIXt"), tzone = "")
R.version.string
## [1] "R version 4.0.2 Patched (2020-06-24 r78745)"
`as.Date()` returns `NA` instead of Date object for string 2012-01
Dates need to have a year, month, and day component. Assuming you were OK with the first of the month representing a given month, you could use:
x <- "2012-01"
d <- paste0(x, "-01")
as.Date(d, format="%Y-%m-%d")
can't use mutate with a DATE
The error is telling you the reason:
The two branches of if_else() have different types: one supplies a date and the other does not, and if_else() requires both to be the same type. That's why it works if you give it a character in both branches.
First idea that worked for me (if you don't find it useful, please let me know):
base_contactos_3%>%
mutate(positivo= if_else(hpv_post_res=="POS", as.character(hpv_post), as.character(0)))
This gives the following result.
A column can hold only one type, so a date and a character value cannot be mixed as the TRUE/FALSE choices; here both are converted to character. (The time part is midnight in every row, so only the dates appear.)
# A tibble: 6 x 4
id_mujer hpv_post hpv_post_res positivo
<dbl> <dttm> <chr> <chr>
1 8528 2012-06-12 00:00:00 NEG 0
2 8528 2016-03-17 00:00:00 NEG 0
3 11711 2015-09-30 00:00:00 POS 2015-09-30
4 11711 2015-09-30 00:00:00 POS 2015-09-30
5 11818 2012-12-07 00:00:00 NEG 0
6 11818 2018-05-04 00:00:00 NEG 0