Accurately converting from character->POSIXct->character with sub-millisecond datetimes
Two things:
1) @statquant is right, as is @Aaron in his comment (and the otherwise known experts @Joshua Ulrich and @Dirk Eddelbuettel are wrong), but that will not be important for the main question here: POSIXlt by design is definitely more accurate in storing times than POSIXct. As its seconds are always in [0, 60), it has a granularity of about 6e-15 seconds, i.e., 6 femtoseconds, which is dozens of millions of times finer than POSIXct.
However, this is not very relevant here (and for current R): almost all operations, notably numeric ones, use the Ops group method (yes, not known to beginners, but well documented). Just look at Ops.POSIXt, which indeed trashes the extra precision by first coercing to POSIXct. In addition, format()/print()ing uses at most 6 decimals after the ".", and hence also does not distinguish between the internally higher precision of POSIXlt and the "only" 100 nanosecond granularity of POSIXct.
(For the above reason, both Dirk and Joshua were led to their wrong assertion: for all simple practical uses, the precision of *lt and *ct is made the same.)
2) I do tend to agree that we (R Core) should improve the format()ing and hence print()ing of such fractions of seconds in POSIXt objects (even after the bug fix mentioned by @Aaron above).
But then I may be wrong, and "we" have got it right, by some definition of "right" ;-)
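For anyone who wants to check the granularity figures in point 1, here is a minimal sketch. The helper name ulp and the two sample values are assumptions made for illustration (nothing here is part of base R's date-time code); it only computes the spacing between neighbouring doubles at a given magnitude.

```r
# Spacing between adjacent doubles near x (53-bit significand).
# 'ulp' is a hypothetical helper name, not a base R function.
ulp <- function(x) 2^(floor(log2(abs(x))) - 52)

ct_secs <- 1.7e9  # assumed: seconds since the epoch for a date in the 2020s
lt_secs <- 59.5   # assumed: a POSIXlt 'sec' field near the top of [0, 60)

ulp(ct_secs)                 # 2^-22, roughly 2.4e-07 seconds
ulp(lt_secs)                 # 2^-47, roughly 7.1e-15 seconds
ulp(ct_secs) / ulp(lt_secs)  # 2^25, roughly 3.4e+07
```

So the seconds field of a POSIXlt can resolve steps tens of millions of times smaller than a POSIXct value of current magnitude, which is the order of magnitude stated in point 1.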
From character to date in R
You do not need to eliminate the +0000: it is the signed offset from UTC in hours and minutes, and sometimes (not in this case, where it is 0000) it is very useful for converting the data properly.
Here is a working example with strptime:
x <- "Fri Sep 18 17:01:33 +0000 2015"
strptime(x, "%a %b %d %H:%M:%S %z %Y", tz = "UTC")
[1] "2015-09-18 17:01:33 UTC"
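To illustrate why keeping the offset matters, here is a small sketch with a non-zero offset. The input string and its numeric-only layout are assumptions made for this example (chosen so the result does not depend on the locale's day and month names); it is not the data from the question.

```r
# Hypothetical input: same wall-clock time, but stamped two hours ahead of UTC.
x2 <- "2015-09-18 17:01:33 +0200"

# %z consumes the offset, so the result is shifted to the requested tz.
p <- as.POSIXct(strptime(x2, "%Y-%m-%d %H:%M:%S %z", tz = "UTC"))
format(p, "%Y-%m-%d %H:%M:%S", tz = "UTC")
# [1] "2015-09-18 15:01:33"
```

Stripping the offset before parsing would silently treat 17:01:33 as UTC and be two hours off.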
Unexpected behavior with POSIXct datetimes under diff
There is a units<- function for difftime objects:
> units(del) <- 'hours'
> table(del)
del
0 1
1 46
The ?difftime help page says:
If units = "auto", a suitable set of units is chosen, the largest possible (excluding "weeks") in which all the absolute differences are greater than one.
So perhaps the logic of the function got sidetracked by the 0 value in your case and the units got set to seconds.
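A minimal sketch of that effect, using made-up timestamps (the vector times below is an assumption, not the data from the question): a single zero difference is enough for the automatic choice to fall back to seconds, and units<- converts afterwards.

```r
# Hypothetical timestamps: one duplicate, then hourly steps.
times <- as.POSIXct("2015-01-01 00:00:00", tz = "UTC") + c(0, 0, 3600, 7200)

del <- diff(times)
auto_units <- units(del)  # "secs", because the smallest difference is 0
auto_units

units(del) <- "hours"     # rescales the stored values as well as the label
as.numeric(del)           # 0 1 1
```

Setting the units explicitly like this avoids depending on whatever the automatic choice happens to pick for a given data set.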
Odd behavior with POSIXct/POSIXlt and subsecond accuracy
@GSee is right, this is a floating point arithmetic problem. And Gavin Simpson's answer is correct in that it's how the object is printed.
R> options(digits=17)
R> .index(x)
[1] 1295589600.0009999 1295589600.0020001 1295589600.0030000 1295589600.0039999
[5] 1295589600.0050001 1295589600.0060000 1295589600.0070000 1295589600.0079999
[9] 1295589600.0090001 1295589600.0100000
All the precision is there, but these lines in format.POSIXlt cause options(digits.secs=6) to not be honored.
np <- getOption("digits.secs")
if (is.null(np))
    np <- 0L
else
    np <- min(6L, np)
if (np >= 1L) {
    for (i in seq_len(np) - 1L) {
        if (all(abs(secs - round(secs, i)) < 1e-06)) {
            np <- i
            break
        }
    }
}
Due to precision issues, in your example np is reset to 3 in the above for loop, and the format "%Y-%m-%d %H:%M:%OS3" yields the times you posted. You can see the times are accurate if you use the "%Y-%m-%d %H:%M:%OS6" format.
R> format(as.POSIXlt(index(x)[1:2]), "%Y-%m-%d %H:%M:%OS3")
[1] "2011-01-21 00:00:00.000" "2011-01-21 00:00:00.002"
R> format(as.POSIXlt(index(x)[1:2]), "%Y-%m-%d %H:%M:%OS6")
[1] "2011-01-21 00:00:00.000999" "2011-01-21 00:00:00.002000"
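To see why np ends up as 3, the loop quoted above can be run in isolation. This is only a sketch: pick_np is a hypothetical wrapper name, the fractional seconds are copied from the .index(x) output shown earlier, and the loop is the one quoted in this answer, which may differ from the code in your version of R.

```r
# Hypothetical wrapper around the digit-selection loop quoted above.
pick_np <- function(secs, np = 6L) {
  np <- min(6L, np)
  if (np >= 1L) {
    for (i in seq_len(np) - 1L) {
      # Stop at the first digit count that reproduces every value to within 1e-06.
      if (all(abs(secs - round(secs, i)) < 1e-06)) {
        np <- i
        break
      }
    }
  }
  np
}

# Fractional parts of the index values printed with options(digits = 17).
secs <- c(0.0009999, 0.0020001, 0.0030000, 0.0039999, 0.0050001,
          0.0060000, 0.0070000, 0.0079999, 0.0090001, 0.0100000)

pick_np(secs)  # 3
```

Every value is within 1e-07 of a multiple of 0.001, so three digits pass the 1e-06 tolerance and the loop stops there; the formatting step then truncates, which is how 0.0009999 becomes ".000".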
In R, is the %OSn time format only valid for formatting, but not parsing?
This is expected behavior, not a bug. "%OSn" is for output. "%OS" is for input, and includes fractional seconds, as it says in your second blockquote:
Further, for strptime %OS will input seconds including fractional seconds.
options(digits.secs=6)
as.POSIXct("2015-06-09 11:24:19.002", "America/New_York", "%Y-%m-%d %H:%M:%OS")
# [1] "2015-06-09 11:24:19.002 EDT"
Also note that "EST" is an ambiguous timezone, and probably not what you expect. See the Time zone names section of ?timezone.
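If you want to confirm that the fraction survived parsing without relying on how the value is printed, you can inspect the numeric value directly. A small sketch (tz = "UTC" is an assumption made here to keep the example unambiguous):

```r
x <- as.POSIXct("2015-06-09 11:24:19.002", tz = "UTC",
                format = "%Y-%m-%d %H:%M:%OS")

# Fractional part of the underlying seconds-since-epoch value.
frac <- as.numeric(x) %% 1
frac

# For display, put the digit count in the output format.
format(x, "%Y-%m-%d %H:%M:%OS6")
```

The fraction will be close to 0.002 but generally not exactly equal to it, because of the floating point representation of the full timestamp.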
Milliseconds puzzle when calling strptime in R
This is related to R-FAQ 7.31, though it takes a different-than-usual guise.
The behavior you are seeing results from a combination of: (a) the inexact representation of (most) decimal values by binary computers; and (b) the documented behavior of strftime and strptime, which is to truncate rather than round the fractional parts of seconds to the specified number of decimal places.
From the ?strptime help file (the key word being 'truncated'):
Specific to R is ‘%OSn’, which for output gives the seconds
truncated to ‘0 <= n <= 6’ decimal places (and if ‘%OS’ is not
followed by a digit, it uses the setting of
‘getOption("digits.secs")’, or if that is unset, ‘n = 3’).
An example will probably illustrate what's going on more effectively than further explanation:
strftime('2011-10-11 07:49:36.3', format="%Y-%m-%d %H:%M:%OS6")
[1] "2011-10-11 07:49:36.299999"
strptime('2012-01-16 12:00:00.3', format="%Y-%m-%d %H:%M:%OS1")
[1] "2012-01-16 12:00:00.2"
In the example above, the fractional '.3' must be approximated by a binary number that is slightly less than '0.300000000000000000', something like '0.29999999999999999'. Because strptime and strftime truncate rather than round to the specified decimal place, 0.3 will be converted to 0.2 if the number of decimal places is set to 1. The same logic holds for your example times, of which half exhibit this behavior, as would (on average) be expected.
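The arithmetic can be checked without going through the formatting code at all. This is a sketch using plain numerics; the variable names are illustrative, and floor() and round() stand in here for "truncate" and "round", not for R's internal formatting routine.

```r
base <- as.POSIXct("2012-01-16 12:00:00", tz = "UTC")
x <- base + 0.3

# Fractional part actually stored once 0.3 is added to a ~1.3e9 epoch value.
frac <- as.numeric(x) - as.numeric(base)
sprintf("%.10f", frac)  # slightly below 0.3

floor(frac * 10) / 10   # truncating to one decimal gives 0.2
round(frac, 1)          # rounding to one decimal gives 0.3
```

If you need rounded output, one option is to round the seconds yourself before formatting, rather than relying on "%OSn".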
as.POSIXct/as.POSIXlt doesn't like .61 milliseconds
If you change the format for the time part to be %H%M%OS instead of %H%M%S.%OS, it seems to parse correctly. You may have to adjust your options to see this:
as.POSIXlt(vec, tz = "EST", format = "%Y%m%d.%H%M%OS")
#[1] "2015-01-01 01:01:01 EST" "2015-01-01 01:01:01 EST"
#[3] "2015-01-01 01:01:01 EST"
options(digits.secs = 2)
as.POSIXlt(vec, tz = "EST", format = "%Y%m%d.%H%M%OS")
# [1] "2015-01-01 01:01:01.60 EST" "2015-01-01 01:01:01.61 EST"
# [3] "2015-01-01 01:01:01.62 EST"
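You can also check the parsed fraction without touching options at all, by looking at the sec component of the POSIXlt result. In this sketch vec is a reconstruction assumed from the format string and the output above (the original vector is not shown here), and tz = "UTC" is likewise an assumption to sidestep the ambiguity of "EST".

```r
# Assumed input, reconstructed from the format string and printed output.
vec <- c("20150101.010101.60", "20150101.010101.61", "20150101.010101.62")

lt <- as.POSIXlt(vec, tz = "UTC", format = "%Y%m%d.%H%M%OS")

# The seconds field keeps the fraction regardless of digits.secs.
lt$sec
```

This shows the fractional seconds were read on input, and that the first printout above only hides them at the display stage.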