Difference between as.POSIXct/as.POSIXlt and strptime for converting character vectors to POSIXct/POSIXlt
Well, the functions do different things.
First, there are two internal implementations of date/time: POSIXct, which stores seconds since the UNIX epoch (plus some other data), and POSIXlt, which stores a list of day, month, year, hour, minute, second, etc.
strptime is a function to directly convert character vectors (of a variety of formats) to POSIXlt format.
as.POSIXlt converts a variety of data types to POSIXlt. It tries to be intelligent and do the sensible thing: in the case of character input, it acts as a wrapper to strptime.
as.POSIXct converts a variety of data types to POSIXct. It also tries to be intelligent and do the sensible thing: in the case of character input, it runs strptime first, then does the conversion from POSIXlt to POSIXct.
It makes sense that strptime is faster, because strptime only handles character input whilst the others first try to determine which method to use from the input type. It should also be a bit safer in that being handed unexpected data would just give an error, instead of trying to do the intelligent thing that might not be what you want.
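A minimal sketch of the relationship described above. The timestamps are made up, and an explicit UTC time zone is assumed so the result does not depend on the local machine:

```r
# Hypothetical input; the strings and the format are illustrative only.
s <- c("2015-06-15 12:30:00", "2015-06-16 08:00:00")
fmt <- "%Y-%m-%d %H:%M:%S"

lt <- strptime(s, format = fmt, tz = "UTC")    # always returns POSIXlt
ct <- as.POSIXct(s, format = fmt, tz = "UTC")  # returns POSIXct

class(lt)
class(ct)

# Converting the strptime() result by hand gives the same instants
all(as.POSIXct(lt) == ct)
```

If you need POSIXct in the end (for example for a data.frame column), calling as.POSIXct directly saves the manual conversion step.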
Differences between subsetting POSIXlt and POSIXct in R
You misunderstand a critical difference between POSIXlt and POSIXct:
- POSIXlt is a 'list type' with components you can access as you do
- POSIXct is a 'compact type' that is essentially just a number
You almost always want POSIXct for comparison and efficient storage (e.g. in a data.frame, or to index a zoo or xts object with) and can use POSIXlt to access components. Be warned, though, that the components follow C library standards, so e.g. the year 2015 is stored as 115 (you always need to add 1900), weekdays start at zero, etc.
Doing str() or unclass() on these is illuminating. For historical reasons, strptime() returns a POSIXlt. I wish it would return a POSIXct.
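To see the two representations side by side, here is a small sketch with an arbitrary example date (the variable names x_ct and x_lt are just for illustration):

```r
x_ct <- as.POSIXct("2015-06-15 12:30:00", tz = "UTC")
x_lt <- as.POSIXlt(x_ct)

unclass(x_ct)         # a single number (seconds since the epoch) plus a tzone attribute
names(unclass(x_lt))  # the list components: sec, min, hour, mday, mon, year, ...

x_lt$year + 1900      # years are stored as an offset from 1900
x_lt$mon + 1          # months are stored as 0-11
x_lt$wday             # weekdays are stored as 0-6, with 0 = Sunday
```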
Why do some dates become NA when converted from character to POSIXlt?
I updated my code to specify the GMT time zone, as the data is collected in GMT without any change to or from daylight saving time.
dateValue <- strptime(dateString, format='%m/%d/%y %I:%M:%S %p', tz="GMT")
This ensures that properly formatted date-time values no longer evaluate to TRUE with is.na(), since GMT has no daylight saving transitions in which a clock time could fail to exist.
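A quick way to check this on your own data, sketched here with two made-up strings in the same layout (dateString below is a stand-in, not the original data). Note that %p matching depends on the locale, so this assumes an English or C locale:

```r
# Hypothetical strings in the same layout as the format above
dateString <- c("03/09/14 02:30:00 AM", "11/02/14 01:30:00 PM")
dateValue <- strptime(dateString, format = "%m/%d/%y %I:%M:%S %p", tz = "GMT")

is.na(dateValue)            # check for values that failed to parse
format(dateValue, "%H:%M")  # 24-hour view of the parsed times
```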
Converting datetime from character to POSIXct object
For your real data issue, replace the %m% with %m:
## Reading in the file:
fpath <- "c:/r/data/real_data.txt"
x <- read.csv(fpath, skip = 1, header = FALSE, sep = "", stringsAsFactors = FALSE)
names(x) <- c("date","time","bscat","scat_coef","pressure_mbar","temp_K","CH1","CH2") ## This is data from a Radiance Research Integrating Nephelometer Model M903 for anyone who is interested!
## issue was the %m% - fixed
x$datetime1 <- as.POSIXct(paste(x$date, x$time), format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
## Here too - fixed
x$datetime2 <- strptime(paste(x$date, x$time), format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
head(x)
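If you want to verify the conversion without the original file, the same steps can be sketched on a tiny stand-in data frame (the two rows below are invented for illustration):

```r
# Hypothetical two-row stand-in for the real file
x <- data.frame(date = c("2015-06-15", "2015-06-16"),
                time = c("12:30:00", "08:00:05"),
                stringsAsFactors = FALSE)

x$datetime1 <- as.POSIXct(paste(x$date, x$time),
                          format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# A format that does not match the strings yields NA rather than an error,
# so it is worth checking explicitly
anyNA(x$datetime1)
```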
Determine and set timezone in POSIXct, POSIXlt, strptime, etc. in R
If you do not specify a time zone, POSIXct and POSIXlt will refer to your local time zone. However, this is not entirely reliable. POSIXlt will not display the time zone in the output string.
Note that the tz argument is not set in the following calls.
t.ct <- as.POSIXct("2009-01-05 14:19 +1200", format="%Y-%m-%d %H:%M %z")
t.lt <- as.POSIXlt("2009-01-05 14:19 +1200", format="%Y-%m-%d %H:%M %z")
t.ct
t.lt
attr(t.ct,"tzone") #""
attr(t.lt,"tzone") #NULL
If you want to avoid ambiguous behaviour, you have to specify a time zone. The output string will still be different (by default POSIXlt shows no time zone), but the attribute is the same:
t.ct <- as.POSIXct("2009-01-05 14:19 +1200", format="%Y-%m-%d %H:%M %z", tz="Europe/Helsinki")
t.lt <- as.POSIXlt("2009-01-05 14:19 +1200", format="%Y-%m-%d %H:%M %z", tz="Europe/Helsinki")
t.ct
t.lt
attr(t.ct,"tzone") #Europe/Helsinki
attr(t.lt,"tzone") #Europe/Helsinki
Now, if you want to change time zones after the original assignment:
attr(t.ct, "tzone") <- "UTC" #this will SHIFT the displayed time to UTC (same instant)
attr(t.lt, "tzone") <- "UTC" #this will only REPLACE the time zone label; the clock fields stay as they were
t.ct
t.lt
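The shift-versus-replace difference can be checked directly. This is a sketch with its own variables (ct, lt, lt_utc are illustrative names), using a time without a %z offset to keep it simple:

```r
ct <- as.POSIXct("2009-01-05 04:19:00", tz = "Europe/Helsinki")
lt <- as.POSIXlt(ct)

secs_before <- as.numeric(ct)
attr(ct, "tzone") <- "UTC"
as.numeric(ct) == secs_before  # same instant, only the display changes
format(ct, "%H:%M")            # clock time now shown in UTC

attr(lt, "tzone") <- "UTC"
lt$hour                        # stored hour field is untouched

# Converting through POSIXct recomputes the fields for the new zone
lt_utc <- as.POSIXlt(ct, tz = "UTC")
lt_utc$hour
```

So for POSIXlt it is probably safer to convert via as.POSIXlt(..., tz = ) than to overwrite the attribute by hand.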
As for your problem with strftime and %z, this does not give you the time zone attribute. The difference in your case probably comes from a combination of string formatting, object conversions and time zone formatting, IMO. But maybe somebody more knowledgeable can clarify this.
Converting dates with R using as.POSIXct
Two mistakes:
- you used %H where you want %I for the dreaded 12-hour format
- you omitted %p to catch the "pm" marker
With that corrected:
R> date_string <- "03/11/2017, 3:14:32 pm"
R> as.POSIXct(date_string, format = "%m/%d/%Y, %I:%M:%S %p",tz="PST8PDT")
[1] "2017-03-11 15:14:32 PST"
R>
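To see what the two mistakes do in practice, here is a small comparison. It uses UTC instead of PST8PDT only to keep the sketch independent of the time zone database, and it assumes an English or C locale for %p:

```r
date_string <- "03/11/2017, 3:14:32 pm"

# %H reads "3" as a 24-hour value and the unmatched " pm" is ignored
wrong <- as.POSIXct(date_string, format = "%m/%d/%Y, %H:%M:%S", tz = "UTC")
# %I plus %p reads the 12-hour clock together with the am/pm marker
right <- as.POSIXct(date_string, format = "%m/%d/%Y, %I:%M:%S %p", tz = "UTC")

format(wrong, "%H:%M:%S")
format(right, "%H:%M:%S")
```

The first call does not fail, it silently gives a time twelve hours off, which is why this kind of mistake is easy to miss.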
NA for 1 particular date when converting dates from character format to POSIXct with as.POSIXct
We can specify %T for the time. The string contains minutes, seconds and milliseconds as well, so %H alone only matches the hour part.
as.POSIXct("2017-03-26 02:00:00.000",format="%Y-%m-%d %T")
#[1] "2017-03-26 02:00:00 EDT"
Or to take care of the milliseconds as well
as.POSIXct("2017-03-26 02:00:00.000",format="%Y-%m-%d %H:%M:%OS")
#[1] "2017-03-26 02:00:00 EDT"
Or using lubridate
library(lubridate)
ymd_hms("2017-03-26 02:00:00.000")
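The difference between %T and %OS only shows up when the fraction is non-zero. A sketch with a hypothetical value ending in .250, in UTC so it does not depend on local settings:

```r
s <- "2017-03-26 02:00:00.250"  # hypothetical value with a non-zero fraction

whole <- as.POSIXct(s, format = "%Y-%m-%d %T", tz = "UTC")
frac  <- as.POSIXct(s, format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")

as.numeric(frac) - as.numeric(whole)  # the fraction kept by %OS
format(frac, "%Y-%m-%d %H:%M:%OS3")   # print with three decimal places
```

By default the fractional part is not printed, so use %OS3 (or set options(digits.secs = 3)) when you want to see it.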