How to Keep Midnight (00:00h) Using strptime() in R

How can I keep midnight (00:00h) using strptime() in R?

From R's strptime documentation (emphasis added):

format

A character string. The default for the format methods is "%Y-%m-%d %H:%M:%S" if any element has a time component which is not midnight, and "%Y-%m-%d" otherwise. If options("digits.secs") is set, up to the specified number of digits will be printed for seconds.

So the information is still there; you just need to call format() with an explicit format string to print the time components.

> midnight <- strptime("2015-12-19 00:00:00","%Y-%m-%d %H:%M:%S")
> midnight
[1] "2015-12-19 EST"
> format(midnight,"%Y/%m/%d %H:%M")
[1] "2015/12/19 00:00"
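
As a rough check of the point above, the sketch below inspects the stored components of a midnight value and shows how the default format changes once a non-midnight element is present. The variable names and the choice of tz = "UTC" are illustrative assumptions, not part of the original answer.

```r
# Parse a midnight timestamp; tz = "UTC" is an assumption to keep output stable.
midnight <- strptime("2015-12-19 00:00:00", "%Y-%m-%d %H:%M:%S", tz = "UTC")

# The default format omits the time because every element is at midnight.
format(midnight)

# The time components are still stored in the POSIXlt object.
c(hour = midnight$hour, min = midnight$min, sec = midnight$sec)

# Adding a non-midnight element makes the default format show the time for all.
both <- as.POSIXct(midnight) + c(0, 90)
format(both)
```

If the time should always be shown, passing an explicit format string as in the answer above is the more predictable choice.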

R's strptime of a datetime at midnight (00:00:00) gives NA

I should use character and not numeric when I parse my dates:

> strptime(20130203000000, "%Y%m%d%H%M%S")    # No!
[1] NA
> strptime("20130203000000", "%Y%m%d%H%M%S") # Yes!
[1] "2013-02-03"

The reason for this seems to be that my numeric value gets coerced to character, and with this many digits ending in zeros R switches to scientific notation:

> as.character(201302030000)
[1] "201302030000"
> as.character(2013020300000)
[1] "2013020300000"
> as.character(20130203000000)
[1] "2.0130203e+13" # This causes the error: it doesn't fit "%Y%m%d%H%M%S"
> as.character(20130203000001)
[1] "20130203000001" # And this is why anything other than 000000 worked.
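
If the timestamps really do arrive as numbers, one possible workaround is to convert them to fixed-notation text yourself before parsing. This is a minimal sketch; the variable names and tz = "UTC" are assumptions for illustration.

```r
# A numeric timestamp that as.character() would render in scientific notation.
stamp <- 20130203000000

# sprintf("%.0f", ...) writes all digits in fixed notation, without an exponent.
as_text <- sprintf("%.0f", stamp)

# The text now matches the "%Y%m%d%H%M%S" layout, so parsing succeeds.
parsed <- strptime(as_text, "%Y%m%d%H%M%S", tz = "UTC")
format(parsed, "%Y-%m-%d %H:%M:%S")
```

Reading the column as character in the first place avoids the conversion altogether.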

A quick lesson in figuring out the type you need from the docs: In R, execute help(strptime) and see a popup similar to the image below.

  • The red arrow points to the main argument to the function, but does not specify the type (which is why I just tried numeric).
  • The green arrow points to the type, which is in the document's title.

[Image: screenshot of the help(strptime) page, annotated with the red and green arrows]

Including seconds when using strptime with examples such as 10-10-2010 00:00:00

When you type datetime and hit <Enter>, R uses a suitable print method to display datetime. Just because datetime prints as "2018-10-10 GMT" doesn't mean that it has forgotten about the seconds.

To ensure a consistent format for your POSIXlt object, you could use format:

format(datetime, "%Y-%m-%d %H:%M:%S", usetz = TRUE)
#[1] "2018-10-10 00:00:00 GMT"

Similarly for case 2:

Date <- '2018-10-10'
Time <- '00:00:01'
datetime <- strptime(paste(Date,Time), format = "%Y-%m-%d %H:%M:%S", tz = 'GMT')
format(datetime, "%Y-%m-%d %H:%M:%S", usetz = TRUE)
#[1] "2018-10-10 00:00:01 GMT"

Sample data

Date <- '2018-10-10'
Time <- '00:00:00'
datetime <- strptime(paste(Date,Time), format = "%Y-%m-%d %H:%M:%S", tz = 'GMT')

Subset a dataframe between two time periods

Remove the quotes around the condition and add a comma at the end of it, so that the condition selects rows:

df1[df1$hours_mins >= 5.25 & df1$hours_mins < 6.25,]
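
To make the row-subsetting form concrete, here is a small sketch with made-up data. The contents of df1 and the id column are hypothetical, and hours_mins is assumed to be numeric.

```r
# Hypothetical data: hours_mins is assumed to be a numeric column.
df1 <- data.frame(id = 1:5,
                  hours_mins = c(5.00, 5.25, 5.75, 6.25, 6.50))

# A logical vector (not a quoted string) picks the rows to keep.
in_window <- df1$hours_mins >= 5.25 & df1$hours_mins < 6.25

# The trailing comma means "these rows, all columns".
df1[in_window, ]
```

Without the comma, df1[in_window] would be treated as a column selection instead.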

Tagging groups of sequential hours over multiple days

If outages can extend from February into March, then we also need to know the year (because of leap years). Assuming that year stores the year: convert to POSIXct using ISOdatetime, take successive differences, compare each to 1 hour, and take the cumulative sum.

year <- 2000
transform(DF, outage_tag =
cumsum(c(1, diff(ISOdatetime(year, month, day, hour-1, 0, 0, tz = "GMT")) != 1)))

giving:

  month day hour outage_tag
1     1   2   23          1
2     1   2   24          1
3     1   3    1          1
4     1   3    2          1
5     3   5   13          2
6     3   5   14          2
7     3   5   15          2

Note

DF <- structure(list(month = c(1L, 1L, 1L, 1L, 3L, 3L, 3L), day = c(2L, 
2L, 3L, 3L, 5L, 5L, 5L), hour = c(23L, 24L, 1L, 2L, 13L, 14L,
15L)), class = "data.frame",
row.names = c(NA, -7L))
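
One caveat: diff() on date-times chooses its units automatically, so comparing the result to 1 relies on those units being hours. The variant below is a sketch that pins the units explicitly; the intermediate names stamps and gap_hours are introduced here for illustration only.

```r
# Same input as in the Note above.
DF <- data.frame(month = c(1L, 1L, 1L, 1L, 3L, 3L, 3L),
                 day   = c(2L, 2L, 3L, 3L, 5L, 5L, 5L),
                 hour  = c(23L, 24L, 1L, 2L, 13L, 14L, 15L))
year <- 2000

# hour runs 1..24, so hour - 1 maps it onto the 0..23 clock.
stamps <- with(DF, ISOdatetime(year, month, day, hour - 1, 0, 0, tz = "GMT"))

# Convert the gaps to plain numbers measured in hours, whatever units diff() picked.
gap_hours <- as.numeric(diff(stamps), units = "hours")

# A new group starts wherever the gap is not exactly one hour.
DF$outage_tag <- cumsum(c(1, gap_hours != 1))
DF
```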

difftime() adds decimal numbers

Instead of strptime, you can use as.Date, like this:

difftime(as.Date(dummydat$bluh_datum, format = "%d.%m.%y"), as.Date("10.11.14", format = "%d.%m.%y"), units = "days")
# Time differences in days
# [1] 117 121 119 NA NA 117
difftime(as.Date(dummydat$bluh_datum, format = "%d.%m.%y"), as.Date("10.09.14", format = "%d.%m.%y"), units = "days" )
# Time differences in days
# [1] 178 182 180 NA NA 178

Or you have to specify the time zone tz="GMT", like this:

difftime(strptime(dummydat$bluh_datum, format="%d.%m.%y", tz = "GMT"), strptime("10.09.14", format="%d.%m.%y", tz = "GMT"), units="days")
# Time differences in days
# [1] 178 182 180 NA NA 178
difftime(strptime(dummydat$bluh_datum, format="%d.%m.%y", tz = "GMT"),strptime("10.11.14", format="%d.%m.%y", tz = "GMT"), units="days")
# Time differences in days
# [1] 117 121 119 NA NA 117

If you do not specify the time zone, look what happens:

strptime(dummydat$bluh_datum, format="%d.%m.%y")
# [1] "2015-03-07 CET" "2015-03-11 CET" "2015-03-09 CET" NA NA "2015-03-07 CET"
strptime("10.09.14", format="%d.%m.%y")
## [1] "2014-09-10 CEST"

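The time zones differ between the dates: the first is in CET (winter time) and the second is in CEST (summer time), so the difference includes an extra hour and is no longer a whole number of days.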
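
The effect can be reproduced with two single dates. This sketch assumes the "Europe/Berlin" zone is available in the local time zone database and uses illustrative variable names; the two date strings correspond to the first element and the reference date shown in the output above.

```r
# In GMT there is no daylight saving time, so midnight-to-midnight is whole days.
a_gmt <- strptime("07.03.15", format = "%d.%m.%y", tz = "GMT")
b_gmt <- strptime("10.09.14", format = "%d.%m.%y", tz = "GMT")
days_gmt <- difftime(a_gmt, b_gmt, units = "days")

# Assumption: "Europe/Berlin" stands in for the CET/CEST zone seen above.
a_loc <- strptime("07.03.15", format = "%d.%m.%y", tz = "Europe/Berlin")
b_loc <- strptime("10.09.14", format = "%d.%m.%y", tz = "Europe/Berlin")
days_loc <- difftime(a_loc, b_loc, units = "days")

as.numeric(days_gmt)                                # whole number of days
(as.numeric(days_loc) - as.numeric(days_gmt)) * 24  # extra hours from the DST change
```

Using as.Date, or a fixed zone such as GMT, keeps the result in whole days because no clock change falls between the two dates.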

Pulling in data around/after midnight

The problems arising from the change of day can become intractable, so I'd tackle this a different way. Start by calculating the day and hour for the five periods you're interested in, and use DateTime to do the heavy lifting. Then use a function to provide the schedule item for a particular day/time combination.

Here's a skeleton:

<?php
date_default_timezone_set('Pacific/Auckland');
$now = new DateTime();
$oneHour = new DateInterval('PT1H');
// DateTime is mutable, so clone before each add()/sub().
$minusOne = (clone $now);
$minusOne->sub($oneHour);
$minusTwo = (clone $minusOne);
$minusTwo->sub($oneHour);
$plusOne = (clone $now);
$plusOne->add($oneHour);
$plusTwo = (clone $plusOne);
$plusTwo->add($oneHour);

echo returnFile($minusTwo);
echo returnFile($minusOne);
echo returnFile($now);
echo returnFile($plusOne);
echo returnFile($plusTwo);

function returnFile(DateTime $t) {
    $day = $t->format('D');        // 'Mon', 'Tue', ...
    $hour = (int) $t->format('G'); // 0 to 23, no leading zero
    // echo "Day:$day, Hour: $hour;<br>";
    // Fallback for any day/time not covered by the schedule below.
    $filename = "Some other time";
    switch ($day) {
        case 'Mon':
            if ($hour < 7) {
                // Small hours Monday...
                $filename = "smallMonday.html";
            } elseif ($hour < 12) {
                // Monday morning
                $filename = "morningMonday.html";
            }
            break;
        case 'Tue':
            if ($hour >= 23) {
                // Late Tuesday
                $filename = "lateTuesday.html";
            }
            break; // without this, the late-Tuesday result was overwritten
    }
    return $filename;
}

?>

I haven't put in a complete schedule - you can work that out.

If you're using PHP 5.5 or later, you can use DateTimeImmutable instead of DateTime, which does away with all the cloning.



