Create a Time Interval of 15 Minutes from Minutely Data in R

Create a time interval of 15 minutes from minutely data in R?

For data that's in POSIXct format, you can use the cut function to create 15-minute groupings, and then aggregate by those groups. The code below shows how to do this in base R and with the dplyr and data.table packages.

First, create some fake data:

set.seed(4984)
dat = data.frame(time=seq(as.POSIXct("2016-05-01"), as.POSIXct("2016-05-01") + 60*99, by=60),
                 count=sample(1:50, 100, replace=TRUE))

Base R

cut the data into 15 minute groups:

dat$by15 = cut(dat$time, breaks="15 min")

                   time count                by15
1   2016-05-01 00:00:00    22 2016-05-01 00:00:00
2   2016-05-01 00:01:00    11 2016-05-01 00:00:00
3   2016-05-01 00:02:00    31 2016-05-01 00:00:00
...
98  2016-05-01 01:37:00    20 2016-05-01 01:30:00
99  2016-05-01 01:38:00    29 2016-05-01 01:30:00
100 2016-05-01 01:39:00    37 2016-05-01 01:30:00

Now aggregate by the new grouping column, using sum as the aggregation function:

dat.summary = aggregate(count ~ by15, FUN=sum, data=dat)

                 by15 count
1 2016-05-01 00:00:00   312
2 2016-05-01 00:15:00   395
3 2016-05-01 00:30:00   341
4 2016-05-01 00:45:00   318
5 2016-05-01 01:00:00   349
6 2016-05-01 01:15:00   397
7 2016-05-01 01:30:00   341

dplyr

library(dplyr)

dat.summary = dat %>% group_by(by15=cut(time, "15 min")) %>%
  summarise(count=sum(count))

data.table

library(data.table)

dat.summary = setDT(dat)[ , list(count=sum(count)), by=cut(time, "15 min")]

UPDATE: To answer the comment, for this case the end point of each grouping interval is as.POSIXct(as.character(dat$by15)) + 60*15 - 1. In other words, the endpoint of the grouping interval is 15 minutes minus one second from the start of the interval. We add 60*15 - 1 because POSIXct is denominated in seconds. The as.POSIXct(as.character(...)) is because cut returns a factor and this just converts it back to date-time so that we can do math on it.

If you want the end point to the nearest minute before the next interval (instead of the nearest second), you could to as.POSIXct(as.character(dat$by15)) + 60*14.

If you don't know the break interval, for example, because you chose the number of breaks and let R pick the interval, you could find the number of seconds to add by doing max(unique(diff(as.POSIXct(as.character(dat$by15))))) - 1.

Create a time series with a row every 15 minutes

Little bit easier to read

library(lubridate)
seq(ymd_hm('2015-01-01 00:00'),ymd_hm('2016-12-31 23:45'), by = '15 mins')

Creating regular 15-minute time-series from irregular time-series

xts extends zoo, and zoo has extensive examples for this in its vignettes and documentation.

Here is a worked example. I think I have done that more elegantly in the past, but this is all I am coming up with now:

R> twohours <- ISOdatetime(2012,05,02,9,0,0) + seq(0:7)*15*60
R> twohours
[1] "2012-05-02 09:15:00 GMT" "2012-05-02 09:30:00 GMT" 
[3] "2012-05-02 09:45:00 GMT" "2012-05-02 10:00:00 GMT" 
[5] "2012-05-02 10:15:00 GMT" "2012-05-02 10:30:00 GMT" 
[7] "2012-05-02 10:45:00 GMT" "2012-05-02 11:00:00 GMT"
R> set.seed(42)
R> observation <- xts(1:10, order.by=twohours[1]+cumsum(runif(10)*60*10))
R> observation
                           [,1]
2012-05-02 09:24:08.883625    1
2012-05-02 09:33:31.128874    2
2012-05-02 09:36:22.812594    3
2012-05-02 09:44:41.081170    4
2012-05-02 09:51:06.128481    5
2012-05-02 09:56:17.586051    6
2012-05-02 10:03:39.539040    7
2012-05-02 10:05:00.338998    8
2012-05-02 10:11:34.534372    9
2012-05-02 10:18:37.573243   10

A two hour time grid, and some random observations leaving some cells empty and some
filled.

R> to.minutes15(observation)[,4]
                           observation.Close
2012-05-02 09:24:08.883625                 1
2012-05-02 09:44:41.081170                 4
2012-05-02 09:56:17.586051                 6
2012-05-02 10:11:34.534372                 9
2012-05-02 10:18:37.573243                10

That is a 15 minutes grid aggregation but not on our time grid.

R> twoh <- xts(rep(NA,8), order.by=twohours)
R> twoh
                    [,1]
2012-05-02 09:15:00   NA
2012-05-02 09:30:00   NA
2012-05-02 09:45:00   NA
2012-05-02 10:00:00   NA
2012-05-02 10:15:00   NA
2012-05-02 10:30:00   NA
2012-05-02 10:45:00   NA
2012-05-02 11:00:00   NA

R> merge(twoh, observation)
                           twoh observation
2012-05-02 09:15:00.000000   NA          NA
2012-05-02 09:24:08.883625   NA           1
2012-05-02 09:30:00.000000   NA          NA
2012-05-02 09:33:31.128874   NA           2
2012-05-02 09:36:22.812594   NA           3
2012-05-02 09:44:41.081170   NA           4
2012-05-02 09:45:00.000000   NA          NA
2012-05-02 09:51:06.128481   NA           5
2012-05-02 09:56:17.586051   NA           6
2012-05-02 10:00:00.000000   NA          NA
2012-05-02 10:03:39.539040   NA           7
2012-05-02 10:05:00.338998   NA           8
2012-05-02 10:11:34.534372   NA           9
2012-05-02 10:15:00.000000   NA          NA
2012-05-02 10:18:37.573243   NA          10
2012-05-02 10:30:00.000000   NA          NA
2012-05-02 10:45:00.000000   NA          NA
2012-05-02 11:00:00.000000   NA          NA

New xts object, and merged object. Now use na.locf() to carry the observations
forward:

R> na.locf(merge(twoh, observation)[,2])
                           observation
2012-05-02 09:15:00.000000          NA
2012-05-02 09:24:08.883625           1
2012-05-02 09:30:00.000000           1
2012-05-02 09:33:31.128874           2
2012-05-02 09:36:22.812594           3
2012-05-02 09:44:41.081170           4
2012-05-02 09:45:00.000000           4
2012-05-02 09:51:06.128481           5
2012-05-02 09:56:17.586051           6
2012-05-02 10:00:00.000000           6
2012-05-02 10:03:39.539040           7
2012-05-02 10:05:00.338998           8
2012-05-02 10:11:34.534372           9
2012-05-02 10:15:00.000000           9
2012-05-02 10:18:37.573243          10
2012-05-02 10:30:00.000000          10
2012-05-02 10:45:00.000000          10
2012-05-02 11:00:00.000000          10

And then we can merge again as an inner join on the time-grid xts twoh:

R> merge(twoh, na.locf(merge(twoh, observation)[,2]), join="inner")[,2]
                    observation
2012-05-02 09:15:00          NA
2012-05-02 09:30:00           1
2012-05-02 09:45:00           4
2012-05-02 10:00:00           6
2012-05-02 10:15:00           9
2012-05-02 10:30:00          10
2012-05-02 10:45:00          10
2012-05-02 11:00:00          10
R>

How to extract time interval data from minute data in r

You can extract rows with a minute-mark ending in 0 or 5 with

df[substr(format(df$Time, '%M'), 2, 2) %in% c(0, 5),]
# or 
df[as.numeric(format(df$Time, '%M')) %% 5 == 0,]
# or 
df[grep('[0|5]$', format(df$Time, '%M')),]

With filter:

library(dplyr)
df %>% 
  filter(substr(format(df$Time, '%M'), 2, 2) %in% c(0, 5))

# or 

df %>% 
  filter(as.numeric(format(df$Time, '%M')) %% 5 == 0)

How to create a time interval that count the rows in such time interval in R

Since there is not a reproducible example, I attempt the solution on a simulated data frame. First we create a log of calls with ID and time:

library(lubridate)
library(dplyr)
library(magrittr)
set.seed(123)

# Generate 100 random call times during a day
calls.df <- data.frame(id=seq(1,100,1), calltime=sample(seq(as.POSIXct('2019/10/01'),
     as.POSIXct('2019/10/02'), by="min"), 100))

There may not be all intervals represented in your call data so generate a sequence of all 30 minute bins in case:

full.df <- data.frame(bin=seq(as.POSIXct('2019/10/01'), as.POSIXct('2019/10/02'), by="30 min"))

Next tally up counts of calls in represented bins:

calls.df %>% arrange(calltime) %>% mutate(diff=interval(lag(calltime),calltime)) %>% 
     mutate(mins=diff@.Data/60) %>% select(-diff) %>% 
     mutate(bin=floor_date(calltime, unit="30 minutes")) %>% 
     group_by(bin) %>% tally() -> orig.counts

Now make sure there are zeroes for unrepresented bins:

right_join(orig.counts,full.df,by="bin") %>% mutate(count=ifelse(is.na(n), 0, n))

  # A tibble: 49 x 3
     bin                     n count
     <dttm>              <int> <dbl>
   1 2019-10-01 00:00:00     2     2
   2 2019-10-01 00:30:00     1     1
   3 2019-10-01 01:00:00     2     2
   4 2019-10-01 01:30:00    NA     0
   5 2019-10-01 02:00:00     2     2
   6 2019-10-01 02:30:00     4     4
   7 2019-10-01 03:00:00     1     1
   8 2019-10-01 03:30:00     1     1
   9 2019-10-01 04:00:00     2     2
  10 2019-10-01 04:30:00     1     1
  # ... with 39 more rows

Hope this is helpful for you.

Create a 24 hour vector with 60 and 1 minutes time interval in R

Quite a few things have to be done to get the summaries data. One can use dplyr, tidyr and lubridate packages to transform data.

The approach:

Create DateTime column by uniting date and hour and converting to
ymd_hms

Group on src_addres, dest_address, and Year-Month-Day Hour to
calculate hourly occurrence

Group on src_addres, dest_address, and Year-Month-Day Hour:Min to calculate > per min occurrence

Group on src_addres, dest_address and summarize to get max of hourly and per min occurrence

library(dplyr)
library(tidyr)
library(lubridate)

df %>% unite("DateTime", c("date","hour"), sep=" ") %>% 
  mutate(DateTime = ymd_hms(DateTime)) %>%
  group_by(src_addres, dest_address, YMD_H = format(DateTime, "%Y-%m-%d %H")) %>%
  mutate(HourlyAppearance = n()) %>%
  group_by(src_addres, dest_address, YMD_HM = format(DateTime, "%Y-%m-%d %H:%M")) %>%
  mutate(PerMinAppearance = n()) %>%
  group_by(src_addres, dest_address) %>%
  summarise( 'max(per hour)' = max(HourlyAppearance), 
           'max(per min)' = max(PerMinAppearance)) %>%
  as.data.frame()

#      src_addres  dest_address max(per hour) max(per min)
# 1   1.11.201.19 172.16.16.100             1            1
# 2   1.119.43.90 172.16.16.100             1            1
# 3   1.119.43.90 172.16.16.153             1            1
# 4   1.119.43.90 192.168.1.112             1            1
# 5  1.171.43.133   172.16.16.5             2            2
# 6  1.179.191.82   172.16.16.5             1            1
# 7  1.179.191.82 192.168.1.111             1            1
# 8  1.179.191.82 192.168.1.112             1            1
# 9  1.180.72.186 172.16.16.153             1            1
# 10 1.202.165.40 172.16.16.153             1            1
# 11  1.203.84.52   172.16.16.5             1            1
# 12  1.203.84.52 192.168.1.112             1            1
# 13  1.209.171.4 192.168.1.111             1            1
# 14 1.214.34.114 172.16.16.100             1            1
# 15 1.214.34.114 172.16.16.153             1            1
# 16 1.214.34.114   172.16.16.5             1            1
# 17 1.214.34.114 192.168.1.111             1            1
# 18 1.214.34.114 192.168.1.112             1            1
# 19  1.55.249.92 172.16.16.153             1            1
# 20  1.55.249.92   172.16.16.5             1            1
# 21 1.71.188.254 172.16.16.100             1            1
# 22 1.71.188.254 172.16.16.153             1            1
# 23 1.71.188.254 172.16.16.159             1            1
# 24 1.71.188.254   172.16.16.5             1            1
# 25 1.71.188.254 192.168.1.111             1            1
# 26 1.71.188.254 192.168.1.112             1            1
# 27   1.85.18.88 172.16.16.153             1            1

Data:

OP hasn't provided data in a pretty simple format. The inclusion of date and time columns has made it more difficult. Perhaps that be reason for low response to this question. I preferred to read date and time part separately and then unite those to get Date/Time.

strtext <- "Sl  date hour  src_addres  dest_address  Date_t   Time_t
1996  2018-04-14 08:24:01    1.11.201.19 172.16.16.100 2018-04-14 08:24:01
3702  2018-04-15 12:10:27    1.119.43.90 172.16.16.100 2018-04-15 12:10:27
1154  2018-04-14 00:59:27    1.119.43.90 172.16.16.153 2018-04-14 00:59:27
2414  2018-04-14 12:33:29    1.119.43.90 192.168.1.112 2018-04-14 12:33:29
18013 2018-04-28 18:49:05   1.171.43.133   172.16.16.5 2018-04-28 18:49:05
18015 2018-04-28 18:49:05   1.171.43.133   172.16.16.5 2018-04-28 18:49:05
6903  2018-04-25 21:31:52   1.179.191.82   172.16.16.5 2018-04-25 21:31:52
11741 2018-04-27 01:08:43   1.179.191.82 192.168.1.111 2018-04-27 01:08:43
11933 2018-04-27 02:00:10   1.179.191.82 192.168.1.111 2018-04-27 02:00:10
11023 2018-04-26 21:39:39   1.179.191.82 192.168.1.112 2018-04-26 21:39:39
11175 2018-04-26 22:31:01   1.179.191.82 192.168.1.112 2018-04-26 22:31:01
13073 2018-04-27 08:24:58   1.180.72.186 172.16.16.153 2018-04-27 08:24:58
13735 2018-04-27 12:07:34   1.180.72.186 172.16.16.153 2018-04-27 12:07:34
2752  2018-04-14 19:34:53   1.202.165.40 172.16.16.153 2018-04-14 19:34:53
4046  2018-04-15 18:16:40    1.203.84.52   172.16.16.5 2018-04-15 18:16:40
4048  2018-04-15 18:18:43    1.203.84.52 192.168.1.112 2018-04-15 18:18:43
3020  2018-04-15 01:35:40    1.209.171.4 192.168.1.111 2018-04-15 01:35:40
4870  2018-04-16 05:33:42   1.214.34.114 172.16.16.100 2018-04-16 05:33:42
7025  2018-04-25 22:28:06   1.214.34.114 172.16.16.100 2018-04-25 22:28:06
4262  2018-04-15 23:31:56   1.214.34.114 172.16.16.153 2018-04-15 23:31:56
9369  2018-04-26 10:32:50   1.214.34.114 172.16.16.153 2018-04-26 10:32:50
2716  2018-04-14 18:49:30   1.214.34.114   172.16.16.5 2018-04-14 18:49:30
9563  2018-04-26 12:34:58   1.214.34.114   172.16.16.5 2018-04-26 12:34:58
1110  2018-04-14 00:27:02   1.214.34.114 192.168.1.111 2018-04-14 00:27:02
4470  2018-04-16 01:27:32   1.214.34.114 192.168.1.112 2018-04-16 01:27:32
9581  2018-04-26 12:55:39    1.55.249.92 172.16.16.153 2018-04-26 12:55:39
2970  2018-04-15 00:01:18    1.55.249.92   172.16.16.5 2018-04-15 00:01:18
15329 2018-04-27 21:53:16    1.55.249.92   172.16.16.5 2018-04-27 21:53:16
15537 2018-04-28 00:02:30    1.55.249.92   172.16.16.5 2018-04-28 00:02:30
19249 2018-04-29 06:28:04   1.71.188.254 172.16.16.100 2018-04-29 06:28:04
19243 2018-04-29 06:28:04   1.71.188.254 172.16.16.153 2018-04-29 06:28:04
19241 2018-04-29 06:28:04   1.71.188.254 172.16.16.159 2018-04-29 06:28:04
19239 2018-04-29 06:28:04   1.71.188.254   172.16.16.5 2018-04-29 06:28:04
19247 2018-04-29 06:28:04   1.71.188.254 192.168.1.111 2018-04-29 06:28:04
19245 2018-04-29 06:28:04   1.71.188.254 192.168.1.112 2018-04-29 06:28:04
6315  2018-04-25 18:56:08     1.85.18.88 172.16.16.153 2018-04-25 18:56:08
14623 2018-04-27 16:41:00     1.85.18.88 172.16.16.153 2018-04-27 16:41:00"

df <- read.table(text = strtext,header = TRUE, stringsAsFactors = FALSE)

5-minutes interval time series in R

Do you just want a sequence of dates at 5 min intervals, with a time zone? If so you can use seq.PosiXt as described here

See my example below:

start_date <- as.POSIXct(paste0("2010/01/01", "00:00:00"), format= "%Y/%m/%d %H:%M:%S", tz = "Etc/GMT+2")
end_date <- as.POSIXct(paste0("2018/01/01","00:00:01"), format= "%Y/%m/%d %H:%M:%S", tz = "Etc/GMT+2")


output <- seq.POSIXt(start_date, end_date, by = "5 min")
formatted <- format_iso_8601(output)

Create a Time Interval of 15 Minutes from Minutely Data in R