Create a 24 Hour Vector with 5 Minutes Time Interval in R

Create a 24 hour vector with 5 minutes time interval in R

There is a seq.POSIXt function which has the nice property that its by argument is parsed as a numeric interval followed by one of "sec", "min", "hour", "day", "DSTday", "week", "month", "quarter" or "year". Then, if you print the results with format(z, "%H%M", tz="GMT"), they appear as desired:

format(seq.POSIXt(as.POSIXct(Sys.Date()), as.POSIXct(Sys.Date() + 1), by = "5 min"),
       "%H%M", tz = "GMT")  # hours (00-23) and minutes (00-59), no space
[1] "0000" "0005" "0010" "0015" "0020" "0025" "0030" "0035" "0040" "0045" "0050"
[12] "0055" "0100" "0105" "0110" "0115" "0120" "0125" "0130" "0135" "0140" "0145"
[23] "0150" "0155" "0200" "0205" "0210" "0215" "0220" "0225" "0230" "0235" "0240"
[34] "0245" "0250" "0255" "0300" "0305" "0310" "0315" "0320" "0325" "0330" "0335"
[45] "0340" "0345" snipped the rest.

Unless you are within 360/48 degrees of Greenwich (or is it Paris?), you need to put in tz="GMT" so that the offset for your timezone does not shift the result. Without it, this produced a sequence starting at "1700" for me. You could assign the inner result to a name if you needed to keep it available in your workspace, but it would not be a character value; it would be a POSIXct object (numeric mode with class-defined methods for display and manipulation):

z <- seq.POSIXt(as.POSIXct(Sys.Date()), as.POSIXct(Sys.Date() + 1), by = "5 min")
z[1]
# [1] "2014-09-09 17:00:00 PDT"

Create a 24 hour vector with 60 and 1 minute time intervals in R

Quite a few things have to be done to get the summary data. One can use the dplyr, tidyr and lubridate packages to transform the data.

The approach:

  1. Create a DateTime column by uniting date and hour and converting it
     with ymd_hms
  2. Group on src_addres, dest_address and Year-Month-Day Hour to
     calculate the hourly occurrence
  3. Group on src_addres, dest_address and Year-Month-Day Hour:Min to
     calculate the per-minute occurrence
  4. Group on src_addres, dest_address and summarise to get the max of the
     hourly and per-minute occurrences

library(dplyr)
library(tidyr)
library(lubridate)

df %>%
  unite("DateTime", c("date", "hour"), sep = " ") %>%
  mutate(DateTime = ymd_hms(DateTime)) %>%
  group_by(src_addres, dest_address, YMD_H = format(DateTime, "%Y-%m-%d %H")) %>%
  mutate(HourlyAppearance = n()) %>%
  group_by(src_addres, dest_address, YMD_HM = format(DateTime, "%Y-%m-%d %H:%M")) %>%
  mutate(PerMinAppearance = n()) %>%
  group_by(src_addres, dest_address) %>%
  summarise('max(per hour)' = max(HourlyAppearance),
            'max(per min)'  = max(PerMinAppearance)) %>%
  as.data.frame()

# src_addres dest_address max(per hour) max(per min)
# 1 1.11.201.19 172.16.16.100 1 1
# 2 1.119.43.90 172.16.16.100 1 1
# 3 1.119.43.90 172.16.16.153 1 1
# 4 1.119.43.90 192.168.1.112 1 1
# 5 1.171.43.133 172.16.16.5 2 2
# 6 1.179.191.82 172.16.16.5 1 1
# 7 1.179.191.82 192.168.1.111 1 1
# 8 1.179.191.82 192.168.1.112 1 1
# 9 1.180.72.186 172.16.16.153 1 1
# 10 1.202.165.40 172.16.16.153 1 1
# 11 1.203.84.52 172.16.16.5 1 1
# 12 1.203.84.52 192.168.1.112 1 1
# 13 1.209.171.4 192.168.1.111 1 1
# 14 1.214.34.114 172.16.16.100 1 1
# 15 1.214.34.114 172.16.16.153 1 1
# 16 1.214.34.114 172.16.16.5 1 1
# 17 1.214.34.114 192.168.1.111 1 1
# 18 1.214.34.114 192.168.1.112 1 1
# 19 1.55.249.92 172.16.16.153 1 1
# 20 1.55.249.92 172.16.16.5 1 1
# 21 1.71.188.254 172.16.16.100 1 1
# 22 1.71.188.254 172.16.16.153 1 1
# 23 1.71.188.254 172.16.16.159 1 1
# 24 1.71.188.254 172.16.16.5 1 1
# 25 1.71.188.254 192.168.1.111 1 1
# 26 1.71.188.254 192.168.1.112 1 1
# 27 1.85.18.88 172.16.16.153 1 1

Data:

The OP hasn't provided the data in a simple format; the inclusion of separate date and time columns has made it more difficult, and perhaps that is the reason for the low response to this question. I preferred to read the date and time parts separately and then unite them to get the Date/Time.

strtext <- "Sl  date hour  src_addres  dest_address  Date_t   Time_t
1996 2018-04-14 08:24:01 1.11.201.19 172.16.16.100 2018-04-14 08:24:01
3702 2018-04-15 12:10:27 1.119.43.90 172.16.16.100 2018-04-15 12:10:27
1154 2018-04-14 00:59:27 1.119.43.90 172.16.16.153 2018-04-14 00:59:27
2414 2018-04-14 12:33:29 1.119.43.90 192.168.1.112 2018-04-14 12:33:29
18013 2018-04-28 18:49:05 1.171.43.133 172.16.16.5 2018-04-28 18:49:05
18015 2018-04-28 18:49:05 1.171.43.133 172.16.16.5 2018-04-28 18:49:05
6903 2018-04-25 21:31:52 1.179.191.82 172.16.16.5 2018-04-25 21:31:52
11741 2018-04-27 01:08:43 1.179.191.82 192.168.1.111 2018-04-27 01:08:43
11933 2018-04-27 02:00:10 1.179.191.82 192.168.1.111 2018-04-27 02:00:10
11023 2018-04-26 21:39:39 1.179.191.82 192.168.1.112 2018-04-26 21:39:39
11175 2018-04-26 22:31:01 1.179.191.82 192.168.1.112 2018-04-26 22:31:01
13073 2018-04-27 08:24:58 1.180.72.186 172.16.16.153 2018-04-27 08:24:58
13735 2018-04-27 12:07:34 1.180.72.186 172.16.16.153 2018-04-27 12:07:34
2752 2018-04-14 19:34:53 1.202.165.40 172.16.16.153 2018-04-14 19:34:53
4046 2018-04-15 18:16:40 1.203.84.52 172.16.16.5 2018-04-15 18:16:40
4048 2018-04-15 18:18:43 1.203.84.52 192.168.1.112 2018-04-15 18:18:43
3020 2018-04-15 01:35:40 1.209.171.4 192.168.1.111 2018-04-15 01:35:40
4870 2018-04-16 05:33:42 1.214.34.114 172.16.16.100 2018-04-16 05:33:42
7025 2018-04-25 22:28:06 1.214.34.114 172.16.16.100 2018-04-25 22:28:06
4262 2018-04-15 23:31:56 1.214.34.114 172.16.16.153 2018-04-15 23:31:56
9369 2018-04-26 10:32:50 1.214.34.114 172.16.16.153 2018-04-26 10:32:50
2716 2018-04-14 18:49:30 1.214.34.114 172.16.16.5 2018-04-14 18:49:30
9563 2018-04-26 12:34:58 1.214.34.114 172.16.16.5 2018-04-26 12:34:58
1110 2018-04-14 00:27:02 1.214.34.114 192.168.1.111 2018-04-14 00:27:02
4470 2018-04-16 01:27:32 1.214.34.114 192.168.1.112 2018-04-16 01:27:32
9581 2018-04-26 12:55:39 1.55.249.92 172.16.16.153 2018-04-26 12:55:39
2970 2018-04-15 00:01:18 1.55.249.92 172.16.16.5 2018-04-15 00:01:18
15329 2018-04-27 21:53:16 1.55.249.92 172.16.16.5 2018-04-27 21:53:16
15537 2018-04-28 00:02:30 1.55.249.92 172.16.16.5 2018-04-28 00:02:30
19249 2018-04-29 06:28:04 1.71.188.254 172.16.16.100 2018-04-29 06:28:04
19243 2018-04-29 06:28:04 1.71.188.254 172.16.16.153 2018-04-29 06:28:04
19241 2018-04-29 06:28:04 1.71.188.254 172.16.16.159 2018-04-29 06:28:04
19239 2018-04-29 06:28:04 1.71.188.254 172.16.16.5 2018-04-29 06:28:04
19247 2018-04-29 06:28:04 1.71.188.254 192.168.1.111 2018-04-29 06:28:04
19245 2018-04-29 06:28:04 1.71.188.254 192.168.1.112 2018-04-29 06:28:04
6315 2018-04-25 18:56:08 1.85.18.88 172.16.16.153 2018-04-25 18:56:08
14623 2018-04-27 16:41:00 1.85.18.88 172.16.16.153 2018-04-27 16:41:00"

df <- read.table(text = strtext,header = TRUE, stringsAsFactors = FALSE)

Create a time series with a row every 15 minutes

A little bit easier to read:

library(lubridate)
seq(ymd_hm('2015-01-01 00:00'),ymd_hm('2016-12-31 23:45'), by = '15 mins')
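
For reference, the same sequence can be built in base R without lubridate; a sketch (ymd_hm parses in UTC, so the timezone is made explicit here):

seq(as.POSIXct('2015-01-01 00:00', tz = 'UTC'),
    as.POSIXct('2016-12-31 23:45', tz = 'UTC'), by = '15 min')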

Create vector of non-weekend time intervals for part of a day in R

One of the issues is that your interval vector does not roll over to the next hour when the minutes go past 60.

Here is one way you could do this:

#create the interval vector ("6:0", "6:5", ..., "20:55", "21:0")
intervals <- c()
for (p in 6:20) {
  for (j in seq(0, 55, by = 5)) {
    intervals <- c(intervals, paste(p, j, sep = ":"))
  }
}
intervals <- c(intervals, "21:0")

#get the days (timeBasedSeq comes from the xts package)
library(xts)
dayseq <- timeBasedSeq("2010-05-24/2010-11-05/d")

#concatenate days and times and parse everything at the end (strptime returns POSIXlt)
obstime <- strptime(unlist(lapply(dayseq, function(x) paste(x, intervals))),
                    format = "%Y-%m-%d %H:%M", tz = "GMT")
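
The same grid can also be built without the explicit loops; a sketch that should yield the same timestamps (as POSIXct rather than POSIXlt), reusing dayseq from above:

#06:00 to 21:00 in 5 minute steps, as "HH:MM" strings
times <- format(seq(as.POSIXct("2010-05-24 06:00", tz = "GMT"),
                    as.POSIXct("2010-05-24 21:00", tz = "GMT"), by = "5 min"),
                "%H:%M")
#pair every day with every time, then parse in one go
obstime2 <- as.POSIXct(paste(rep(dayseq, each = length(times)), times),
                       format = "%Y-%m-%d %H:%M", tz = "GMT")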

How do you create vectors with specific intervals in R?

In R the function for this is seq, and you can use it with the argument by:

seq(from = 5, to = 100, by = 5)
# [1] 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100

In addition to by you can also have other options such as length.out and along.with.

length.out: If you want to get a total of 10 numbers between 0 and 1, for example:

seq(0, 1, length.out = 10)
# gives 10 equally spaced numbers from 0 to 1

along.with: It takes the length of the vector you supply as input and provides a vector from 1:length(input).

seq(along.with=c(10,20,30))
# [1] 1 2 3

However, instead of using the along.with option, it is recommended to use seq_along in this case. From the documentation for ?seq:

seq is generic, and only the default method is described here. Note that it dispatches on the class of the first argument irrespective of argument names. This can have unintended consequences if it is called with just one argument intending this to be taken as along.with: it is much better to use seq_along in that case.
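
Concretely, the gotcha the documentation warns about shows up with a length-one input; a quick illustration:

seq(8)        # dispatches on 8, so this is taken as 1:8
# [1] 1 2 3 4 5 6 7 8
seq_along(8)  # the length of the input, which is 1
# [1] 1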

seq_along: instead of seq(along.with = .):

seq_along(c(10,20,30))
# [1] 1 2 3

Hope this helps.

R test for morning rush hour - time vector in interval

I would do something way simpler than that using difftime and cut. You can do the following (using base functions):

morning.rush.hour <- function(tm) {
  # tm is assumed to be POSIXlt so that tm$wday works
  # time of day as a numeric number of hours (7:30 and 9:30 become 7.5 and 9.5)
  dt <- difftime(tm, cut(tm, breaks = "days"), units = "hours")
  # is it a weekday, and is it between 7:30 and 9:30?
  (tm$wday %in% 1:5) & (dt >= 7.5) & (dt <= 9.5)
}

Edit: You can also add a time-zone parameter to difftime if needed:

difftime(tm, cut(tm, breaks="days"), units="hours", tz="UTC")
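
A quick check with made-up timestamps (tm is kept in the session's timezone so that difftime and cut agree):

tm <- as.POSIXlt(c("2014-09-08 08:15:00",   # a Monday, inside the 7:30-9:30 window
                   "2014-09-08 10:00:00",   # a Monday, after the window
                   "2014-09-13 08:15:00"))  # a Saturday
morning.rush.hour(tm)
# [1]  TRUE FALSE FALSE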

Generate a list of csv file names with a 15 minute time interval and today's date

You could use something like this. If you want, you could also add seconds.

library(stringr)  # str_pad comes from stringr

date <- gsub("-", "", Sys.Date())

Hours <- str_pad(string = rep(seq(0, 23, 1), each = 4), width= 2, side = c("left"), pad = "0")
Minutes <- str_pad(string = rep(seq(0, 45, 15), 24), width= 2, side = c("left"), pad = "0")

paste0("data_", date,Hours, Minutes, ".csv")

