Binning Time Data in R

Binning time data in R

Just use cut(), which has a method for date-times (see ?cut.POSIXt). For example:

x <- as.POSIXct("2016-01-01 00:00:00", tz="UTC") + as.difftime(30*(0:47),units="mins")
cut(x, breaks="2 hours", labels=FALSE)
# or to show more clearly the results:
data.frame(x, cuts = cut(x, breaks="2 hours", labels=FALSE))

#                     x cuts
#1  2016-01-01 00:00:00    1
#2  2016-01-01 00:30:00    1
#3  2016-01-01 01:00:00    1
#4  2016-01-01 01:30:00    1
#5  2016-01-01 02:00:00    2
#6  2016-01-01 02:30:00    2
#7  2016-01-01 03:00:00    2
#8  2016-01-01 03:30:00    2
#9  2016-01-01 04:00:00    3
#10 2016-01-01 04:30:00    3
# ...

If your data are just strings, then you need to do a conversion first. Times will end up assigned to the current day if you don't specify a particular day as well.

as.POSIXct("17:23:54", format="%H:%M:%S", tz="UTC")
#[1] "2016-07-13 17:23:54 UTC"
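To keep the result independent of the day the code is run, one option is to paste a fixed date onto the time strings before converting, and then bin as above. This is only a sketch: the times and the anchor date 2016-01-01 are made up for illustration.

```r
# Hypothetical time-of-day strings; "2016-01-01" is an arbitrary anchor date.
times <- c("00:15:00", "01:45:10", "17:23:54")
x <- as.POSIXct(paste("2016-01-01", times),
                format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Integer codes of the 2-hour bins, counted from the hour of the earliest time.
bin_id <- cut(x, breaks = "2 hours", labels = FALSE)
bin_id
```

The first two times share bin 1 (00:00 to 02:00) and the last one lands in bin 9 (16:00 to 18:00).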

Binning data by time in R

You can use floor_date() to round each Time down to the start of its minute, then sum Data within each ID and minute.

library(dplyr)
library(lubridate)

df %>%
  mutate(Time = ymd_hms(Time)) %>%
  group_by(ID, Time = floor_date(Time, "1 min")) %>%
  summarise(Data = sum(Data))
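Since df is not shown here, a self-contained version of the same idea might look like the following. The column names ID, Time and Data follow the snippet above, but the values are invented, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(lubridate)

# Hypothetical input data, for illustration only.
df <- data.frame(
  ID   = c(1, 1, 1, 2),
  Time = c("2020-01-01 10:00:05", "2020-01-01 10:00:40",
           "2020-01-01 10:01:10", "2020-01-01 10:00:30"),
  Data = c(1, 2, 3, 4)
)

per_minute <- df %>%
  mutate(Time = floor_date(ymd_hms(Time), "1 min")) %>%  # start of each minute
  group_by(ID, Time) %>%
  summarise(Data = sum(Data), .groups = "drop")

per_minute
```

ID 1 ends up with two rows (10:00 and 10:01, each summing to 3) and ID 2 with one row (10:00, sum 4).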

Create time bins and assign data to correct bin

I tried to solve this using data.table and lubridate and sticking to my idea of using floor_date.

# load packages
library(data.table)
library(lubridate)

# define a vector evenly spaced each 30 minutes:
b <- data.table(dates = seq(as.POSIXct("2018-03-25", tz = "UTC"), 
                            as.POSIXct("2018-03-26", tz = "UTC"), 
                            by = "30 min"))

# reproduce data (all detections fall on 2018-03-25, the day covered by b)
dt <- data.table(detect_date = c("25/03/2018 00:09", "25/03/2018 01:17", "25/03/2018 14:37", "25/03/2018 23:43"),
                 Station = c("SS01", "SS03", "SS04", "SS04"),
                 Individual = c("A", "B", "C", "B"))

# convert detect_date from character to POSIXct
dt[, detect_date := dmy_hm(detect_date)]

# floor each detection to its 30-minute bin, then join onto b so that every bin appears
dt[, .(Location = Station, Individual), by = .(dates = floor_date(detect_date, "30 minutes"))][b, on = "dates"]
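If you would rather not depend on lubridate for the flooring step, the same rounding can be sketched in base R by working on the underlying seconds. This assumes the timestamps are in UTC, where 30-minute boundaries line up with multiples of 1800 seconds since the epoch; the example value is one of the detections above.

```r
# One detection time, in UTC (assumption: same time zone as the bins).
x <- as.POSIXct("2018-03-25 01:17:00", tz = "UTC")

# Round down to the nearest multiple of 1800 seconds (30 minutes).
bin_start <- as.POSIXct(floor(as.numeric(x) / 1800) * 1800,
                        origin = "1970-01-01", tz = "UTC")

format(bin_start, "%Y-%m-%d %H:%M")
# "2018-03-25 01:00"
```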

Binning time series in R?

While you could convert to a formal time representation, in this case it might be easier to just use substr:

test <- c("00:00:01","02:07:01","22:30:15")
as.numeric(substr(test,1,2))
#[1]  0  2 22

Using a POSIXct time to deal with it would also work, and might be handy if you plan on further calculations (differences in time etc):

testtime <- as.POSIXct(test,format="%H:%M:%S")
testtime
#[1] "2013-12-09 00:00:01 EST" "2013-12-09 02:07:01 EST" "2013-12-09 22:30:15 EST"
as.numeric(format(testtime,"%H"))
#[1]  0  2 22
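Once you have the hour as a number, grouping into wider bins is a plain cut() call. A small sketch, where the 6-hour blocks and their labels are just an example choice:

```r
test <- c("00:00:01", "02:07:01", "22:30:15")
hours <- as.numeric(substr(test, 1, 2))

# Hypothetical 6-hour blocks; right = FALSE makes each block [start, end).
block <- cut(hours, breaks = c(0, 6, 12, 18, 24), right = FALSE,
             labels = c("00-06", "06-12", "12-18", "18-24"))

as.character(block)
# "00-06" "00-06" "18-24"
```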

Given time column, how can I create time bins in R?

One way to do this is to use strptime to parse your time column into POSIX objects, and then use format on those objects to round down to the hour, like so:

library(dplyr)

df$hour <- format(strptime(df$time, "%H:%M"), "%H:00")

df %>% group_by(hour) %>% summarize(respond = sum(respond))

# # A tibble: 3 x 2
#    hour respond
#   <chr>   <int>
# 1 08:00       0
# 2 09:00       2
# 3 15:00       1
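The same grouping can also be done without dplyr. Here is a rough base R equivalent; the data frame is made up (df is not shown in the answer), and tz = "UTC" is an assumption added only to sidestep daylight-saving gaps in the local time zone.

```r
# Hypothetical data with the same column names as above.
df <- data.frame(time = c("08:15", "09:05", "09:50", "15:30"),
                 respond = c(0L, 1L, 1L, 1L))

# Parse the time strings, then keep only the hour part as the bin label.
df$hour <- format(strptime(df$time, "%H:%M", tz = "UTC"), "%H:00")

# Sum respond within each hourly bin.
res <- aggregate(respond ~ hour, data = df, FUN = sum)
res
```

With this input the sums per bin are 0 for 08:00, 2 for 09:00 and 1 for 15:00.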

How to bin times from different days into time bins

If you want to bin by time-of-day, regardless of date, then it might be easier to extract just the time-of-day and work with that.

dat = data.frame(time=t, q=q)

library(lubridate)
library(plyr)

# Extract time of day from each date-time
dat$hour = hour(dat$time) + minute(dat$time)/60 + second(dat$time)/3600

# Create bin labels
bins=c(paste0(rep(c(paste0(0,0:9),10:23), each=4),":", c("00",15,30,45))[-1],"24:00")

# Bin the data (include.lowest=TRUE keeps an exact 00:00:00 in the first bin instead of NA)
dat$bins = cut(dat$hour, breaks=seq(0, 24, 0.25), labels=bins, include.lowest=TRUE)

And here's the result of summarizing by time bin:

ddply(dat, .(bins), summarise, q_sum = sum(q), .drop=FALSE)

    bins q_sum
1  00:15     0
2  00:30     0
3  00:45     0
4  01:00     0
5  01:15   100
6  01:30     0
...
10 02:30     0
11 02:45   100
12 03:00     0
...
27 06:45     0
28 07:00   100
29 07:15     0
30 07:30     0
31 07:45     0
32 08:00     0
33 08:15   100
34 08:30     0
...
52 13:00     0
53 13:15   100
54 13:30     0
55 13:45     0
...
72 18:00     0
73 18:15     0
74 18:30   200
75 18:45     0
...
82 20:30     0
83 20:45     0
84 21:00   100
85 21:15     0
86 21:30     0
...
95 23:45     0
96 24:00     0
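If you prefer bins labelled by their start time rather than their end, a variation in base R might look like this. It is a sketch with invented timestamps and values, not the data from the question; right = FALSE makes each bin [start, end).

```r
# Hypothetical date-times on two different days, with a made-up quantity q.
time <- as.POSIXct(c("2015-01-01 01:05:00", "2015-01-02 01:10:00",
                     "2015-01-02 18:20:00"), tz = "UTC")
q <- c(100, 100, 200)

# Decimal hour of day, computed with base R only.
lt <- as.POSIXlt(time)
hour <- lt$hour + lt$min / 60 + lt$sec / 3600

# Quarter-hour bins labelled "HH:MM" by their start.
starts <- seq(0, 23.75, by = 0.25)
labels <- sprintf("%02d:%02d", as.integer(floor(starts)),
                  as.integer(round((starts %% 1) * 60)))
bin <- cut(hour, breaks = seq(0, 24, by = 0.25), labels = labels, right = FALSE)

# Sum q per bin; empty bins come back as NA.
q_sum <- tapply(q, bin, sum)
q_sum[c("01:00", "18:15")]
```

Both detections shortly after 01:00 fall in the "01:00" bin even though they are on different days, so that bin sums to 200.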

How to create time bins in R and group data

This routine can be implemented with {dplyr}'s group_by, mutate and summarize. I split it up into two result objects, res1 and res2.

dat <- read.table(text="trial   event   time_start  time_end    time_duration   region
1         A       36403      36504        101           none
1         B       36506      36516        10            none
1         A       36518      36700        182           top
1         B       36702      36708        6             none
1         A       36710      37054        344           top
1         B       37056      37088        32            none
1         A       37090      37640        550           right
1         B       37642      37678        36            none
1         A       37680      37812        132           left
2         A       41278      41318        40            top
2         B       41320      41336        16            none
2         A       41338      41490        152           top
2         B       41492      41498        6             none
2         A       41500      41994        494           top
2         B       41996      42032        36            none
2         A       42034      42492        458           left", header=TRUE)

library(dplyr, warn.conflicts = FALSE)

res1 <- dat %>% 
  group_by(trial) %>%
  mutate(duration = time_end - time_start,
         total_duration = sum(duration),
         cml_duration = cumsum(duration),
         fractime = cml_duration / total_duration,
         # adding 0.99 (< 1) before floor() is a fudge factor: bins run 1:4, not 0:4 or 1:5
         bin = floor(fractime / 0.25 + 0.99))
res2 <- res1 %>% 
  group_by(trial, bin) %>%
  summarize(total_event_a = sum(event == "A"),
            total_event_a_right = sum(event == "A" & region == "right"))
#> `summarise()` regrouping output by 'trial' (override with `.groups` argument)

res2
#> # A tibble: 6 x 4
#> # Groups:   trial [2]
#>   trial   bin total_event_a total_event_a_right
#>   <int> <dbl>         <int>               <int>
#> 1     1     1             2                   0
#> 2     1     2             1                   0
#> 3     1     4             2                   1
#> 4     2     1             2                   0
#> 5     2     3             1                   0
#> 6     2     4             1                   0

Created on 2020-12-06 by the reprex package (v0.3.0)
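A note on the fudge factor: for the data above it gives the same bins as ceiling(fractime / 0.25), but the two differ near the edges. The values below are made up to show where they diverge, so treat ceiling() as a possible alternative rather than a drop-in replacement.

```r
# Hypothetical cumulative fractions, chosen to sit on or near bin boundaries.
fractime <- c(0.001, 0.25, 0.2501, 0.5, 1)

fudge   <- floor(fractime / 0.25 + 0.99)
rounded <- ceiling(fractime / 0.25)

fudge    # 0 1 1 2 4
rounded  # 1 1 2 2 4
```

With floor(... + 0.99), a very small fraction falls into bin 0 and a value just past a boundary stays in the lower bin; ceiling() puts both where the quartile definition says they belong.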

R - Split time series into time-only bins

Drop the date and deal only with the time component?

format(tt, "%H:%M:%S")

extracts the time component into a string, but it can be modified to further convert to any format your binning code handles. Alternatively, make the date the same prior to binning.
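For the second suggestion, making the date the same before binning could be sketched like this. The vector tt and the anchor date 1970-01-01 are hypothetical; any fixed date works as long as it is the same for all values.

```r
# Hypothetical date-times spread over three days.
tt <- as.POSIXct(c("2013-12-08 23:50:00", "2013-12-09 00:10:00",
                   "2013-12-10 00:20:00"), tz = "UTC")

# Keep the time of day, but move every value onto the same (arbitrary) date.
same_day <- as.POSIXct(paste("1970-01-01", format(tt, "%H:%M:%S")), tz = "UTC")

# Hourly bins, counted from the hour of the earliest time of day.
hour_bin <- cut(same_day, breaks = "1 hour", labels = FALSE)
hour_bin
```

The two early-morning times now share bin 1 even though they come from different days, and the 23:50 value lands in bin 24.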
