Binning time data in R
Just use cut, as it has a method for date/times (see ?cut and ?cut.POSIXt). E.g.:
x <- as.POSIXct("2016-01-01 00:00:00", tz="UTC") + as.difftime(30*(0:47),units="mins")
cut(x, breaks="2 hours", labels=FALSE)
# or to show more clearly the results:
data.frame(x, cuts = cut(x, breaks="2 hours", labels=FALSE))
# x cuts
#1 2016-01-01 00:00:00 1
#2 2016-01-01 00:30:00 1
#3 2016-01-01 01:00:00 1
#4 2016-01-01 01:30:00 1
#5 2016-01-01 02:00:00 2
#6 2016-01-01 02:30:00 2
#7 2016-01-01 03:00:00 2
#8 2016-01-01 03:30:00 2
#9 2016-01-01 04:00:00 3
#10 2016-01-01 04:30:00 3
# ...
If your data are just strings, then you need to do a conversion first. Times will end up assigned to the current day if you don't specify a particular day as well.
as.POSIXct("17:23:54", format="%H:%M:%S", tz="UTC")
#[1] "2016-07-13 17:23:54 UTC"
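To keep the result independent of the day the code is run, one option is to prepend a fixed date before converting. The sketch below assumes hypothetical time-of-day strings and an arbitrary anchor date of 2016-01-01; neither comes from the original question.

```r
# Hypothetical time-of-day strings (illustration only)
times <- c("00:10:00", "01:59:59", "02:00:00", "17:23:54")

# Prepend an arbitrary fixed date so the conversion does not use today's date
x <- as.POSIXct(paste("2016-01-01", times),
                format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Bin into 2-hour intervals; labels = FALSE returns integer bin codes
bins <- cut(x, breaks = "2 hours", labels = FALSE)
bins
```

Note that cut.POSIXt uses left-closed intervals, so 02:00:00 falls into the second bin rather than the first.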
Binning data by time in R
You can use floor_date to round Time down to the minute and take the sum of Data in each group.
library(dplyr)
library(lubridate)
df %>%
  mutate(Time = ymd_hms(Time)) %>%
  group_by(ID, Time = floor_date(Time, "1 min")) %>%
  summarise(Data = sum(Data))
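If dplyr and lubridate are not available, roughly the same per-minute sum can be sketched in base R with trunc and aggregate. This is a swapped-in alternative, not the answer's method, and the small data frame is hypothetical; it only assumes the ID, Time and Data columns used above.

```r
# Hypothetical data with the same column names as the answer above
df <- data.frame(
  ID   = c("a", "a", "a", "b"),
  Time = as.POSIXct(c("2020-01-01 10:00:05", "2020-01-01 10:00:40",
                      "2020-01-01 10:01:10", "2020-01-01 10:00:30"),
                    tz = "UTC"),
  Data = c(1, 2, 3, 4)
)

# trunc() returns POSIXlt; convert back to POSIXct for use as a grouping key
df$Time <- as.POSIXct(trunc(df$Time, units = "mins"))

# Sum Data within each ID / minute combination
res <- aggregate(df["Data"], by = list(ID = df$ID, Time = df$Time), FUN = sum)
res <- res[order(res$ID, res$Time), ]
res
```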
Create time bins and assign data to correct bin
I tried to solve this using data.table and lubridate, sticking to my idea of using floor_date.
# load packages
library(data.table)
library(lubridate)
# define a vector evenly spaced each 30 minutes:
b <- data.table(dates = seq(as.POSIXct("2018-03-25", tz = "UTC"),
                            as.POSIXct("2018-03-26", tz = "UTC"),
                            by = "30 min"))
# reproduce data
dt <- data.table(detect_date = as.character(c("25/03/2018 00:09", "25/03/2018 01:17", "25/03/2016 14:37", "25/03/2016 23:43")),
Station = c("SS01", "SS03", "SS04", "SS04"),
Individual = c("A", "B", "C", "B"))
# convert detect_date to date format
dt[, detect_date := dmy_hm(detect_date)]
# make a join
dt[, .(Location = Station, Individual), by = .(dates = floor_date(detect_date, "30 minutes"))][b, on = "dates"]
Binning time series in R?
While you could convert to a formal time representation, in this case it might be easier to just use substr:
test <- c("00:00:01","02:07:01","22:30:15")
as.numeric(substr(test,1,2))
#[1] 0 2 22
Using a POSIXct time to deal with it would also work, and might be handy if you plan on further calculations (differences in time etc.):
testtime <- as.POSIXct(test,format="%H:%M:%S")
testtime
#[1] "2013-12-09 00:00:01 EST" "2013-12-09 02:07:01 EST" "2013-12-09 22:30:15 EST"
as.numeric(format(testtime,"%H"))
#[1] 0 2 22
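Once the hour is extracted, counting observations per hourly bin is a short step. The following is a minimal sketch that reuses the test vector from above; fixing the factor levels to 0:23 is an assumption made here so that empty hours still appear in the tabulation.

```r
test <- c("00:00:01", "02:07:01", "22:30:15")
hours <- as.numeric(substr(test, 1, 2))

# Fixing the levels to 0:23 keeps hours with no observations in the table
counts <- table(factor(hours, levels = 0:23))
counts[c("0", "2", "22")]
```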
Given time column, how can I create time bins in R?
One way to do this is to use strptime to parse your time column into POSIXlt objects, and then use format on those objects to round down to the hour, like so:
library(dplyr)
df$hour <- format(strptime(df$time, "%H:%M"), "%H:00")
df %>% group_by(hour) %>% summarize(respond = sum(respond))
# # A tibble: 3 x 2
# hour respond
# <chr> <int>
# 1 08:00 0
# 2 09:00 2
# 3 15:00 1
How to bin times from different days into time bins
If you want to bin by time-of-day, regardless of date, then it might be easier to extract just the time-of-day and work with that.
dat = data.frame(time=t, q=q)
library(lubridate)
library(plyr)
# Extract time of day from each date-time
dat$hour = hour(dat$time) + minute(dat$time)/60 + second(dat$time)/3600
# Create bin labels
bins=c(paste0(rep(c(paste0(0,0:9),10:23), each=4),":", c("00",15,30,45))[-1],"24:00")
# Bin the data
dat$bins = cut(dat$hour, breaks=seq(0, 24, 0.25), labels=bins)
And here's the result of summarizing by time bin:
ddply(dat, .(bins), summarise, q_sum = sum(q), .drop=FALSE)
bins q_sum
1 00:15 0
2 00:30 0
3 00:45 0
4 01:00 0
5 01:15 100
6 01:30 0
...
10 02:30 0
11 02:45 100
12 03:00 0
...
27 06:45 0
28 07:00 100
29 07:15 0
30 07:30 0
31 07:45 0
32 08:00 0
33 08:15 100
34 08:30 0
...
52 13:00 0
53 13:15 100
54 13:30 0
55 13:45 0
...
72 18:00 0
73 18:15 0
74 18:30 200
75 18:45 0
...
82 20:30 0
83 20:45 0
84 21:00 100
85 21:15 0
86 21:30 0
...
95 23:45 0
96 24:00 0
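One edge case worth checking with this approach: cut uses right-closed intervals by default, so a time of exactly midnight (hour 0) falls outside the first bin and becomes NA. A minimal sketch with hypothetical decimal-hour values shows the effect and one possible remedy, include.lowest = TRUE:

```r
# Hypothetical decimal hours, including exactly midnight
hour <- c(0, 0.25, 1.1, 23.99)
breaks <- seq(0, 24, 0.25)

# Default: intervals are (a, b], so 0 is not in any bin
default_bins <- cut(hour, breaks = breaks, labels = FALSE)

# include.lowest = TRUE closes the first interval to [0, 0.25]
closed_bins <- cut(hour, breaks = breaks, labels = FALSE, include.lowest = TRUE)

data.frame(hour, default_bins, closed_bins)
```

Whether midnight should belong to the first or the last bin depends on the data, so treat this as one option rather than the required fix.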
How to create time bins in R and group data
This routine can be implemented with {dplyr} group_by, mutate and summarize. I split it up into two result objects, res1 and res2:
dat <- read.table(text="trial event time_start time_end time_duration region
1 A 36403 36504 101 none
1 B 36506 36516 10 none
1 A 36518 36700 182 top
1 B 36702 36708 6 none
1 A 36710 37054 344 top
1 B 37056 37088 32 none
1 A 37090 37640 550 right
1 B 37642 37678 36 none
1 A 37680 37812 132 left
2 A 41278 41318 40 top
2 B 41320 41336 16 none
2 A 41338 41490 152 top
2 B 41492 41498 6 none
2 A 41500 41994 494 top
2 B 41996 42032 36 none
2 A 42034 42492 458 left", header=TRUE)
library(dplyr, warn.conflicts = FALSE)
res1 <- dat %>%
  group_by(trial) %>%
  mutate(duration = time_end - time_start,
         total_duration = sum(duration),
         cml_duration = cumsum(duration),
         fractime = cml_duration / total_duration,
         bin = floor(fractime / 0.25 + 0.99))
# 0.99 < 1 : fudge factor for group 1:4 not 0:4 or 1:5
res2 <- res1 %>%
  group_by(trial, bin) %>%
  summarize(total_event_a = sum(event == "A"),
            total_event_a_right = sum(event == "A" & region == "right"))
#> `summarise()` regrouping output by 'trial' (override with `.groups` argument)
res2
#> # A tibble: 6 x 4
#> # Groups: trial [2]
#> trial bin total_event_a total_event_a_right
#> <int> <dbl> <int> <int>
#> 1 1 1 2 0
#> 2 1 2 1 0
#> 3 1 4 2 1
#> 4 2 1 2 0
#> 5 2 3 1 0
#> 6 2 4 1 0
Created on 2020-12-06 by the reprex package (v0.3.0)
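As a possible alternative to the 0.99 fudge factor, ceiling maps the fractional time onto bins 1 to 4 directly. The base R sketch below uses the trial 1 durations (time_end - time_start) from the data above. It is a swapped-in variant, not the answer's code: the two approaches can disagree when fractime / 0.25 lies less than 0.01 above a whole number, and ceiling would return 0 for a fractime of exactly 0.

```r
# Trial 1 durations computed as time_end - time_start from the data above
duration <- c(101, 10, 182, 6, 344, 32, 550, 36, 132)

# Cumulative fraction of the trial elapsed at the end of each event
fractime <- cumsum(duration) / sum(duration)

# Quarter-of-trial bin, 1 to 4
bin <- ceiling(fractime / 0.25)
bin
```

For this trial the bins come out as 1, 2 and 4 only, matching the bin values shown for trial 1 in res2.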
R - Split time series into time-only bins
Drop the date and deal only with the time component?
format(tt, "%H:%M:%S") extracts the time component into a string, but it can be modified to further convert to any format your binning code handles. Alternatively, make the date the same prior to binning.
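One way to turn that string into something binnable is to convert it to seconds since midnight and use integer division. This is a minimal sketch with hypothetical timestamps on three different days; the 30-minute bin width and the names tod, secs and bin are assumptions for illustration.

```r
# Hypothetical date-times on different days
tt <- as.POSIXct(c("2020-01-01 08:05:00", "2020-01-02 08:20:00",
                   "2020-01-03 15:45:00"), tz = "UTC")

# Drop the date, keeping only the time of day as a string
tod <- format(tt, "%H:%M:%S")

# Convert "HH:MM:SS" to seconds since midnight
parts <- do.call(rbind, strsplit(tod, ":", fixed = TRUE))
secs <- as.numeric(parts[, 1]) * 3600 +
  as.numeric(parts[, 2]) * 60 +
  as.numeric(parts[, 3])

# 0-based index of the 30-minute bin (1800 seconds per bin)
bin <- secs %/% 1800
data.frame(tod, bin)
```

The first two timestamps land in the same bin even though they fall on different days, which is the point of dropping the date.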