Truncate timestamp to arbitrary intervals
Consider this demo to bring timestamps down to a resolution of 15 minutes and aggregate resulting dupes:
WITH tbl(id, ts) AS ( VALUES
(1::int, '2012-10-04 00:00:00'::timestamp)
,(2, '2012-10-04 18:23:01')
,(3, '2012-10-04 18:30:00')
,(4, '2012-10-04 18:52:33')
,(5, '2012-10-04 18:55:01')
,(6, '2012-10-04 18:59:59')
,(7, '2012-10-05 11:01:01')
)
SELECT to_timestamp((extract(epoch FROM ts)::bigint / 900)*900)::timestamp
AS lower_bound
, to_timestamp(avg(extract(epoch FROM ts)))::timestamp AS avg_ts
, count(*) AS ct
FROM tbl
GROUP BY 1
ORDER BY 1;
Result:
lower_bound | avg_ts | ct
---------------------+---------------------+----
2012-10-04 00:00:00 | 2012-10-04 00:00:00 | 1
2012-10-04 18:15:00 | 2012-10-04 18:23:01 | 1
2012-10-04 18:30:00 | 2012-10-04 18:30:00 | 1
2012-10-04 18:45:00 | 2012-10-04 18:55:51 | 3
2012-10-05 11:00:00 | 2012-10-05 11:01:01 | 1
The trick is to extract a Unix epoch like @Michael already posted. Integer division lumps them together in buckets of the chosen resolution, because fractional digits are truncated.
I divide by 900, because 15 minutes = 900 seconds.
Multiply by the same number to get the resulting lower_bound.
Convert the Unix epoch back to a timestamp with to_timestamp().
This works great for intervals that can be represented without fractional digits in the decimal system. For even more versatility, use the often overlooked function width_bucket(), like I demonstrate in this recent, closely related answer. More explanation, links and an sqlfiddle demo over there.
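The same divide-and-multiply arithmetic can be checked outside the database. What follows is only a rough Python sketch, not the answer's code: the name truncate_to_interval is made up for illustration, and it assumes naive timestamps with no time zone handling.

```python
from datetime import datetime, timedelta

# Naive reference point; time zones are deliberately ignored in this sketch.
EPOCH = datetime(1970, 1, 1)

def truncate_to_interval(ts: datetime, seconds: int = 900) -> datetime:
    """Floor ts to the lower bound of its bucket (hypothetical helper)."""
    step = timedelta(seconds=seconds)
    buckets = (ts - EPOCH) // step   # integer division drops the remainder
    return EPOCH + buckets * step    # multiply back to get the lower bound

print(truncate_to_interval(datetime(2012, 10, 4, 18, 23, 1)))  # 2012-10-04 18:15:00
```

With the default of 900 seconds this gives the same lower_bound values as the result table above for the sample rows.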
postgresql date_trunc to arbitrary precision?
There is no built-in function that does this, but as the PostgreSQL wiki suggests, you can define one yourself:
CREATE OR REPLACE FUNCTION round_time_10m(TIMESTAMP WITH TIME ZONE)
RETURNS TIMESTAMP WITH TIME ZONE AS $$
SELECT date_trunc('hour', $1) + INTERVAL '10 min' * ROUND(date_part('minute', $1) / 10.0)
$$ LANGUAGE SQL;
More generally, rounding to the nearest $2 minutes:
CREATE OR REPLACE FUNCTION round_time_nm(TIMESTAMP WITH TIME ZONE, INTEGER)
RETURNS TIMESTAMP WITH TIME ZONE AS $$
SELECT date_trunc('hour', $1) + ($2 || ' min')::INTERVAL * ROUND(date_part('minute', $1) / $2)
$$ LANGUAGE SQL;
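To see what the generalized function computes, here is a small Python sketch of the same steps: truncate to the hour, then add n minutes times the rounded quotient. It reuses the name round_time_nm purely for illustration and, like the SQL, looks only at the minute field and ignores seconds.

```python
from datetime import datetime, timedelta

def round_time_nm(ts: datetime, n: int) -> datetime:
    """Round ts to the nearest n minutes within its hour (illustrative only)."""
    hour_start = ts.replace(minute=0, second=0, microsecond=0)  # like date_trunc('hour', ts)
    # Python's round() sends exact halves to the even neighbour.
    steps = round(ts.minute / n)
    return hour_start + timedelta(minutes=n * steps)

print(round_time_nm(datetime(2022, 1, 21, 22, 57, 14), 10))  # 2022-01-21 23:00:00
```

Note that a result of 60 minutes simply rolls over into the next hour.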
Extract 30 minutes from timestamp and group it by 30 mins time interval -PGSQL
You can change the column on which you're aggregating to use the minute too:
select
count(*) as logged_users,
CONCAT(EXTRACT(hour from login_time::timestamp), '-', CASE WHEN EXTRACT(minute from login_time::timestamp) < 30 THEN 0 ELSE 30 END) as HalfHour
from loginhistory
where login_time::date = '2021-04-21'
group by HalfHour
order by HalfHour;
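As a quick sanity check of the labels this produces, here is a Python sketch of the same bucketing. The login times are invented sample data and half_hour_label is a hypothetical helper, not part of the query.

```python
from collections import Counter
from datetime import datetime

def half_hour_label(ts: datetime) -> str:
    """Build the 'hour-0' / 'hour-30' label, mirroring the CASE expression."""
    return f"{ts.hour}-{0 if ts.minute < 30 else 30}"

# Invented sample data standing in for loginhistory.login_time.
logins = [
    datetime(2021, 4, 21, 9, 5),
    datetime(2021, 4, 21, 9, 45),
    datetime(2021, 4, 21, 9, 59),
    datetime(2021, 4, 21, 10, 0),
]
logged_users = Counter(half_hour_label(ts) for ts in logins)
print(logged_users)
```

Because the label is text, sorting on it is lexicographic: '10-0' sorts before '9-0'. If chronological order matters, it may be safer to order by the underlying hour and minute values.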
calculating average with grouping based on time intervals
Simple and fast solution for this particular example:
SELECT date_trunc('minute', ts) AS minute
, sum(speed)/6 AS avg_speed
FROM speed_table AS t
WHERE ts >= '2014-06-21 0:0'
AND ts < '2014-06-22 0:0' -- exclude dangling corner case
AND condition2 = 'something'
GROUP BY 1
ORDER BY 1;
You need to factor in missing rows as "0 speed". Since a minute has 6 samples, just sum and divide by 6. Missing rows evaluate to 0 implicitly.
This returns no row for minutes with no rows at all. avg_speed for missing result rows is 0.
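The difference between dividing by 6 and taking a plain average is easy to see with numbers. A small Python sketch, with invented speed values and a hypothetical helper name:

```python
SAMPLES_PER_MINUTE = 6  # one sample every 10 seconds, as in the example

def avg_speed_with_gaps(speeds):
    """Average over a full minute, counting missing samples as 0."""
    return sum(speeds) / SAMPLES_PER_MINUTE

present = [60, 60, 60, 60]             # invented data: two of six samples missing
print(avg_speed_with_gaps(present))    # 40.0, gaps count as speed 0
print(sum(present) / len(present))     # 60.0, a plain average ignores the gaps
```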
General query for arbitrary intervals
Works for any interval listed in the manual for date_trunc():
SELECT date_trunc('minute', g.ts) AS ts_start
, avg(COALESCE(speed, 0)) AS avg_speed
FROM (SELECT generate_series('2014-06-21 0:0'::timestamp
, '2014-06-22 0:0'::timestamp
, '10 sec'::interval) AS ts) g
LEFT JOIN speed_table t USING (ts)
WHERE (t.condition2 = 'something' OR
t.condition2 IS NULL) -- depends on actual condition!
AND g.ts <> '2014-06-22 0:0'::timestamp -- exclude dangling corner case
GROUP BY 1
ORDER BY 1;
The problematic part is the additional unknown condition. You would need to define that. And decide whether missing rows supplied by generate_series should pass the test or not (which can be tricky!).
I let them pass in my example (along with all other rows with NULL values).
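The grid-plus-outer-join idea can be sketched in Python as well. This is only an approximation under stated assumptions: minute_averages is a made-up name, samples is a plain dict from timestamp to speed, and the extra condition2 filter is left out entirely.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def minute_averages(samples, start, end, step=timedelta(seconds=10)):
    """Average speed per minute over a full grid; missing grid points count as 0."""
    buckets = defaultdict(list)
    ts = start
    while ts < end:                                   # upper bound excluded
        speed = samples.get(ts, 0)                    # like COALESCE(speed, 0)
        minute = ts.replace(second=0, microsecond=0)  # like date_trunc('minute', ts)
        buckets[minute].append(speed)
        ts += step
    return {minute: sum(v) / len(v) for minute, v in buckets.items()}

# Invented sample data: only three readings in the first minute.
samples = {
    datetime(2014, 6, 21, 0, 0, 0): 60,
    datetime(2014, 6, 21, 0, 0, 10): 60,
    datetime(2014, 6, 21, 0, 0, 20): 60,
}
result = minute_averages(samples, datetime(2014, 6, 21, 0, 0), datetime(2014, 6, 21, 0, 2))
print(result)
```

Unlike the simple query, every minute in the range shows up here, including those without any readings.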
Compare:
PostgreSQL: running count of rows for a query 'by minute'
Arbitrary intervals:
Truncate timestamp to arbitrary intervals
For completely arbitrary intervals consider @Clodoaldo's math based on epoch values or use the often overlooked function width_bucket(). Example:
Aggregating (x,y) coordinate point clouds in PostgreSQL
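For readers unfamiliar with width_bucket(), this Python sketch mimics its four-argument numeric form as I understand it: count equal-width buckets between low and high, with 0 and count + 1 for values outside that range. It assumes low < high and is not a substitute for the manual.

```python
import math

def width_bucket(operand, low, high, count):
    """Rough Python stand-in for width_bucket(operand, low, high, count), low < high."""
    if operand < low:
        return 0              # below the range
    if operand >= high:
        return count + 1      # at or above the upper bound
    return math.floor((operand - low) * count / (high - low)) + 1

# 96 buckets of 15 minutes over one day, using seconds since midnight.
print(width_bucket(18 * 3600 + 23 * 60 + 1, 0, 86400, 96))  # 74
```

Applied to epoch values, the bucket number can then be mapped back to a lower bound in the same way as in the integer-division approach.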
custom DATE_TRUNC timeframes
A little painful, but you can do:
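select (date_trunc('day', "user".created_at) +
floor(extract(hour from "user".created_at) / 3) * interval '3 hour'
) as created_at_3h
from "user"; -- assumes the table is named user, a reserved word in PostgreSQL that must be double-quoted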
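The expression boils down to "start of day plus floor(hour / 3) * 3 hours". A minimal Python sketch of that arithmetic, with a hypothetical helper name:

```python
from datetime import datetime

def truncate_to_hours(ts: datetime, hours: int = 3) -> datetime:
    """Floor ts to the start of its block of `hours` hours within the day."""
    day_start = ts.replace(hour=0, minute=0, second=0, microsecond=0)  # like date_trunc('day', ts)
    return day_start.replace(hour=(ts.hour // hours) * hours)          # floor(hour / 3) * 3

print(truncate_to_hours(datetime(2021, 4, 21, 17, 45, 12)))  # 2021-04-21 15:00:00
```

This lines up cleanly only when the block size divides 24; otherwise the last block of each day is shorter.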
How to round timestamp to 10 minutes in Spark 3.0?
Convert the timestamp into seconds using the unix_timestamp function, then perform the rounding: divide by 600 (10 minutes), round the result of the division, and multiply by 600 again:
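import org.apache.spark.sql.functions._ // to_timestamp, unix_timestamp, round, from_unixtime
import spark.implicits._ // toDF and $"..." (assumes a SparkSession named spark)

val df = Seq(
("2022-01-21 22:11:11"),
("2022-01-21 22:04:04"),
("2022-01-21 22:19:34"),
("2022-01-21 22:57:14")
).toDF("my_col").withColumn("my_col", to_timestamp($"my_col"))
df.withColumn(
"my_col_rounded",
from_unixtime(round(unix_timestamp($"my_col") / 600) * 600)
).show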
//+-------------------+-------------------+
//|my_col |my_col_rounded |
//+-------------------+-------------------+
//|2022-01-21 22:11:11|2022-01-21 22:10:00|
//|2022-01-21 22:04:04|2022-01-21 22:00:00|
//|2022-01-21 22:19:34|2022-01-21 22:20:00|
//|2022-01-21 22:57:14|2022-01-21 23:00:00|
//+-------------------+-------------------+
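The divide, round, multiply step can be reproduced without Spark to check the numbers. This Python sketch is an illustration only: round_to_nearest is a made-up name, timestamps are naive (no session time zone), and exact halves are rounded up.

```python
from datetime import datetime, timedelta

# Naive reference point; time zones are deliberately ignored in this sketch.
EPOCH = datetime(1970, 1, 1)

def round_to_nearest(ts: datetime, seconds: int = 600) -> datetime:
    """Round ts to the nearest multiple of `seconds` since the epoch."""
    step = timedelta(seconds=seconds)
    buckets = (ts - EPOCH + step / 2) // step  # add half a step, then floor
    return EPOCH + buckets * step

print(round_to_nearest(datetime(2022, 1, 21, 22, 57, 14)))  # 2022-01-21 23:00:00
```

For the four sample rows this yields the same rounded values as the table above.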
You can also truncate the original timestamp to the hour, round the minutes to the nearest 10, and add them to the truncated timestamp as an interval:
df.withColumn(
"my_col_rounded",
date_trunc("hour", $"my_col") + format_string(
"interval %s minute",
expr("round(extract(MINUTE FROM my_col)/10.0)*10")
).cast("interval")
)
How to round to nearest X minutes with PL/pgSQL?
Instead of adding or subtracting
_minutes * interval '1 minute'
you should be subtracting
(_minutes % _nearest) * interval '1 minute'
or adding
(_nearest - (_minutes % _nearest)) * interval '1 minute'
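The modulo arithmetic is perhaps easiest to follow on bare minute counts. A short Python sketch with hypothetical helper names, mirroring the two expressions above:

```python
def round_minutes_down(minutes: int, nearest: int) -> int:
    """Subtract the remainder to reach the previous multiple of `nearest`."""
    return minutes - (minutes % nearest)

def round_minutes_up(minutes: int, nearest: int) -> int:
    """Add what is missing to reach the next multiple of `nearest`."""
    return minutes + (nearest - (minutes % nearest))

print(round_minutes_down(37, 15))  # 30
print(round_minutes_up(37, 15))    # 45
```

One thing to watch with the "adding" form as written: a value already on a boundary moves up by a full step (30 becomes 45 for a 15-minute step), so that case may need separate handling.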