Truncate Timestamp to Arbitrary Intervals

Truncate timestamp to arbitrary intervals

Consider this demo to bring timestamps down to a resolution of 15 minutes and aggregate resulting dupes:

WITH tbl(id, ts) AS ( VALUES
    (1::int, '2012-10-04 00:00:00'::timestamp)
   ,(2, '2012-10-04 18:23:01')
   ,(3, '2012-10-04 18:30:00')
   ,(4, '2012-10-04 18:52:33')
   ,(5, '2012-10-04 18:55:01')
   ,(6, '2012-10-04 18:59:59')
   ,(7, '2012-10-05 11:01:01')
   )
SELECT to_timestamp((extract(epoch FROM ts)::bigint / 900)*900)::timestamp
                                                            AS lower_bound
     , to_timestamp(avg(extract(epoch FROM ts)))::timestamp AS avg_ts
     , count(*) AS ct
FROM   tbl
GROUP  BY 1
ORDER  BY 1;

Result:

     lower_bound     |       avg_ts        | ct
---------------------+---------------------+----
 2012-10-04 00:00:00 | 2012-10-04 00:00:00 |  1
 2012-10-04 18:15:00 | 2012-10-04 18:23:01 |  1
 2012-10-04 18:30:00 | 2012-10-04 18:30:00 |  1
 2012-10-04 18:45:00 | 2012-10-04 18:55:51 |  3
 2012-10-05 11:00:00 | 2012-10-05 11:01:01 |  1

The trick is to extract a unix epoch like @Michael already posted. Integer division lumps them together in buckets of the chosen resolution, because fractional digits are truncated.

I divide by 900, because 15 minutes = 900 seconds.

Multiply by the same number to get the resulting lower_bound.
Convert the unix epoch back to a timestamp with to_timestamp().
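
The steps above can be wrapped in a function. This is only a minimal sketch: the name trunc_to_seconds and its parameters are made up for illustration. It uses floor() instead of the ::bigint cast, because the cast rounds fractional seconds rather than truncating them, and AT TIME ZONE 'UTC' instead of ::timestamp, so the result should not depend on the session's TimeZone setting.

```sql
-- Hypothetical helper: truncate a timestamp to a step given in seconds.
CREATE OR REPLACE FUNCTION trunc_to_seconds(_ts timestamp, _step integer)
  RETURNS timestamp
  LANGUAGE sql AS
$func$
SELECT to_timestamp((floor(extract(epoch FROM _ts) / _step) * _step)::float8)
       AT TIME ZONE 'UTC'   -- back to timestamp without time zone
$func$;

SELECT trunc_to_seconds(timestamp '2012-10-04 18:52:33', 900);
-- 2012-10-04 18:45:00
```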

This works great for intervals that can be represented without fractional digits in the decimal system. For even more versatility use the often overlooked function width_bucket() like I demonstrate in this recent, closely related answer. More explanation, links and an sqlfiddle demo over there.
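
To give an idea of what width_bucket() looks like here, a minimal sketch that splits a single day into 96 buckets of 15 minutes each. The sample rows, the day bounds and the column names are assumptions for illustration, not taken from the linked answer.

```sql
WITH tbl(id, ts) AS ( VALUES
    (1, timestamp '2012-10-04 00:00:00')
   ,(2, timestamp '2012-10-04 18:23:01')
   ,(3, timestamp '2012-10-04 18:52:33')
   )
SELECT id, ts, bucket
       -- bucket 1 starts at the lower bound of the range
     , timestamp '2012-10-04' + (bucket - 1) * interval '15 min' AS lower_bound
FROM  (
   SELECT id, ts
        , width_bucket(extract(epoch FROM ts)::float8
                     , extract(epoch FROM timestamp '2012-10-04')::float8
                     , extract(epoch FROM timestamp '2012-10-05')::float8
                     , 96) AS bucket   -- 96 buckets of 15 min in one day
   FROM   tbl
   ) sub
ORDER  BY id;
-- buckets 1, 74 and 76, i.e. lower bounds 00:00:00, 18:15:00 and 18:45:00
```

Values outside the given range end up in bucket 0 or bucket 97, so the bounds have to cover the data.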

postgresql date_trunc to arbitrary precision?

There is no built-in function for this, but as the PostgreSQL wiki notes, you can define one yourself:

CREATE OR REPLACE FUNCTION round_time_10m(TIMESTAMP WITH TIME ZONE) 
RETURNS TIMESTAMP WITH TIME ZONE AS $$ 
  SELECT date_trunc('hour', $1) + INTERVAL '10 min' * ROUND(date_part('minute', $1) / 10.0) 
$$ LANGUAGE SQL;

More generally, rounding to the nearest $2 minutes:

CREATE OR REPLACE FUNCTION round_time_nm(TIMESTAMP WITH TIME ZONE, INTEGER) 
RETURNS TIMESTAMP WITH TIME ZONE AS $$ 
  SELECT date_trunc('hour', $1) + ($2 || ' min')::INTERVAL * ROUND(date_part('minute', $1) / $2) 
$$ LANGUAGE SQL;
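
Note that both functions above only look at the minute field, so seconds are ignored: 18:22:59 rounded to 15 minutes gives 18:15, although 18:30 is nearer. If that matters, rounding on the epoch value is one way around it. A minimal sketch follows; the name round_time_epoch and its parameters are hypothetical, and steps are aligned to the Unix epoch (UTC).

```sql
-- Hypothetical helper: round to the nearest multiple of _step_sec seconds.
CREATE OR REPLACE FUNCTION round_time_epoch(_ts timestamptz, _step_sec integer)
  RETURNS timestamptz
  LANGUAGE sql AS
$func$
SELECT to_timestamp((round(extract(epoch FROM _ts) / _step_sec) * _step_sec)::float8)
$func$;

SELECT round_time_epoch(timestamptz '2012-10-04 18:22:59+00', 900);
-- 2012-10-04 18:30:00+00 (displayed like this when the session time zone is UTC)
```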

Extract 30 minutes from timestamp and group it by 30 mins time interval -PGSQL

You can change the column on which you're aggregating to use the minute too:

select
  count(*) as logged_users,
  CONCAT(EXTRACT(hour from login_time::timestamp), '-', CASE WHEN EXTRACT(minute from login_time::timestamp) < 30 THEN 0 ELSE 30 END) as HalfHour 
from loginhistory 
where login_time::date = '2021-04-21' 
group by HalfHour
order by HalfHour;
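
The text label sorts as a string, so '10-0' would typically come before '2-0'. If that is a problem, a variant that keeps the bucket as a timestamp might look like this. It is a sketch against the same loginhistory table; the alias half_hour is my own.

```sql
SELECT count(*) AS logged_users
       -- start of the half hour: full hour plus 0 or 30 minutes
     , date_trunc('hour', login_time::timestamp)
       + floor(extract(minute FROM login_time::timestamp) / 30)
       * interval '30 min' AS half_hour
FROM   loginhistory
WHERE  login_time::date = '2021-04-21'
GROUP  BY half_hour
ORDER  BY half_hour;
```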

calculating average with grouping based on time intervals

Simple and fast solution for this particular example:

SELECT date_trunc('minute', ts) AS minute
     , sum(speed)/6 AS avg_speed
FROM   speed_table AS t 
WHERE  ts >= '2014-06-21 0:0'
AND    ts <  '2014-06-22 0:0'  -- exclude dangling corner case
AND    condition2 = 'something'
GROUP  BY 1
ORDER  BY 1;

You need to factor in missing rows as "0 speed". Since a minute has 6 samples, just sum and divide by 6. Missing rows evaluate to 0 implicitly.

This returns no row for minutes with no rows at all. avg_speed for such a missing result row is 0.

General query for arbitrary intervals

Works for any interval listed in the manual for date_trunc():

SELECT date_trunc('minute', g.ts) AS ts_start
     , avg(COALESCE(speed, 0))    AS avg_speed
FROM  (SELECT generate_series('2014-06-21 0:0'::timestamp
                            , '2014-06-22 0:0'::timestamp
                            , '10 sec'::interval) AS ts) g
LEFT   JOIN speed_table t USING (ts)
WHERE (t.condition2 = 'something' OR
       t.condition2 IS NULL)                -- depends on actual condition!
AND    g.ts <> '2014-06-22 0:0'::timestamp  -- exclude dangling corner case
GROUP  BY 1
ORDER  BY 1;

The problematic part is the additional unknown condition. You would need to define that. And decide whether missing rows supplied by generate_series should pass the test or not (which can be tricky!).

I let them pass in my example (and all other rows with NULL values).
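
If rows failing the condition should instead be treated like missing rows (speed 0), one option is to move the condition into the join clause. This is a sketch under that assumption, and whether it fits depends on what the condition actually means. The series here simply ends at the last 10-second slot of the day instead of filtering out the upper bound.

```sql
SELECT date_trunc('minute', g.ts)  AS ts_start
     , avg(COALESCE(t.speed, 0))   AS avg_speed
FROM   generate_series(timestamp '2014-06-21 0:0'
                     , timestamp '2014-06-21 23:59:50'
                     , interval '10 sec') AS g(ts)
LEFT   JOIN speed_table t ON  t.ts = g.ts
                          AND t.condition2 = 'something'  -- non-matching rows count as 0
GROUP  BY 1
ORDER  BY 1;
```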

Compare:

PostgreSQL: running count of rows for a query 'by minute'

Arbitrary intervals:

Truncate timestamp to arbitrary intervals

For completely arbitrary intervals consider @Clodoaldo's math based on epoch values or use the often overlooked function width_bucket(). Example:

Aggregating (x,y) coordinate point clouds in PostgreSQL

custom DATE_TRUNC timeframes

A little painful, but you can do:

select (date_trunc('day', user.created_at) + 
        floor(extract(hour from user.created_at) / 3) * interval '3 hour'
       )
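
To see the expression at work, here is a self-contained sketch with made-up sample values standing in for the real table (the CTE u and the alias bucket_3h are assumptions):

```sql
WITH u(created_at) AS ( VALUES
    (timestamp '2012-10-04 18:23:01')
   ,(timestamp '2012-10-04 20:59:59')
   ,(timestamp '2012-10-05 11:01:01')
   )
SELECT created_at
     , date_trunc('day', created_at)
       + floor(extract(hour FROM created_at) / 3) * interval '3 hour' AS bucket_3h
FROM   u;
-- 18:23:01 and 20:59:59 land in the 18:00 bucket, 11:01:01 in the 09:00 bucket
```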

How to round timestamp to 10 minutes in Spark 3.0?

Convert the timestamp into seconds using unix_timestamp function, then perform the rounding by dividing by 600 (10 minutes), round the result of division and multiply by 600 again:

val df = Seq(
  ("2022-01-21 22:11:11"),
  ("2022-01-21 22:04:04"),
  ("2022-01-21 22:19:34"),
  ("2022-01-21 22:57:14")
).toDF("my_col").withColumn("my_col", to_timestamp($"my_col"))

df.withColumn(
  "my_col_rounded",
  from_unixtime(round(unix_timestamp($"my_col") / 600) * 600)
).show

//+-------------------+-------------------+
//|my_col             |my_col_rounded     |
//+-------------------+-------------------+
//|2022-01-21 22:11:11|2022-01-21 22:10:00|
//|2022-01-21 22:04:04|2022-01-21 22:00:00|
//|2022-01-21 22:19:34|2022-01-21 22:20:00|
//|2022-01-21 22:57:14|2022-01-21 23:00:00|
//+-------------------+-------------------+

You can also truncate the original timestamp to the hour, round the minutes to the nearest 10 and add them back to the truncated timestamp as an interval:

df.withColumn(
  "my_col_rounded",
  date_trunc("hour", $"my_col") + format_string(
    "interval %s minute",
    expr("round(extract(MINUTE FROM my_col)/10.0)*10")
  ).cast("interval")
)

How to round to nearest X minutes with PL/pgSQL?

Instead of adding or subtracting

_minutes * interval '1 minute'

you should be subtracting

(_minutes % _nearest) * interval '1 minute'

or adding

(_nearest - (_minutes % _nearest)) * interval '1 minute'
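
Put together, the two expressions could be used like this. It is a minimal sketch, not the function from the question: the name round_to_nearest_minutes and its variables are assumptions, seconds are dropped before rounding, and _nearest is assumed to divide 60 evenly.

```sql
CREATE OR REPLACE FUNCTION round_to_nearest_minutes(_ts timestamp, _nearest integer)
  RETURNS timestamp
  LANGUAGE plpgsql AS
$func$
DECLARE
   _base    timestamp := date_trunc('minute', _ts);      -- drop seconds
   _minutes integer   := extract(minute FROM _ts)::int;
   _rest    integer   := _minutes % _nearest;            -- minutes past the last step
BEGIN
   IF _rest * 2 < _nearest THEN
      RETURN _base - _rest * interval '1 minute';               -- round down
   ELSE
      RETURN _base + (_nearest - _rest) * interval '1 minute';  -- round up
   END IF;
END
$func$;

SELECT round_to_nearest_minutes(timestamp '2012-10-04 18:23:01', 15);  -- 2012-10-04 18:30:00
SELECT round_to_nearest_minutes(timestamp '2012-10-04 18:52:33', 15);  -- 2012-10-04 18:45:00
```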
