Moving Average Based on Timestamps in PostgreSQL

Assuming you want to restart the rolling average after each 15-minute interval:

select id,
       temp,
       avg(temp) over (partition by group_nr order by time_read) as rolling_avg
from (
    select id,
           temp,
           time_read,
           interval_group,
           -- consecutive rows that share a 15-minute bucket get the same group_nr
           id - row_number() over (partition by interval_group order by time_read) as group_nr
    from (
        select id,
               time_read,
               -- truncate each timestamp to its 900-second (15-minute) bucket
               'epoch'::timestamp + '900 seconds'::interval * (extract(epoch from time_read)::int4 / 900) as interval_group,
               temp
        from readings
    ) t1
) t2
order by time_read;

It is based on Depesz's solution for grouping by "time ranges".

Here is an SQLFiddle example: http://sqlfiddle.com/#!1/0f3f0/2
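
For local experimentation, here is a minimal schema matching the query (column names are taken from the query above; the exact types are an assumption):

CREATE TABLE readings (
    id        serial PRIMARY KEY,
    time_read timestamp NOT NULL,
    temp      numeric NOT NULL
);

INSERT INTO readings (time_read, temp) VALUES
    ('2023-01-01 10:01', 20.1),
    ('2023-01-01 10:07', 20.4),
    ('2023-01-01 10:16', 21.0);  -- falls into the next 15-minute bucket, so the average restarts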

Is there an easy way to calculate a 12-month moving average in PostgreSQL?

If anybody is looking for a way to calculate moving averages, medians, percentiles, and other summary stats over 1, 2, 3, ... 6, ... 12 years/quarters/months/weeks/days/hours in a single go, here is one approach:

WITH grid AS (
    SELECT end_time, start_time
    FROM (
        SELECT end_time,
               -- window start: the end_time from 12 rows earlier;
               -- 'infinity' marks windows without 12 full months of history
               lag(end_time, 12, 'infinity') OVER (ORDER BY end_time) AS start_time
        FROM (
            SELECT generate_series(date_trunc('month', min(time2)),
                                   date_trunc('month', max(time2)) + interval '1 month',
                                   interval '1 month') AS end_time
            FROM my_table
        ) sub
    ) sub2
    WHERE end_time > start_time  -- drops the incomplete leading windows
)
SELECT to_char(date_trunc('month', a.end_time - interval '1 month'), 'YYYY-MM') AS d,
       count(e.time2),
       percentile_cont(0.25) WITHIN GROUP (ORDER BY e.price) AS q1,
       percentile_cont(0.5)  WITHIN GROUP (ORDER BY e.price) AS median,
       percentile_cont(0.75) WITHIN GROUP (ORDER BY e.price) AS q3,
       avg(e.price) AS aver,
       min(e.price) AS mi,
       max(e.price) AS mx
FROM grid a
LEFT JOIN my_table e ON e.time2 >= a.start_time
                    AND e.time2 <  a.end_time
GROUP BY a.end_time
ORDER BY d DESC;

Note that the table contains individual time-stamped records (like sales transactions), as in the example presented in the actual question.
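
For reference, here is a minimal schema the query above could run against (column names are taken from the query; the exact types are an assumption):

CREATE TABLE my_table (
    time2 timestamp NOT NULL,  -- time of the individual record
    price numeric NOT NULL     -- value being summarized
);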

And this bit:

to_char(date_trunc('month',a.end_time - interval '1 month'), 'YYYY-MM') as d

is only for display purposes. The convention in PostgreSQL is that the "end of the month" is actually hour zero of the next month (i.e. the end of October 2019 is 2019-11-01 at 00:00:00). The same applies to any time range (e.g. the end of 2019 is actually 2020-01-01 at 00:00:00). So if "- interval '1 month'" were not included, the 12-month moving stats ending October 2019 would be labelled as "for" 1 November 2019 at 00:00:00 (truncated to 2019-11).
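
A quick check of the label shift, using the October 2019 example from above:

SELECT to_char(date_trunc('month', timestamp '2019-11-01' - interval '1 month'), 'YYYY-MM');
-- returns 2019-10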

PostgreSQL Grouped Rolling Averages

The linked answer contains almost everything you need. If you want to "group" further (e.g. by itemid), you just need to add those "groups" to the PARTITION BY clauses of the window functions:

select *,
       avg(isup::int) over (partition by itemid, group_nr order by logged) as rolling_avg
from (
    select *,
           id - row_number() over (partition by itemid, interval_group order by logged) as group_nr
    from (
        select *,
               'epoch'::timestamp + '3600 seconds'::interval * (extract(epoch from logged)::int4 / 3600) as interval_group
        from dummy
    ) t1
) t2
order by itemid, logged;

Note however that this (and the linked answer) works only because id has no gaps and is ordered consistently with the timestamp column of its table. If that's not the case, you'll need

row_number() over (partition by itemid order by logged) - row_number() over (partition by itemid, interval_group order by logged) as group_nr

instead of id - row_number() ....
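
Spelled out in full, the gap-tolerant variant would look like this (a sketch against the same dummy table as above):

select *,
       avg(isup::int) over (partition by itemid, group_nr order by logged) as rolling_avg
from (
    select *,
           row_number() over (partition by itemid order by logged)
         - row_number() over (partition by itemid, interval_group order by logged) as group_nr
    from (
        select *,
               'epoch'::timestamp + '3600 seconds'::interval * (extract(epoch from logged)::int4 / 3600) as interval_group
        from dummy
    ) t1
) t2
order by itemid, logged;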

http://rextester.com/YBSC43615

Also, if you're going to use only hourly groups, you can use:

date_trunc('hour', logged) as interval_group

instead of the more general arithmetic (as @LaurenzAlbe already noticed).
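
For example, a quick check of what date_trunc produces:

SELECT date_trunc('hour', timestamp '2021-01-01 10:34:56');
-- returns 2021-01-01 10:00:00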

Unable to Calculate 7 Day Moving Average due to inconsistent dates

You can get a bit closer to a true 7-day moving average by using RANGE instead of ROWS in your frame specification.

Read more about window function frames in the PostgreSQL documentation.

I believe this should work for you:

select date,
       sales,
       avg(sales) over (order by date range between '6 days' preceding and current row)
from sales_info
order by date;

Here's a demonstration with made-up data:

SELECT i,
       t,
       avg(i) OVER (ORDER BY t RANGE BETWEEN '6 days' PRECEDING AND CURRENT ROW)
FROM (
    SELECT i, t
    FROM generate_series('2021-01-01'::timestamp, '2021-02-01'::timestamp, '1 day') WITH ORDINALITY AS g(t, i)
) sub;
 i | t                   | avg
---+---------------------+------------------------
 1 | 2021-01-01 00:00:00 | 1.00000000000000000000
 2 | 2021-01-02 00:00:00 | 1.5000000000000000
 3 | 2021-01-03 00:00:00 | 2.0000000000000000
 4 | 2021-01-04 00:00:00 | 2.5000000000000000
 5 | 2021-01-05 00:00:00 | 3.0000000000000000
 6 | 2021-01-06 00:00:00 | 3.5000000000000000
 7 | 2021-01-07 00:00:00 | 4.0000000000000000
 8 | 2021-01-08 00:00:00 | 5.0000000000000000
 9 | 2021-01-09 00:00:00 | 6.0000000000000000
10 | 2021-01-10 00:00:00 | 7.0000000000000000
11 | 2021-01-11 00:00:00 | 8.0000000000000000
12 | 2021-01-12 00:00:00 | 9.0000000000000000
13 | 2021-01-13 00:00:00 | 10.0000000000000000
14 | 2021-01-14 00:00:00 | 11.0000000000000000
15 | 2021-01-15 00:00:00 | 12.0000000000000000
16 | 2021-01-16 00:00:00 | 13.0000000000000000
17 | 2021-01-17 00:00:00 | 14.0000000000000000
18 | 2021-01-18 00:00:00 | 15.0000000000000000
19 | 2021-01-19 00:00:00 | 16.0000000000000000
20 | 2021-01-20 00:00:00 | 17.0000000000000000
21 | 2021-01-21 00:00:00 | 18.0000000000000000
22 | 2021-01-22 00:00:00 | 19.0000000000000000
23 | 2021-01-23 00:00:00 | 20.0000000000000000
24 | 2021-01-24 00:00:00 | 21.0000000000000000
25 | 2021-01-25 00:00:00 | 22.0000000000000000
26 | 2021-01-26 00:00:00 | 23.0000000000000000
27 | 2021-01-27 00:00:00 | 24.0000000000000000
28 | 2021-01-28 00:00:00 | 25.0000000000000000
29 | 2021-01-29 00:00:00 | 26.0000000000000000
30 | 2021-01-30 00:00:00 | 27.0000000000000000
31 | 2021-01-31 00:00:00 | 28.0000000000000000
32 | 2021-02-01 00:00:00 | 29.0000000000000000
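
For contrast, the ROWS version of the same query averages the last seven physical rows regardless of gaps in the dates, which is exactly why it drifts when days are missing:

select date,
       sales,
       avg(sales) over (order by date rows between 6 preceding and current row)
from sales_info
order by date;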

How to get an average of timestamps in PostgreSQL?

select to_timestamp(avg(timestamps)) as "timestamps",
       avg(tank_level) as "TankLevel"
from (
    select row_number() over (order by id) as rn,
           tank_level,
           extract(epoch from timestamps) as "timestamps"  -- epoch seconds, so they can be averaged
    from data_tanksensor
    where sensors_on_site_id = 91
) s
-- split the rows into ten roughly equal buckets by row number
group by (rn + ((select count(*)/10 from data_tanksensor where sensors_on_site_id = 91) - 1))
         / (select count(*)/10 from data_tanksensor where sensors_on_site_id = 91)
order by timestamps asc;

Figured it out. Thank you all for your examples.
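
As an aside, ntile() can express the same ten-bucket split without repeating the count(*) subquery; a sketch, assuming the same data_tanksensor table:

select to_timestamp(avg(ts_epoch)) as "timestamps",
       avg(tank_level) as "TankLevel"
from (
    select tank_level,
           extract(epoch from timestamps) as ts_epoch,
           ntile(10) over (order by id) as bucket  -- ten roughly equal buckets
    from data_tanksensor
    where sensors_on_site_id = 91
) s
group by bucket
order by 1;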

Rolling average postgres

I assume that it will not be drastically slow to re-calculate the latest 200 entries each time, given proper indexing. If you create an index like:

CREATE INDEX i_sensor_values ON sensor_values(sensor_id, ts DESC);

you'll be able to get results fairly quickly doing:

SELECT sum("value") -- add more expressions as required
FROM sensor_values
WHERE sensor_id=$1
ORDER BY ts DESC
LIMIT 200;

You can execute this query in a loop from a PL/pgSQL function.
If you migrate to 9.3 (or higher) any time soon, you'll also be able to use LATERAL joins for this purpose.
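
A sketch of the LATERAL version (assuming a hypothetical sensors table that lists the sensor ids to aggregate over):

SELECT s.sensor_id, v.sum_value
FROM sensors s  -- hypothetical table of distinct sensor ids
CROSS JOIN LATERAL (
    SELECT sum("value") AS sum_value
    FROM (
        SELECT "value"
        FROM sensor_values
        WHERE sensor_id = s.sensor_id
        ORDER BY ts DESC
        LIMIT 200
    ) last_200
) v;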

I do not think a covering index will help much here, as the table changes constantly and an index-only scan will not kick in.

It is also worth looking into loose index scans.

P.S. The column name value should be double-quoted, as it is an SQL reserved word.

How to average hourly values over multiple days with SQL

I resolved the problem myself. Solution:

SELECT EXTRACT(HOUR FROM "table"."Timestamp") AS hour,
       avg("table"."Value") AS average
FROM "table"  -- "table" is a reserved word, so it must be quoted
WHERE "table"."Timestamp" BETWEEN '2021-02-10' AND '2021-02-20'
GROUP BY hour
ORDER BY hour;
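
Note that EXTRACT(HOUR FROM ...) returns only the hour of day, so this averages each hour across all days in the range. A quick check of the expression:

SELECT EXTRACT(HOUR FROM timestamp '2021-02-10 14:35:00');
-- returns 14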

