Referencing current row in FILTER clause of window function
You are not actually aggregating rows, so the new aggregate FILTER clause is not the right tool. A window function is more like it, but a problem remains: the frame definition of a window cannot depend on values of the current row. With the ROWS clause, it can only count a given number of rows preceding or following.
To make that work, aggregate counts per day and LEFT JOIN to a full set of days in range. Then you can apply a window function:
SELECT t.*, ct.ct_last4days
FROM  (
   SELECT *, sum(ct) OVER (ORDER BY dt ROWS 3 PRECEDING) AS ct_last4days
   FROM  (
      SELECT generate_series(min(dt), max(dt), interval '1 day')::date AS dt
      FROM   tbl t1
      ) d
   LEFT   JOIN (SELECT dt, count(*) AS ct FROM tbl GROUP BY 1) t USING (dt)
   ) ct
JOIN   tbl t USING (dt);
Omitting ORDER BY dt in the window frame definition usually works, since the order is carried over from generate_series() in the subquery. But there are no guarantees in the SQL standard without explicit ORDER BY, and it might break in more complex queries.
SQL Fiddle.
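As an aside for newer versions: PostgreSQL 11 added RANGE frames with an offset, which are measured in values of the ORDER BY column rather than in a count of rows. Under that assumption, the count over the last four days might be written without generating a series of days. This is only a sketch against the same tbl with a date column dt, not something tested against the data from the question:

```sql
-- Assumes PostgreSQL 11 or later and a column dt of type date.
-- The frame covers every row whose dt lies within the 3 days before
-- the current row's dt, plus all rows sharing the current row's dt.
SELECT t.*
     , count(*) OVER (ORDER BY dt
                      RANGE BETWEEN interval '3 days' PRECEDING
                                AND CURRENT ROW) AS ct_last4days
FROM   tbl t;
```

On older versions, the LEFT JOIN to a generated series of days shown above remains the way to go.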
Related:
- Select finishes where athlete didn't finish first for the past 3 events
- PostgreSQL: running count of rows for a query 'by minute'
- PostgreSQL unnest() with element number
Window functions filter through current row
Can I use a frame and a filter?
You can. But either has restrictions:
- The expression in the FILTER clause only sees the respective row where it fetches values. There is no way to reference the row for which your window function computes values. So I don't see a way to formulate a filter depending on that row unless we make a huge, expensive cross join - the same row is used for many different computations. Or we are back to LATERAL subqueries that can reference the parent row.
- The frame definition, on the other hand, does not allow variables at all. It demands a fixed number, as discussed in the related answer you referenced:
  - Referencing current row in FILTER clause of window function
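For comparison, the LATERAL route mentioned above could look roughly like the following. It is a sketch only, assuming the table security_data with the columns used in the query further down (record_id, security_id, date, price, earnings). The subquery can reference the parent row s directly, at the cost of one subquery execution per row:

```sql
-- The aggregate subquery always returns exactly one row,
-- so CROSS JOIN LATERAL never drops rows from security_data.
SELECT s.record_id, s.security_id, s.date, s.price
     , e.peak_earnings, e.minimum_earnings
FROM   security_data s
CROSS  JOIN LATERAL (
   SELECT max(p.earnings) AS peak_earnings
        , min(p.earnings) AS minimum_earnings
   FROM   security_data p
   WHERE  p.security_id = s.security_id
   AND    p.date >= s.date - 365   -- references the parent row
   AND    p.date <  s.date         -- excludes the current day
   ) e;
```

Here the time frame is expressed in calendar days rather than in a number of rows, so no generated series of days is needed.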
These restrictions make your particular query hard to implement. This should be correct now:
SELECT *
FROM  (
   SELECT record_id, security_id, date, price
        , CASE WHEN do_calc THEN max(earnings) OVER w1 END AS peak_earnings
        , CASE WHEN do_calc THEN min(earnings) OVER w1 END AS minimum_earnings
        , CASE WHEN do_calc THEN price / NULLIF(max(earnings) OVER w1, 0) END AS price_to_peak_earnings
        , CASE WHEN do_calc THEN price / NULLIF(min(earnings) OVER w1, 0) END AS price_to_minimum_earnings
   FROM  (
      SELECT *, (date - 365) >= min_date AND s.record_id IS NOT NULL AS do_calc
      FROM  (
         SELECT security_id, min_date
              , generate_series(min_date, max_date, interval '1 day')::date AS date
         FROM  (
            SELECT security_id, min(date) AS min_date, max(date) AS max_date
            FROM   security_data
            GROUP  BY 1
            ) minmax
         ) d
      LEFT   JOIN security_data s USING (security_id, date)
      ) sub1
   WINDOW w1 AS (PARTITION BY security_id ORDER BY date ROWS BETWEEN 365 PRECEDING AND 1 PRECEDING)
   ) sub2
WHERE  record_id IS NOT NULL
ORDER  BY 1, 2;
SQL Fiddle.
Notes
- Nothing in the question says that every security_id would have rows for the same days. Calculating min / max date per security_id in subquery minmax gives us the minimum time frame.
- The time frame for calculations is exactly 365 days preceding the current date of the row and not including the current row (ROWS BETWEEN 365 PRECEDING AND 1 PRECEDING). It's typically more useful to exclude the current row from aggregations to be compared with the current row.
- I adapted the condition for calculations to the same time frame to avoid corner case oddities: (date - 365) >= min_date
- In the fiddle, where you added 1 row for every 1st of Jan, you can see the effect of leap years contrasting with a fixed number of 365 days. The window frame is empty after leap years (2001, 2005, ...).
- I am using all subqueries, which is typically a bit faster than CTEs.
- To be sure, we need to include ORDER BY in the frame definition. I updated my old answer you linked to accordingly:
  - Referencing current row in FILTER clause of window function
- I use w1 as window name, for the "1 year" period. You might add w2, etc. and can have any number of days for each. You could adapt to leap years after all if you should need to. Might even generate the whole query depending on the current date ...
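To illustrate the last note, a second named window can sit next to w1 in the same WINDOW clause. The sketch below is self-contained and uses a made-up CTE daily with three sample rows in place of the gap-filled subquery sub1. The name daily, the sample values, the output column names and the 730-day frame for w2 are all assumptions for illustration:

```sql
WITH daily(security_id, date, earnings) AS (   -- hypothetical sample data
   VALUES (1, date '2020-01-01', 10)
        , (1, date '2020-01-02', 12)
        , (1, date '2020-01-03',  9)
   )
SELECT security_id, date
     , max(earnings) OVER w1 AS peak_earnings_1y   -- up to 365 preceding rows
     , max(earnings) OVER w2 AS peak_earnings_2y   -- up to 730 preceding rows
FROM   daily
WINDOW w1 AS (PARTITION BY security_id ORDER BY date
              ROWS BETWEEN 365 PRECEDING AND 1 PRECEDING)
     , w2 AS (PARTITION BY security_id ORDER BY date
              ROWS BETWEEN 730 PRECEDING AND 1 PRECEDING);
```

As in the main query, counting rows only equals counting days when there is exactly one row per day, which is what the generated series of days provides.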
Filter clause in aggregate window not discarding rows as expected
The OVER clause takes precedence over the FILTER clause. So you first take last_2 (i.e. the current row and the one before it), and only then filter those rows, which leaves just one row (the even one).
What you are looking for instead is this:
sum(case when num % 2 = 0 then num else 0 end) over last_2
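A small self-contained sketch can make the order of evaluation visible. The sample values and the definition of last_2 (current row plus one preceding row, ordered by num) are assumptions, since the original query is not shown here:

```sql
SELECT num
     , sum(num) FILTER (WHERE num % 2 = 0) OVER last_2 AS filter_sum
     , sum(CASE WHEN num % 2 = 0 THEN num ELSE 0 END) OVER last_2 AS case_sum
FROM  (VALUES (1), (2), (3), (4)) t(num)   -- hypothetical sample data
WINDOW last_2 AS (ORDER BY num ROWS 1 PRECEDING);
```

In both columns the frame is built first and the condition is applied to the rows inside it. For num = 1 the frame holds only the odd value 1, so filter_sum is NULL (no row passes the filter) while case_sum is 0. For the remaining rows both columns agree.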
Window running function except current row
Yes, you can. This does the trick:
with
    t(i,x,y) as (
        values
            (1,1,1),(2,1,3),(3,1,2),
            (4,2,4),(5,2,2),(6,2,8)
        )
select
    t.*,
    sum(y) over w as sum,
    max(y) over w as max,
    count(*) filter (where y > 2) over w as cnt
from t
window w as (partition by x order by i
             rows between unbounded preceding and 1 preceding);
The frame_clause selects just those rows from the partition that you are interested in.
Note that in the sum column you'll get null rather than 0 because of the frame clause: the first row in each partition has no row before it, so its frame is empty. You can coalesce() this away if needed.
SQLFiddle
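For completeness, the coalesce() variant might look like this, reusing the sample data from above. The alias sum_prev is just an illustrative name:

```sql
with
    t(i,x,y) as (
        values
            (1,1,1),(2,1,3),(3,1,2),
            (4,2,4),(5,2,2),(6,2,8)
        )
select
    t.*,
    -- the frame is empty for the first row of each partition,
    -- so sum() returns null there; coalesce() turns that into 0
    coalesce(sum(y) over w, 0) as sum_prev
from t
window w as (partition by x order by i
             rows between unbounded preceding and 1 preceding);
```

With this data, sum_prev comes out as 0, 1, 4 for x = 1 and 0, 4, 6 for x = 2.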