PostgreSQL reusing computation result in select query
This could be an alternative you might use:
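SELECT foo.c
FROM (
   SELECT (a + b) AS c, t   -- expose t as well, the outer WHERE needs it
   FROM tbl                 -- "table" is a reserved word, so use another name
) AS foo
WHERE foo.c < 5
AND (foo.c * foo.c + foo.t) > 100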
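From a performance point of view, this may look suboptimal, because the foo subquery has no WHERE clause of its own and so seems to return every row of the table. In practice, the PostgreSQL planner normally flattens a simple subquery like this into the outer query and applies the conditions while scanning the table. Check the plan with EXPLAIN to confirm.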
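To see what the planner does with such a subquery, you can look at the plan. The following is a minimal sketch; the temporary table tbl and its integer columns a, b and t are assumptions standing in for the table from the question.

```sql
-- Hypothetical stand-in for the table in the question
CREATE TEMP TABLE tbl (a int, b int, t int);

INSERT INTO tbl
SELECT g % 10, g % 7, g
FROM   generate_series(1, 1000) AS g;

-- Show the plan without executing the query
EXPLAIN
SELECT foo.c
FROM  (
   SELECT (a + b) AS c, t
   FROM   tbl
   ) AS foo
WHERE  foo.c < 5
AND   (foo.c * foo.c + foo.t) > 100;
```

If the subquery was flattened, the plan contains a single scan on tbl with a Filter line and no separate Subquery Scan node. Note that flattening also means a + b is written out again in the filter, so the alias saves typing rather than computation here.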
PostgreSQL reusing value from long calculation in CASE statement
First of all, I would guess that the query optimizer is smart enough to spot identical deterministic expressions and not calculate them twice, though that is not guaranteed.
If you do not want to rely on that, you could use LATERAL:
SELECT *,
CASE column1
WHEN sub.long_calc THEN 10
ELSE sub.long_calc + 2 * 3.14
END AS mycalc
FROM tab t
,LATERAL (VALUES(t.a+t.b+t.c)) AS sub(long_calc);
Output:
╔═════╦══════════╦════╦════╦════╦════════════╦════════╗
║ id ║ column1 ║ a ║ b ║ c ║ long_calc ║ mycalc ║
╠═════╬══════════╬════╬════╬════╬════════════╬════════╣
║ 1 ║ 6 ║ 1 ║ 2 ║ 3 ║ 6 ║ 10 ║
║ 2 ║ 20 ║ 2 ║ 3 ║ 4 ║ 9 ║ 15.28 ║
╚═════╩══════════╩════╩════╩════╩════════════╩════════╝
You could replace VALUES with a simple SELECT or a function call:
-- any query
,LATERAL (SELECT t.a+t.b+t.c) AS sub(long_calc)
-- function
,LATERAL random() AS sub(long_calc)
-- function with parameter passing
,LATERAL sin(t.a) AS sub(long_calc)
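The alias produced by a LATERAL item is visible in the rest of the query, so it can also be reused in WHERE and ORDER BY. A small sketch, assuming a table tab shaped like the demo output above (the column types are a guess):

```sql
-- Hypothetical setup matching the demo data
CREATE TEMP TABLE tab (id int, column1 int, a int, b int, c int);

INSERT INTO tab VALUES
   (1,  6, 1, 2, 3)
 , (2, 20, 2, 3, 4);

SELECT t.id, sub.long_calc
FROM   tab t
CROSS  JOIN LATERAL (SELECT t.a + t.b + t.c) AS sub(long_calc)
WHERE  sub.long_calc > 5        -- alias reused in the filter
ORDER  BY sub.long_calc DESC;   -- and in the sort
```

CROSS JOIN LATERAL is the explicit spelling of the comma syntax used above; both forms behave the same.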
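EDIT: a function in the FROM list that does not reference t is evaluated only once for the whole query:
SELECT id
,sub2.long_calc_rand -- calculated once for the whole query
,random() AS rand -- calculated for every row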
FROM tab t
,LATERAL random() AS sub2(long_calc_rand);
Output:
╔═════╦═════════════════════╦════════════════════╗
║ id ║ long_calc_rand ║ rand ║
╠═════╬═════════════════════╬════════════════════╣
║ 1 ║ 0.3426254219375551 ║ 0.8861959744244814 ║
║ 2 ║ 0.3426254219375551 ║ 0.8792812027968466 ║
║ 3 ║ 0.3426254219375551 ║ 0.8123061805963516 ║
╚═════╩═════════════════════╩════════════════════╝
Reuse computed select value
Test timing
You don't see the evaluation of individual functions per row in the EXPLAIN output.
Test with EXPLAIN ANALYZE to get actual query times to compare overall effectiveness. Run a couple of times to rule out caching artifacts. For simple queries like this, you get more reliable numbers for the total runtime with:
EXPLAIN (ANALYZE, TIMING OFF) SELECT ...
Requires Postgres 9.2+. Per documentation:
TIMING
Include actual startup time and time spent in each node in the output. The overhead of repeatedly reading the system clock can slow down the query significantly on some systems, so it may be useful to set this parameter to FALSE when only actual row counts, and not exact times, are needed. Run time of the entire statement is always measured, even when node-level timing is turned off with this option. This parameter may only be used when ANALYZE is also enabled. It defaults to TRUE.
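As an illustration of such a comparison, here is a self-contained sketch that needs no table. The use of generate_series() and sqrt() is an assumption chosen only to have something to measure; substitute your own expression.

```sql
-- Variant 1: the expression is written out twice
EXPLAIN (ANALYZE, TIMING OFF)
SELECT sqrt(g) AS s, sqrt(g) * 2 AS s2
FROM   generate_series(1, 1000000) AS g;

-- Variant 2: the expression is computed in a subquery and reused by alias
EXPLAIN (ANALYZE, TIMING OFF)
SELECT s, s * 2 AS s2
FROM  (
   SELECT sqrt(g) AS s
   FROM   generate_series(1, 1000000) AS g
   ) AS sub;
```

Compare the total execution time reported at the bottom of each plan over several runs. The planner may flatten the second variant into the first, in which case the two should perform about the same.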
Prevent repeated evaluation
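Normally, expressions in a subquery are evaluated once. But Postgres can collapse trivial subqueries if it thinks that will be faster.
To introduce an optimization barrier, you could use a CTE instead of the subquery. Before Postgres 12, a CTE is always materialized, which guarantees that Postgres computes ST_SnapToGrid(geom, 50) only once per row (in Postgres 12 or later this holds only when the CTE is declared AS MATERIALIZED, because a plain CTE that is referenced once can be inlined like a subquery):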
WITH cte AS (
SELECT ST_SnapToGrid(geom, 50) AS geom1
FROM points
)
SELECT COUNT(*) AS n
, ST_X(geom1) AS x
, ST_Y(geom1) AS y
FROM cte
GROUP BY geom1; -- see below
However, this is probably slower than a subquery due to more overhead for a CTE. The function call is probably very cheap. Generally, Postgres knows better how to optimize a query plan. Only introduce such an optimization barrier if you know better.
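On Postgres 12 or later, a CTE acts as an optimization barrier only when it is declared MATERIALIZED. A sketch of the same query with that keyword, assuming PostGIS is installed and the points table from the question exists:

```sql
WITH cte AS MATERIALIZED (   -- Postgres 12+: forces the CTE to be computed separately
   SELECT ST_SnapToGrid(geom, 50) AS geom1
   FROM   points
   )
SELECT count(*)    AS n
     , ST_X(geom1) AS x
     , ST_Y(geom1) AS y
FROM   cte
GROUP  BY geom1;
```

Older versions reject the MATERIALIZED keyword, so use it only where the server version is known to be 12 or later.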
Simplify
I changed the name of the computed point in the subquery / CTE to geom1 to clarify it's different from the original geom. That helps to clarify the more important thing here:
GROUP BY geom1
instead of:
GROUP BY x, y
That's obviously cheaper - and may have an influence on whether the function call is repeated. So, this is probably fastest:
SELECT COUNT(*) AS n
, ST_X(ST_SnapToGrid(geom, 50)) AS x
, ST_Y(ST_SnapToGrid(geom, 50)) AS y
FROM points
GROUP BY ST_SnapToGrid(geom, 50); -- same here!
Or maybe this:
SELECT COUNT(*) AS n
, ST_X(geom1) AS x
, ST_Y(geom1) AS y
FROM (
SELECT ST_SnapToGrid(geom, 50) AS geom1
FROM points
) AS tmp
GROUP BY geom1;
Test all three with EXPLAIN ANALYZE or EXPLAIN (ANALYZE, TIMING OFF) and see for yourself. Testing >> guessing.
How to reuse a result column in an expression for another result column
Like so:
SELECT
turnover,
cost,
turnover - cost AS profit
FROM (
SELECT -- the derived table needs its own SELECT
(SELECT SUM(...) FROM ...) AS turnover,
(SELECT SUM(...) FROM ...) AS cost
) AS partial_sums
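A runnable version of this pattern, as a sketch: the inline VALUES lists named sales and expenses and their amount column are made up for illustration and stand in for the elided subqueries.

```sql
SELECT turnover
     , cost
     , turnover - cost AS profit   -- reuses both aliases from the derived table
FROM  (
   SELECT (SELECT sum(amount) FROM (VALUES (100), (250)) AS sales(amount))    AS turnover
        , (SELECT sum(amount) FROM (VALUES (80), (120))  AS expenses(amount)) AS cost
   ) AS partial_sums;
```

With these sample values the query returns a turnover of 350, a cost of 200 and a profit of 150.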
SQL : How to reuse count(*) computed value?
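You cannot refer to an output column alias elsewhere in the same SELECT list (this holds for SQL Server and PostgreSQL alike), so you would have to use one of these: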
SELECT date_part('year'::text, c.date) AS yyyy,
to_char(c.date, 'MM'::text) AS monthnumber,
to_char(c.date, 'TMMonth'::text) AS monthname,
l.id AS lineID,
n.id AS networkID,
l.name AS lineName,
count(c.*) AS count,
count(distinct(c.date)) AS number_of_journeys,
count(c.*) / count(distinct(c.date)) AS frequentation_moyenne
OR
SELECT yyyy, monthnumber, monthname, lineID, networkID, lineName, count, number_of_journeys, count / number_of_journeys AS frequentation_moyenne
from
(SELECT date_part('year'::text, c.date) AS yyyy,
to_char(c.date, 'MM'::text) AS monthnumber,
to_char(c.date, 'TMMonth'::text) AS monthname,
l.id AS lineID,
n.id AS networkID,
l.name AS lineName,
count(c.*) AS count,
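count(distinct(c.date)) AS number_of_journeys
-- FROM, JOIN and GROUP BY clauses of the original query go here
) AS sub -- a subquery in FROM needs an alias before Postgres 16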
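A self-contained sketch of the subquery variant follows. The inline data and the names line_id, d, total and journeys are assumptions for illustration only. Keep in mind that count() returns bigint, so dividing one count by another is integer division and truncates; the cast to numeric avoids that.

```sql
SELECT line_id
     , total
     , journeys
     , total::numeric / journeys AS avg_per_journey   -- cast avoids integer division
FROM  (
   SELECT line_id
        , count(*)          AS total
        , count(DISTINCT d) AS journeys
   FROM  (
      VALUES (1, date '2024-01-01')
           , (1, date '2024-01-01')
           , (1, date '2024-01-02')
           , (2, date '2024-01-01')
      ) AS c(line_id, d)
   GROUP  BY line_id
   ) AS sub
ORDER  BY line_id;
```

For line 1 this gives 3 rows over 2 distinct dates, so an average of 1.5, where plain total / journeys would yield 1.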
Is it possible to reuse scalar result from a single subquery in insert query in Postgres?
You can use INSERT ... SELECT, basically moving the VALUES() into the FROM clause:
INSERT INTO my_table (col1, col2, computed_col)
SELECT v.col1, v.col2, x.some_col || v.computed
FROM (SELECT some_col FROM some_table WHERE id = :id
) x CROSS JOIN
(VALUES (:col1Val1, :col2val1, ARRAY[:computed_col1]::bigint[]),
(:col1Val2, :col2val2, ARRAY[:computed_col2]::bigint[])
) v(col1, col2, computed);
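To try this without bind parameters, here is a sketch with literal values. The temporary tables and their column types (text and bigint[]) are assumptions inferred from the casts in the query above.

```sql
-- Hypothetical tables; the column types are assumptions
CREATE TEMP TABLE some_table (id int PRIMARY KEY, some_col bigint[]);
CREATE TEMP TABLE my_table   (col1 text, col2 text, computed_col bigint[]);

INSERT INTO some_table VALUES (1, '{10,20}');

INSERT INTO my_table (col1, col2, computed_col)
SELECT v.col1, v.col2, x.some_col || v.computed   -- the scalar result is reused for every row
FROM  (SELECT some_col FROM some_table WHERE id = 1) x
CROSS  JOIN (
   VALUES ('a', 'b', ARRAY[30]::bigint[])
        , ('c', 'd', ARRAY[40]::bigint[])
   ) v(col1, col2, computed);

SELECT * FROM my_table;
```

The final SELECT should show two rows whose computed_col values are {10,20,30} and {10,20,40}. If the lookup in some_table finds no row, the cross join produces nothing and no rows are inserted.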
How to re-use result for SELECT, WHERE and ORDER BY clauses?
In the GROUP BY and ORDER BY clauses you can refer to column aliases (output columns) or even ordinal numbers of SELECT list items. I quote the manual on ORDER BY:
Each expression can be the name or ordinal number of an output column
(SELECT list item), or it can be an arbitrary expression formed from
input-column values.
Bold emphasis mine.
But in the WHERE and HAVING clauses, you can only refer to columns from the base tables (input columns), so you have to spell out your function call.
SELECT *, earth_distance(ll_to_earth(62.0, 25.0), ll_to_earth(lat, lon)) AS distance
FROM venues
WHERE earth_distance(ll_to_earth(62.0, 25.0), ll_to_earth(lat, lon)) <= radius
ORDER BY distance;
If you want to know if it's faster to pack the calculation into a CTE or subquery, just test it with EXPLAIN ANALYZE. (I doubt it.)
SELECT *
FROM (
SELECT *
,earth_distance(ll_to_earth(62.0, 25.0), ll_to_earth(lat, lon)) AS distance
FROM venues
) x
WHERE distance <= radius
ORDER BY distance;
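Another way to write the expression only once is a LATERAL subquery (Postgres 9.3+), whose alias can be used in WHERE as well as in ORDER BY. A sketch against the same venues table; the earth_distance() and ll_to_earth() functions come from the earthdistance extension, which is assumed to be installed:

```sql
SELECT v.*, d.distance
FROM   venues v
CROSS  JOIN LATERAL (
   SELECT earth_distance(ll_to_earth(62.0, 25.0)
                       , ll_to_earth(v.lat, v.lon)) AS distance
   ) d
WHERE  d.distance <= v.radius   -- alias is visible here, unlike a SELECT list alias
ORDER  BY d.distance;
```

Whether this is any faster than repeating the call is a separate question; the planner may well produce the same plan, so test it with EXPLAIN ANALYZE.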
Like @Mike commented, by declaring a function STABLE (or IMMUTABLE) you inform the query planner that results from a function call can be reused multiple times for identical calls within a single statement. I quote the manual here:
A STABLE function cannot modify the database and is guaranteed to
return the same results given the same arguments for all rows within a
single statement. This category allows the optimizer to optimize
multiple calls of the function to a single call.
Bold emphasis mine.
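As a sketch of what such a declaration looks like: the wrapper function distance_from_center() below is hypothetical and only wraps the call from the query above; it assumes the earthdistance extension is installed.

```sql
-- Hypothetical wrapper around the distance calculation
CREATE FUNCTION distance_from_center(lat float8, lon float8)
  RETURNS float8
  LANGUAGE sql
  STABLE   -- promises the same result for the same arguments within one statement
AS $$
   SELECT earth_distance(ll_to_earth(62.0, 25.0), ll_to_earth($1, $2))
$$;

SELECT *, distance_from_center(lat, lon) AS distance
FROM   venues
WHERE  distance_from_center(lat, lon) <= radius
ORDER  BY distance;
```

The volatility label permits the planner to reuse a result but does not force it to, so it is worth confirming the effect with EXPLAIN ANALYZE. Declaring a function STABLE or IMMUTABLE when it is not can lead to wrong results.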