Differencebetween Prepared Statements and SQL or Pl/Pgsql Functions, in Terms of Their Purpose

What is the difference between prepared statements and SQL or PL/pgSQL functions, in terms of their purpose?

All three "work the same" in that they execute the simple SQL statement:

INSERT INTO foo VALUES (3, 'ben');

The prepared statement is only good for a single prepared SQL statement (as the name suggests). And only DML commands. The manual:

Any SELECT, INSERT, UPDATE, DELETE, or VALUES statement.

The function can contain any number of statements. DML and DDL. Only SQL for SQL functions. Plus some non-SQL procedural elements in PL/pgSQL.

The prepared statement is only visible inside the same session and gone at the end of the session, while the function persists and is visible to all - still only usable for those with the EXECUTE privilege.

The prepared statement is encumbered with the least overhead. (Not much difference.)

The SQL function is the only one of the three that cannot save the query plan (by itself). Read details about plan caching in PL/pgSQL functions in the manual here.

The SQL function is also the only one that could be inlined when used within a bigger query. (Not with an INSERT, though.)

A rather comprehensive list of differences between SQL and PL/pgSQL functions:

  • Difference between language sql and language plpgsql in PostgreSQL functions

Starting with Postgres 11 there are also SQL procedures:

  • When to use stored procedure / user-defined function?

Difference between language sql and language plpgsql in PostgreSQL functions

SQL functions

... are the better choice:

  • For simple scalar queries. Not much to plan, better save any overhead.

  • For single (or very few) calls per session. Nothing to gain from plan caching via prepared statements that PL/pgSQL has to offer. See below.

  • If they are typically called in the context of bigger queries and are simple enough to be inlined.

  • For lack of experience with any procedural language like PL/pgSQL. Many know SQL well and that's about all you need for SQL functions. Few can say the same about PL/pgSQL. (Though it's rather simple.)

  • A bit shorter code. No block overhead.

PL/pgSQL functions

... are the better choice:

  • When you need any procedural elements or variables that are not available in SQL functions, obviously.

  • For any kind of dynamic SQL, where you build and EXECUTE statements dynamically. Special care is needed to avoid SQL injection. More details:

    • Postgres functions vs prepared queries
  • When you have computations that can be reused in several places and a CTE can't be stretched for the purpose. In an SQL function you don't have variables and would be forced to compute repeatedly or write to a table. This related answer on dba.SE has side-by-side code examples for solving the same problem using an SQL function / a plpgsql function / a query with CTEs:

    • How to pass a parameter into a function

Assignments are somewhat more expensive than in other procedural languages. Adapt a programming style that doesn't use more assignments than necessary.

  • When a function cannot be inlined and is called repeatedly. Unlike with SQL functions, query plans can be cached for all SQL statements inside a PL/pgSQL functions; they are treated like prepared statements, the plan is cached for repeated calls within the same session (if Postgres expects the cached (generic) plan to perform better than re-planning every time. That's the reason why PL/pgSQL functions are typically faster after the first couple of calls in such cases.

    Here is a thread on pgsql-performance discussing some of these items:

    • Re: pl/pgsql functions outperforming sql ones?
  • When you need to trap errors.

  • For trigger functions.

  • When including DDL statements changing objects or altering system catalogs in any way relevant to subsequent commands - because all statements in SQL functions are parsed at once while PL/pgSQL functions plan and execute each statement sequentially (like a prepared statement). See:

    • Why can PL/pgSQL functions have side effect, while SQL functions can't?

Also consider:

  • PostgreSQL Stored Procedure Performance

To actually return from a PL/pgSQL function, you could write:

CREATE FUNCTION f2(istr varchar)
RETURNS text AS
$func$
BEGIN
RETURN 'hello! '; -- defaults to type text anyway
END
$func$ LANGUAGE plpgsql;

There are other ways:

  • Can I make a plpgsql function return an integer without using a variable?
  • The manual on "Returning From a Function"

PL/pgSQL perform vs execute

PERFORM is plpgsql command used for calls of void functions. PLpgSQL is careful about useless SELECT statements - the SELECT without INTO clause is not allowed. But sometimes you need to call a function and you don't need to store result (or functions has no result). The function in SQL is called with SELECT statement. But it is not possible in PLpgSQL - so the command PERFORM was introduced.

CREATE OR REPLACE FUNCTION foo()
RETURNS void AS $$
BEGIN
RAISE NOTICE 'Hello from void function';
END;
$$ LANGUAGE plpgsql;

-- direct call from SQL
SELECT foo();

-- in PLpgSQL
DO $$
BEGIN
SELECT foo(); -- is not allowed
PERFORM foo(); -- is ok
END;
$$;

The PERFORM statements execute a parameter and forgot result.

Your example perform 'create table foo as (select 1)';

is same like SELECT 'create table foo as (select 1)'. It returns a string "create table foo as (select 1)" and this string is discarded.

The EXECUTE statement evaluate a expression to get string. In next step this string is executed.

So EXECUTE 'create table ' || some_var || '(a int)'; has two steps

  1. evaluate expression 'create table ' || some_var || '(a int)'
  2. if some_var is mytab for example, then execute a command create table mytab(a int)

The PERFORM statement is used for function calls, when functions are not used in assignment statement. The EXECUTE is used for evaluation of dynamic SQL - when a form of SQL command is known in runtime.

Why does PostgreSQL treat my query differently in a function?

Update in PostgreSQL 9.2

There was a major improvement, I quote the release notes here:

Allow the planner to generate custom plans for specific parameter
values even when using prepared statements (Tom Lane)

In the past, a prepared statement always had a single "generic" plan
that was used for all parameter values, which was frequently much
inferior to the plans used for non-prepared statements containing
explicit constant values. Now, the planner attempts to generate custom
plans for specific parameter values. A generic plan will only be used
after custom plans have repeatedly proven to provide no benefit. This
change should eliminate the performance penalties formerly seen from
use of prepared statements (including non-dynamic statements in
PL/pgSQL).



Original answer for PostgreSQL 9.1 or older

A plpgsql functions has a similar effect as the PREPARE statement: queries are parsed and the query plan is cached.

The advantage is that some overhead is saved for every call.

The disadvantage is that the query plan is not optimized for the particular parameter values it is called with.

For queries on tables with even data distribution, this will generally be no problem and PL/pgSQL functions will perform somewhat faster than raw SQL queries or SQL functions. But if your query can use certain indexes depending on the actual values in the WHERE clause or, more generally, chose a better query plan for the particular values, you may end up with a sub-optimal query plan. Try an SQL function or use dynamic SQL with EXECUTE to force a the query to be re-planned for every call. Could look like this:

CREATE OR REPLACE FUNCTION pie(id_param integer)
RETURNS SETOF record AS
$BODY$
BEGIN
RETURN QUERY EXECUTE
'SELECT *
FROM table_name
where id = $1'
USING id_param;
END
$BODY$
LANGUAGE plpgsql STABLE;

Edit after comment:

If this variant does not change the execution time, there must be other factors at play that you may have missed or did not mention. Different database? Different parameter values? You would have to post more details.

I add a quote from the manual to back up my above statements:

An EXECUTE with a simple constant command string and some USING
parameters, as in the first example above, is functionally equivalent
to just writing the command directly in PL/pgSQL and allowing
replacement of PL/pgSQL variables to happen automatically. The
important difference is that EXECUTE will re-plan the command on each
execution, generating a plan that is specific to the current parameter
values; whereas PL/pgSQL normally creates a generic plan and caches it
for re-use. In situations where the best plan depends strongly on the
parameter values, EXECUTE can be significantly faster; while when the
plan is not sensitive to parameter values, re-planning will be a
waste.

Why can PL/pgSQL functions have side effect, while SQL functions can't?

You bolded the key sentence in the manual yourself:

The entire body of a SQL function is parsed before any of it is executed.

Also read about The Parser Stage in the manual.

It consists of two major parts: the parser and the transformation process. Quoting the manual:

the transformation process takes the tree handed back by the parser as
input and does the semantic interpretation needed to understand which
tables, functions, and operators are referenced by the query.

If an SQL function contains these commands:

CREATE TABLE foo (...);
INSERT INTO foo VALUES(...);

Both statements are planned at virtually the same time (based on the same snapshot of the system catalogs). Hence, the INSERT cannot see the table "foo" presumably created with the previous CREATE command. That creates one of the following problems:

  1. If there is no other table named "foo" in your search_patch (yet), Postgres complains when trying to create the function:

    ERROR:  relation "foo" does not exist
  2. If another table named "foo" already exists in your search_patch (and you don't use conflicting column names), Postgres will plan the INSERT based on that pre-existing table. Typically that results in an error at execution time, if any values cause conflicts in the (wrong!) table. Or, with some bad luck, it might even write to that table without error message! Very sneaky bug.

That cannot happen with a PL/pgSQL function, because it treats SQL commands like prepared statements, planned and executed sequentially. So each statement can see objects created in previous statements.

Consequently, statements that are never visited are never even planned - unlike with SQL functions. And the execution plan for statements can be cached within the same session - also unlike SQL functions. Read details about plan caching in PL/pgSQL functions in the manual here.

Each approach has advantages for some use cases. Further reading:

  • Difference between language sql and language plpgsql in PostgreSQL functions

Using prepared statement in stored function

Looks to me like the PL/PgSQL EXECUTE for dynamic SQL trumps the regular SQL EXECUTE for prepared statements.

Code:

create or replace function prep_test() returns void as $$
begin
PREPARE do_something AS SELECT 1;
EXECUTE do_something;
end;
$$ LANGUAGE 'plpgsql';

Test:

regress=# select prep_test(1);
ERROR: column "do_something" does not exist
LINE 1: SELECT do_something
^
QUERY: SELECT do_something
CONTEXT: PL/pgSQL function "prep_test" line 4 at EXECUTE statement

outside PL/PgSQL it works fine:

regress=# EXECUTE do_something;
?column?
----------
1
(1 row)

I'm not sure how you'd execute a prepared statement within PL/PgSQL.

Out of interest, why are you trying to use prepared statements within PL/PgSQL? Plans are prepared and cached for PL/PgSQL anyway, it happens automatically.

Syntax error in dynamic SQL in pl/pgsql function

When creating a PL/pgSQL function, the function body is saved as string literal as is. Only superficial syntax checks are applied. Contained statements are not actually executed or tested on a deeper level.

However, basic syntax errors like you have in your query string would still be detected in actual SQL statements. But you are using dynamic SQL with EXECUTE. The statement is contained in a nested string literal and is your responsibility alone.

This seems to be misguided to begin with. There is no apparent reason for dynamic SQL. (Unless you have very uneven data distribution and want to force Postgres to generate a custom plan for each input value.)

If you had used a plain SQL statement, you would have gotten the error message at creation time:

CREATE OR REPLACE FUNCTION search_person(name text)  -- still incorrect!
RETURNS TABLE(address_id integer, address_geom text, event_name text) AS
$func$
BEGIN
RETURN QUERY
SELECT address.id, event.name, address.geom
FROM event JOIN person JOIN address JOIN person_address JOIN event_person
WHERE
person_address.event_id = event.id AND
event_person.event_id = event.id AND
person.id = event_person.person_id AND
person.name like $1; -- still $1, but refers to func param now!
END
$func$ LANGUAGE plpgsql;

The SQL statement is still invalid. [INNER] JOIN requires a join condition - like Nick commented. And I don't see the need for PL/pgSQL at all. A simple SQL function should serve well:

CREATE FUNCTION search_person(name text)
RETURNS TABLE(address_id integer, address_geom text, event_name text) AS
$func$
SELECT a.id, a.geom, e.name -- also fixed column order to match return type
FROM person AS p
JOIN event_person AS ep ON ep.person_id = p.id
JOIN event AS e ON e.id = ep.event_id
JOIN person_address AS pa ON pa.event_id = e.id
JOIN address AS a ON a.id = pa.address_id -- missing join condition !!
WHERE p.name LIKE $1;
$func$ LANGUAGE sql;

I rewrote the query to fix syntax error, using table aliases for better readability. Finally, I also added one more missing condition based on an educated guess: a.id = pa.address_id.

Now it should work.

Related:

  • plpgsql function not inserting data as intended
  • Difference between language sql and language plpgsql in PostgreSQL functions

Or no function at all, just use a prepared statement instead. Example:

  • Split given string and prepare case statement

If you need dynamic SQL after all, pass values with the USING clause like you had it and make sure to defend against SQL injection when concatenating queries. Postgres provides various tools:

  • SQL injection in Postgres functions vs prepared queries
  • Define table and column names as arguments in a plpgsql function?
  • Table name as a PostgreSQL function parameter

Which is better in terms of speed php or plpgsql?

Depends on what do you do. PL/pgSQL is optimized for data manipulation - PHP is optimized for html pages production. Some background technology is similar - and speed of basic structures is similar - PHP is significantly faster in string manipulations, but PLpgSQL runs in same address space as PostgreSQL database engine, and use same data types as PostgreSQL database engine, so there is zero overhead from data type conversions and interprocess communications.

Stored procedures has strong opponent and strong defenders - it is any other technology, and if you can use it well, it can serve perfect for small, for large projects. It is good for decomposition - it naturally divide application to presentation (interactive) layer and to data manipulation layer. It is important for data centric applications and less important for presentation centric applications. And opponents agree so, sometimes a stored procedures are necessary from performance reasons.

I disagree with kafsoksilo - debugging, unit testing, maintaining is not any issue - when you have knowledges about this technology - you can use almost all tools, that you know. And plpgsql language is pretty powerful (for data manipulation area) language - well documented with good diagnostic, clean and readable error messages and minimum issues.

DIfference Between Stored Procedures and Prepared Statements?

Stored procedures are a sequence of instructions in PL/SQL language. Is a programming language implemented by some DBMS, that lets you store sequences of queries frequently applied to your model, and share the processing load with the application layer.

Prepared statements are queries written with placeholders instead of actual values. You write the query and it is compiled just once by the DBMS, and then you just pass values to place into the placeholders. The advantage of using prepared statements is that you enhance the performance considerably, and protect your applications from SQL Injection.

The difference is you cant store prepared statements. You must "prepare" them every time you need to execute one. Stored procedures, on the other hand, can be stored, associated to a schema, but you need to know PL/SQL to write them.

You must check if your DBMS supports them.

Both are very usefull tools, you might want to combine.

Hope this short explanation to be useful to you!



Related Topics



Leave a reply



Submit