Does Postgres Support Nested or Autonomous Transactions

Does Postgres support nested or autonomous transactions?

There are no autonomous transactions in Postgres before Postgres 11, where SQL procedures were added. Everything that's done in a function is rolled back with the transaction.

Here is a discussion of the feature:

http://wiki.postgresql.org/wiki/Autonomous_subtransactions

In Postgres 10 or older a workaround could be to (ab-)use dblink:

dblink can't update a table on the same database in an after UPDATE trigger
How do I do large non-blocking updates in PostgreSQL?

There is also the related concept of a SAVEPOINT. (Not the same thing!):

plpython

plpython has subtransactions (with plpy.subtransaction():), but that's not the same as autonomous transactions. There is no separate COMMIT. All it does, is bundle a couple of statements together to make them atomic. Without that, if an exception occurs somewhere in the middle, and you catch that exception, only the code up to this exception would be executed. If you wrap it into a subtransaction, it's all or nothing. This is like using a SAVEPOINT, not an autonomous transaction. Per documentation:

The subtransaction context manager does not trap errors, it only
assures that all database operations executed inside its scope will be
atomically committed or rolled back.

Does PostgreSQL count nested BEGIN and END statements, even though it doesn't support autonomous transactions?

As the documentation says, a second BEGIN does not affect the transaction state. That means it cannot cause the following COMMIT to be ignored. So the first COMMIT will indeed commit the transaction, and everything after that will proceed as you've described.

The simplest solution is to shift the responsibility for transaction control to the caller. Put your BEGIN / COMMIT outside the call to doFooPlus(), and you no longer need to worry about which subroutine is in charge of initiating the transaction. Even simpler, implement foo() and fooPlus() as server-side functions - which are inherently atomic - and forget about client-side transaction control altogether.

That said, the kind of ad-hoc transaction management pattern in your example can be useful. Ultimately, implementing this would just involve passing some additional information alongside the PGConn pointer, and checking it around every BEGIN / COMMIT / ROLLBACK.

Of course, this whole pattern is a little unwieldy; if you need to do it more than a couple of times, you'd want something to encapsulate the connection, the transaction control functions and the state (a simple depth counter would probably do it). But before setting out to write your own API, I'd take a look at what's already out there.

What is supported as transactional in postgres

Best I'm aware all of these commands are transaction aware, except for TRUNCATE ... RESTART IDENTITY (and even that one is transactional since 9.1.)

See the manual on concurrency control and transaction-related commands.

Committing transactions while executing a postgreql Function

This can be done using dblink. I showed an example with one insert being committed you will need to add your while loop logic and commit every loop. You can http://www.postgresql.org/docs/9.3/static/contrib-dblink-connect.html

CREATE OR REPLACE FUNCTION log_the_dancing(ip_dance_entry text)
RETURNS INT AS
$BODY$
    DECLARE
    BEGIN
        PERFORM dblink_connect('dblink_trans','dbname=sandbox port=5433 user=postgres');
        PERFORM dblink('dblink_trans','INSERT INTO dance_log(dance_entry) SELECT ' || '''' || ip_dance_entry || '''');
        PERFORM dblink('dblink_trans','COMMIT;');
        PERFORM dblink_disconnect('dblink_trans'); 

        RETURN 0;
    END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;

ALTER FUNCTION log_the_dancing(ip_dance_entry text)
  OWNER TO postgres;

BEGIN TRANSACTION;
  select log_the_dancing('The Flamingo');
  select log_the_dancing('Break Dance');
  select log_the_dancing('Cha Cha');
ROLLBACK TRANSACTION;

--Show records committed even though we rolled back outer transaction
select *
from dance_log;

Does Postgresql implicitly wrap select statements in transaction?

Does this mean that even select statement will be implicitly wrapped in transaction and there is no way to work with postgres out of transaction scope?

Yes.

Transactions aren't just about writes, but also isolation level and what SELECTs can read. For example, at the usual isolation level, a SELECT can't read uncommitted writes. It also can't read the updates done from transactions that commit while the SELECT is running. Otherwise it wouldn't have a coherent view of the database and basically everything would break. For example it would look up in an index, then go get the corresponding row, and ithe row could have changed and no longer correspond to the index.

That requires a way to take a "snapshot" of the database right before the SELECT begins to execute, and that is one important function of the transaction mechanism.

Nested transaction rollback between two savepoints?

Savepoints will not do what you want. When you roll back to a savepoint, everything after that savepoint is rolled back, irrespective of whether later savepoints were created.

Think of savepoints like a "stack". You can't pull something out of the middle of the stack, you have to remove everything down to the layer you want.

You are probably looking for autonomous transactions. None of the databases you want to use support them. In PostgreSQL you can work around this using the dblink module to make a new connection to the database and do work with it; see http://www.postgresql.org/docs/current/static/dblink.html . I don't know what solutions MySQL or SQLite offer, but Google will help now that you know the term you are looking for.

I recommend that you find a way to work around this application design requirement if possible. Have your application use two database connections and two transactions to do what you need, taking care of co-ordinating the two as required.

How to force COMMIT inside function so other sessions can see updated row?

Keep reading. I preserved the best for last.

Proof of concept with a `PROCEDURE`

A Postgres FUNCTION is always atomic (runs inside a single transaction wrapper) and cannot handle transactions. So COMMIT is disallowed. You could use tricks with dblink to work around this. See:

Does Postgres support nested or autonomous transactions?
How do I do large non-blocking updates in PostgreSQL?

But for nested transactions like this consider a PROCEDURE instead. Introduced with Postgres 11. There you can manage transactions:

CREATE OR REPLACE PROCEDURE aaa.proc_work(_id text, INOUT _result text = NULL)
  LANGUAGE plpgsql AS
$proc$
BEGIN
   -- optionally assert steering row exists
   PERFORM FROM aaa.monitor WHERE id = _id FOR KEY SHARE SKIP LOCKED;

   IF NOT FOUND THEN   
      RAISE EXCEPTION 'monitor.id = % not found or blocked!', quote_literal(_id);
   END IF;

   -- try UPDATE
   UPDATE aaa.monitor
   SET    status = 'running'
   WHERE  id = _id                   -- assuming valid _id
   AND    status <> 'running';       -- assuming "status" is NOT NULL

   IF NOT FOUND THEN
      _result := 'running'; RETURN;  -- this is how you return with INOUT params
   END IF;

   COMMIT;                           -- HERE !!!

   BEGIN                             -- start new code block

      ----- code for big work HERE -----
      -- PERFORM 1/0;                -- debug: test exception?
      -- PERFORM pg_sleep(5);        -- debug: test concurrency?

      _result := '';

   -- also catching QUERY_CANCELED and ASSERT_FAILURE
   -- is a radical step to try andrelease 'running' rows no matter what
   EXCEPTION WHEN OTHERS OR QUERY_CANCELED OR ASSERT_FAILURE THEN
      -- ROLLBACK;                   -- roll back (unfinished?) big work
      _result := SQLERRM;
   END;                              -- end of nested block

   UPDATE aaa.monitor                -- final reset
   SET    status = 'idle'
   WHERE  id = _id
   AND    status <> 'idle';          -- only if needed
END
$proc$;

Call (important!):

CALL aaa.proc_work('invoicing');  -- stand-alone call!

Important notes

Add COMMIT after the UPDATE. After that, concurrent transactions can see the updated row.

But there is no additional BEGIN or START TRANSACTION. The manual:

In procedures invoked by the CALL command as well as in anonymous code
blocks (DO command), it is possible to end transactions using the
commands COMMIT and ROLLBACK. A new transaction is started
automatically after a transaction is ended using these commands, so
there is no separate START TRANSACTION command. (Note that BEGIN and
END have different meanings in PL/pgSQL.)

We need a separate PL/pgSQL code block, because you have a custom exception handler, and (quoting the manual):

A transaction cannot be ended inside a block with exception handlers.

(But we can COMMIT / ROLLBACK in the EXCEPTION handler.)

You cannot call this procedure inside an outer transaction, or together with any other DML statement, which would force an outer transaction wrapper. Has to be a stand-alone CALL. See:

Unable to Create COMMIT inside PostgreSQL 11.5 Procedure

Note the final UPDATE aaa.monitor SET status = 'idle' WHERE .... Else the (committed!) status would remain 'running' indefinitely after an exception.

About returning a value from a procedure:

How to return a value from a stored procedure (not function)?

I added DEFAULT NULL to the INOUT parameter, so you don't have to provide an argument with the call.

UPDATE directly. If the row is 'running', no update occurs. (This also fixes the logic: your IF expression seems backwards as it returns 'running' when no row with status='running' is found. Seems like you'd want the opposite.)

I added an (optional!) assert to make sure the row in table aaa.monitor exists. Adding a FOR KEY SHARE lock to also eliminate the tiny time window for a race conditions between the assert and the following UPDATE. The lock conflicts with deletion or updating the PK column - but not with updating the status. So the exception is never raised in normal operation! The manual:

Currently, the set of columns considered for the UPDATE case are
those that have a unique index on them that can be used in a foreign
key (so partial indexes and expressional indexes are not considered),
but this may change in the future.

SKIP LOCK to not wait in case of a conflicting lock. The added exception should never occur. Just demonstrating a water-tight proof of concept.

Your update revealed 25 rows in aaa.monitor, so I added the parameter _id.

Superior approach

The above might make sense to persist more information for the world to see. To just queue operations, there are much more efficient solutions. Work with a lock instead, which is "visible" to others instantaneously. Then you don't need a nested transaction to begin with, and a plain FUNCTION will do:

CREATE OR REPLACE FUNCTION aaa.fnc_work(_id text)
  RETURNS text
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- optionally assert that the steering row exists
   PERFORM FROM aaa.monitor WHERE id = _id FOR KEY SHARE SKIP LOCKED;
   IF NOT FOUND THEN   
      RAISE EXCEPTION 'monitor.id = % not found or blocked!', quote_literal(_id);
   END IF;

   -- lock row
   PERFORM FROM aaa.monitor WHERE id = _id FOR NO KEY UPDATE SKIP LOCKED;

   IF NOT FOUND THEN
      -- we made sure the row exists, so it must be locked
      RETURN 'running';
   END IF;

   ----- code for big work HERE -----
   -- PERFORM 1/0;                -- debug: test exception?
   -- PERFORM pg_sleep(5);        -- debug: test concurrency?

   RETURN '';

EXCEPTION WHEN OTHERS THEN
   RETURN SQLERRM;

END
$func$;

Call:

SELECT aaa.fnc_work('invoicing');

The call can be nested any way you want. As long as one transaction is working on the big job, no other will start.

Again, the optional assert takes out a FOR KEY SHARE lock to eliminate the time window for a race condition, and the added exception should never occur in normal operation.

We don't need the column status at all for this. The row-lock itself is the gatekeeper. Hence the empty SELECT list in PERFORM FROM aaa.monitor .... Collateral benefit: this also doesn't produce dead tuples by updating the row back and forth. If you still need to update status for some other reason, you are back to the visibility issue of the previous chapter. You can combine both ...

About PERFORM:

SELECT or PERFORM in a PL/pgSQL function

About the row lock:

Postgres UPDATE … LIMIT 1

Do autonomous transactions have their own session in Oracle?

A session can have multiple transactions, so this is all done within a single session.

SQL> create table t ( n number);

Table created.

SQL>
SQL> create or replace
  2  procedure p1 is
  3      pragma autonomous_transaction;
  4  begin
  5      insert into t values ( sys_context('userenv','sid'));
  6      commit;
  7  end;
  8  /

Procedure created.

SQL> create or replace
  2  procedure p2 is
  3      pragma autonomous_transaction;
  4  begin
  5      insert into t values ( sys_context('userenv','sid'));
  6      commit;
  7      p1;
  8  end;
  9  /

Procedure created.

SQL> create or replace
  2  procedure p3 is
  3      pragma autonomous_transaction;
  4  begin
  5      insert into t values ( sys_context('userenv','sid'));
  6      commit;
  7      p2;
  8  end;
  9  /

Procedure created.

SQL>
SQL> exec p3;

PL/SQL procedure successfully completed.

SQL> select * from t;

         N
----------
       982
       982
       982

Does Postgres Support Nested or Autonomous Transactions