What Are the Conditions for Encountering a Serialization Failure

What are the conditions for encountering a serialization failure in PostgreSQL?

For REPEATABLE READ this example will do:

Prepare stage:

psql-0> CREATE TABLE foo(key int primary key, val int);
CREATE TABLE
psql-0> INSERT INTO foo VALUES(1, 42);

Now keep an eye on the psql-X part indicating the interleaving of actions:

psql-1> BEGIN ISOLATION LEVEL REPEATABLE READ;
psql-1> UPDATE foo SET val=val+1;
UPDATE 1
psql-2> BEGIN ISOLATION LEVEL REPEATABLE READ;
psql-2> UPDATE foo SET val=val+1;
*** no output, transaction blocked ***

psql-1> COMMIT;

psql-2> *** unblocks ***
ERROR: could not serialize access due to concurrent update

An example for SERIALIZABLE can be found in the documentation for PostgreSQL 9.1 and should be easy to follow from here.
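For illustration, here is a sketch of the classic "write skew" scenario that fails only under SERIALIZABLE, using the foo table from above (the exact statement at which the error surfaces may vary; it is often the second COMMIT):

psql-1> BEGIN ISOLATION LEVEL SERIALIZABLE;
psql-1> SELECT sum(val) FROM foo;
psql-2> BEGIN ISOLATION LEVEL SERIALIZABLE;
psql-2> SELECT sum(val) FROM foo;
psql-1> INSERT INTO foo VALUES(2, 0);
psql-1> COMMIT;
psql-2> INSERT INTO foo VALUES(3, 0);
psql-2> COMMIT;
ERROR: could not serialize access due to read/write dependencies among transactions

Each transaction reads rows that the other writes, so no serial ordering of the two can produce this outcome, and PostgreSQL aborts one of them. REPEATABLE READ would let both commit.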

What specific exceptions represent a serialization failure when Django is using the serializable transaction isolation level with PostgreSQL?

Hmm, good question. The documentation implies that the appropriate exception would be a TransactionManagementError:

TransactionManagementError is raised for any and all problems related to database transactions.

However, the source code gives a strong clue that it isn't:

class TransactionManagementError(ProgrammingError):
    """Transaction management is used improperly."""
    pass

Note that this is a ProgrammingError, which is indeed used to indicate programmer error (i.e. "used improperly").

If we look at the documentation for psycopg (the Python adapter used for PostgreSQL support) we see that it will raise a psycopg2.extensions.TransactionRollbackError:

exception psycopg2.extensions.TransactionRollbackError (subclasses OperationalError)

Error causing transaction rollback (deadlocks, serialization failures, etc).

But what does Django do with that? Well, as documented here, it wraps the standard Python DB API 2.0 exceptions in Django equivalents, and sets the __cause__ attribute to the original exception. So the following is probably the most specific check you can make:

from django.db import OperationalError, transaction
from psycopg2.extensions import TransactionRollbackError

for retries in range(3):
    try:
        with transaction.atomic():
            MyModel.objects.update(foo='bar')
    except OperationalError as e:
        # Retry only if the underlying cause was a serialization
        # failure or deadlock; re-raise anything else.
        if isinstance(e.__cause__, TransactionRollbackError):
            continue
        raise
    else:
        break

Depending on the error details exposed by PostgreSQL (available via e.__cause__.diag) it may be possible to write an even more specific test.
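As a sketch of such a test: PostgreSQL reports serialization failures with SQLSTATE 40001 and deadlocks with 40P01 (per Appendix A of the PostgreSQL manual), and psycopg2 exposes the SQLSTATE as e.__cause__.pgcode (also via e.__cause__.diag.sqlstate). The helper name below is illustrative, not part of any API:

```python
# SQLSTATE codes PostgreSQL uses for transient concurrency failures
# (see Appendix A, "PostgreSQL Error Codes", in the manual).
SERIALIZATION_FAILURE = '40001'
DEADLOCK_DETECTED = '40P01'

def is_retryable_sqlstate(sqlstate):
    """True if the SQLSTATE indicates a failure worth retrying."""
    return sqlstate in (SERIALIZATION_FAILURE, DEADLOCK_DETECTED)
```

In the except block above, the check could then become `if is_retryable_sqlstate(e.__cause__.pgcode): continue`.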

Generally, though, the Python DB API 2.0 documentation states that OperationalError is indeed the correct exception type for transaction issues, so catching that hopefully would be a reasonably effective database-agnostic solution.
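To keep that retry logic out of business code, the loop can be factored into a small database-agnostic helper. This is a sketch; the helper name and signature are my own, not Django API:

```python
import time

def run_with_retries(func, retryable, attempts=3, delay=0.0):
    """Call func(); retry up to `attempts` times while retryable(exc) is true.

    `retryable` is a predicate deciding whether an exception indicates a
    transient failure (e.g. a serialization failure) worth retrying.
    The last failure is re-raised if all attempts are exhausted.
    """
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:
            if attempt + 1 == attempts or not retryable(exc):
                raise
            if delay:
                time.sleep(delay)
```

With Django it might be used as `run_with_retries(do_update, lambda exc: isinstance(exc, OperationalError) and isinstance(exc.__cause__, TransactionRollbackError))`, where `do_update` wraps the `transaction.atomic()` block.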

Generalized handling of SQL transaction failures due to isolation level serializable

Your code sample shows exactly how it should be done.

Serialization failures should happen rarely; if they don't, something in your database or application design is unsound.

I suppose it is possible to construct a scenario where a transaction can never succeed, but that is not a problem in practice.

Is it possible that both transactions rollback during a deadlock or serialization error?

Yes. A deadlock can involve more than two transactions, in which case more than one may be terminated. Normally, though, this is an extremely rare condition.

If just two transactions deadlock, one survives. The manual:

PostgreSQL automatically detects deadlock situations and resolves them by aborting one of the transactions involved, allowing the other(s) to complete.

Serialization failures only happen in REPEATABLE READ or SERIALIZABLE transaction isolation. I wouldn't know of any particular limit to how many serialization failures can happen concurrently. But I also never heard of any necessity to delay retrying.

I would retry as soon as possible either way.

No serialize access error is raised in PostgreSQL while it should be

The SELECT in step 4 does not rely on stale data, because the modification has not been committed yet. Until a transaction commits, the database acts as if nothing has happened yet.

This is guaranteed by the READ COMMITTED isolation level and all higher ones. (PostgreSQL does not allow dirty reads, so you will get this behavior even if you request the READ UNCOMMITTED isolation level.)

Moreover, SERIALIZABLE means that the result of all transactions is guaranteed to be the same as that of some serialized execution of the transactions. And that is clearly the case here: If we serialize by executing the transaction on connection #1 before the one on connection #2, we get the same result.

It should never be necessary, but if you want to reduce concurrency by forcing a serialized execution, add the following statement immediately after each START TRANSACTION:

LOCK TABLE concurrency_test IN ACCESS EXCLUSIVE MODE;
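With that lock in place, the second session simply blocks until the first commits, instead of proceeding concurrently. A sketch of the resulting interleaving:

psql-1> START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
psql-1> LOCK TABLE concurrency_test IN ACCESS EXCLUSIVE MODE;
psql-2> START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
psql-2> LOCK TABLE concurrency_test IN ACCESS EXCLUSIVE MODE;
*** no output, session blocked ***
psql-1> COMMIT;
psql-2> *** unblocks and proceeds ***

This trades away all concurrency on the table for a guaranteed serial order, which is why it should rarely be necessary.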

