PostgreSQL and Locking

Postgres row locking

You can do it using SELECT ... FOR UPDATE. FOR UPDATE causes the rows retrieved by the SELECT statement to be locked as though for update. This prevents them from being locked, modified, or deleted by other transactions until the current transaction ends.

ATTENTION:

SELECT FOR UPDATE will wait for a concurrent transaction that has run
any of those commands on the same row, and will then lock and return
the updated row (or no row, if the row was deleted)

Here are some sample queries and an explanation of how to do it.
Suppose we have a table like this:

Table name: key_table

key     is_used
------  -------
00001   true
00002   true
00003   false
00004   false
00005   false


-- inside a PL/pgSQL function: pick the next unused key and lock it
SELECT key
INTO v_key
FROM key_table
WHERE is_used = false
ORDER BY key DESC
LIMIT 1
FOR UPDATE;

UPDATE key_table
SET is_used = true
WHERE key = v_key;

After the SELECT, the returned row is locked: no other session can select it with FOR UPDATE until the locking transaction ends. Concurrent sessions selecting this row will block, waiting for that update. Once the UPDATE is committed, other sessions get the next row where is_used = false.
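
To make the blocking behavior concrete, here is a two-session sketch using the sample table above (key 00005 is the row the query picks):

-- Session 1
BEGIN;
SELECT key FROM key_table
WHERE is_used = false
ORDER BY key DESC
LIMIT 1
FOR UPDATE;   -- returns and locks key 00005

-- Session 2, meanwhile: the same SELECT ... FOR UPDATE blocks here,
-- waiting for session 1 to commit or roll back

-- Session 1
UPDATE key_table SET is_used = true WHERE key = '00005';
COMMIT;       -- session 2 wakes up and continues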

Does Postgres lock all rows in a query atomically, even across different tables via JOIN?

The query will lock the rows one after the other as they are selected. The exact order will depend on the execution plan. Perhaps you can add FOR UPDATE OF table_name to lock rows only in the table where you need them locked.
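
For example, in a join you can restrict the row locks to a single table (table and column names here are illustrative):

SELECT a.id
FROM tablea a
JOIN tableb b ON b.table_a_id = a.id
FOR UPDATE OF a;   -- locks only the matched rows of tablea, not tableb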

I have two more ideas:

  • rewrite the query so that it locks the rows in a certain order:

    WITH b AS MATERIALIZED (
        SELECT id, table_a_id
        FROM tableb
        WHERE id = 42
        FOR NO KEY UPDATE
    )
    SELECT tablea.id
    FROM tablea
    WHERE EXISTS (SELECT 1 FROM b
                  WHERE tablea.id = b.table_a_id)
    ORDER BY tablea.id
    FOR NO KEY UPDATE;

    Performance may not be as good, but if everybody selects like that, you won't get a deadlock.

  • lock the tables:

    LOCK TABLE tablea, tableb IN EXCLUSIVE MODE;

    That lock will prevent concurrent row locks and data modifications, so you will be safe from a deadlock.

    Only do that as a last-ditch effort, and don't do it too often. If you frequently take high table locks like that, you keep autovacuum from running and endanger the health of your database.

PostgreSQL - Does row locking depend on update syntax in transaction?

Both transactions lock the same two rows, and both can run into a deadlock. The difference is that the first transaction always locks the two rows in a certain order.

If your objective is to avoid deadlocks the first transaction is better: if you make a rule that any transaction must update the user with the lower id first, then no such two transactions can deadlock with each other (but they can still deadlock with other transactions that do not obey that rule).
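
A minimal sketch of that rule, assuming a users table with integer ids (the actual transactions from the question are not shown here):

BEGIN;
-- rule: always touch the row with the lower id first
UPDATE users SET balance = balance - 100 WHERE id = 4;
UPDATE users SET balance = balance + 100 WHERE id = 7;
COMMIT;

Any two transactions following this order queue up on the lower id instead of deadlocking.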

PostgreSQL correct lock on creation of new entity

@Lock(LockModeType.PESSIMISTIC_WRITE)
TaskApplication findByUserAndTask(User user, Task task);

will obtain a pessimistic lock (a row-level lock) only on the entities returned by the query. When the result set is empty, no lock is obtained, and findByUserAndTask doesn't block other transactions.
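
On PostgreSQL, Hibernate typically implements PESSIMISTIC_WRITE by appending FOR UPDATE to the generated query; roughly like this (table and column names are assumptions):

SELECT ta.*
FROM task_application ta
WHERE ta.user_id = ? AND ta.task_id = ?
FOR UPDATE;
-- if no row matches, nothing is locked and the call returns immediately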

There are a few ways you could handle concurrent inserts:

  1. Use a unique index to prevent duplicates, and handle the resulting exception as appropriate for your application
  2. If your database supports it, obtain a table-level lock on the table you want to insert into. JPA doesn't support this.
  3. Emulate a table-level lock using a row-level lock on a table dedicated to storing locks. This table should have one row for each table on which you want to obtain a pessimistic lock on insert:
    public enum EntityType {
        TASK_APPLICATION
    }

    @Getter
    @Entity
    public class TableLock {

        @Id
        private Long id;

        @Enumerated(EnumType.STRING)
        private EntityType entityType;
    }

    public interface EntityTypeRepository extends Repository<TableLock, Long> {

        @Lock(LockModeType.PESSIMISTIC_WRITE)
        TableLock findByEntityType(EntityType entityType);
    }

With this setup in place, you just need to obtain the lock:

@Transactional
public TaskApplication createIfNotExists(User user, Task task) {
    TaskApplication taskApplication = taskApplicationRepository.findByUserAndTask(user, task);

    if (taskApplication == null) {
        // acquire the emulated table-level lock
        entityTypeRepository.findByEntityType(EntityType.TASK_APPLICATION);
        // re-check: another transaction may have inserted the row in the meantime
        taskApplication = taskApplicationRepository.findByUserAndTask(user, task);

        if (taskApplication == null) {
            taskApplication = create(user, task);
        }
    }

    return taskApplication;
}

In most cases the first approach (a unique index) is the best and most efficient. Obtaining a table lock (native or emulated) is heavy and should be used with care.
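
In plain SQL, the unique-index approach could look like this (table and column names are assumptions; in JPA you would catch the resulting constraint-violation exception instead):

CREATE UNIQUE INDEX task_application_user_task_uq
ON task_application (user_id, task_id);

-- under concurrency, exactly one INSERT of a given pair succeeds;
-- ON CONFLICT turns the duplicate into a no-op instead of an error
INSERT INTO task_application (user_id, task_id)
VALUES (1, 2)
ON CONFLICT (user_id, task_id) DO NOTHING;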

I'm not sure about PostgreSQL, but typically it should block execution based on the unique index when there is no record yet.

The presence of indexes does not affect whether SELECT or INSERT statements block. That behaviour is controlled by pessimistic locks.

PostgreSQL select for update lock, new rows

The issue is that each command only sees rows that have been committed before the query started. There are various possible solutions ...

Stricter isolation level

You can solve this with a stricter isolation level, but that's relatively expensive.

Laurenz already provided a solution for this.
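
A minimal sketch of the stricter-isolation approach (the retry logic is an assumption here; it has to live in the application):

BEGIN ISOLATION LEVEL SERIALIZABLE;

SELECT * FROM table_a WHERE is_latest;  -- consistent snapshot, no FOR UPDATE needed

-- app logic, then UPDATE / INSERT as usual

COMMIT;  -- can fail with SQLSTATE 40001 (serialization_failure); retry the whole transaction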

Just start a new command

Keep the (cheap) default isolation level READ COMMITTED, and just start a new command.

Only a few rows to lock

When locking only a handful of rows, the dead simple solution is to repeat the same SELECT ... FOR UPDATE. The second iteration sees newly committed rows and locks them additionally.

There is a theoretical race condition with additional transactions that might lock new rows before the waiting transaction does. That would result in a deadlock. Highly unlikely, but to be absolutely sure, lock rows in a consistent order:

BEGIN;  -- default READ COMMITTED

SELECT FROM table_a WHERE is_latest ORDER BY id FOR UPDATE; -- consistent order
SELECT * FROM table_a WHERE is_latest ORDER BY id FOR UPDATE; -- just repeat !!

-- DO SOME APP LOGIC TO TEST VALIDITY

-- pseudo-code
IF all_good
    UPDATE table_a SET is_latest = true WHERE ...;
    INSERT INTO table_a (is_latest, ...) VALUES (true, ...);
    COMMIT;
ELSE
    ROLLBACK;
END;

A partial index on (id) WHERE is_latest would be ideal.

More rows to lock

For more than a handful of rows, I would instead create a dedicated one-row token table. A bullet-proof implementation could look like this, run as admin or superuser:

CREATE TABLE public.single_task_x (just_me bool CHECK (just_me) PRIMARY KEY DEFAULT true);
INSERT INTO public.single_task_x VALUES (true);
REVOKE ALL ON public.single_task_x FROM public;
GRANT SELECT, UPDATE ON public.single_task_x TO public; -- or just to those who need it

See:

  • How to allow only one row for a table?

Then:

BEGIN;  -- default READ COMMITTED

SELECT FROM public.single_task_x FOR UPDATE;
SELECT * FROM table_a WHERE is_latest; -- FOR UPDATE? ①

-- DO SOME APP LOGIC TO TEST VALIDITY

-- pseudo-code
IF all_good
    UPDATE table_a SET is_latest = true WHERE ...;
    INSERT INTO table_a (is_latest, ...) VALUES (true, ...);
    COMMIT;
ELSE
    ROLLBACK;
END;

A single lock is cheaper.

① You may or may not want to additionally lock the rows, to defend against other writes, possibly with a weaker lock ...

Either way, all locks are released at the end of the transaction automatically.

Advisory lock

Or use an advisory lock. pg_advisory_xact_lock() persists for the duration of the transaction:

BEGIN;  -- default READ COMMITTED

SELECT pg_advisory_xact_lock(123);
SELECT * FROM table_a WHERE is_latest;

-- do stuff

COMMIT; -- or ROLLBACK;

Make sure to use a unique token for your particular task (123 in my example). Consider a look-up table if you have many different tasks.
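
A common idiom (an assumption here, not from the original answer) is to derive the key by hashing a task name with hashtext(), an undocumented but long-standing built-in:

SELECT pg_advisory_xact_lock(hashtext('refresh_latest_rows'));  -- same name, same lock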

To release the lock at a different point in time (not when the transaction ends), consider a session-level lock with pg_advisory_lock(). Then you can (and must) unlock manually with pg_advisory_unlock() - or close the session.

Both of these wait for the locked resource. There are alternative functions, pg_try_advisory_xact_lock() and pg_try_advisory_lock(), that return false instead of waiting ...
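
For example, to skip the task entirely when another session is already running it:

SELECT pg_try_advisory_xact_lock(123);  -- true if acquired; false immediately if someone else holds it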


