Mysql: Transactions VS Locking Tables

MySQL: Transactions vs Locking Tables

Locking tables prevents other DB users from affecting the rows/tables you've locked. But locks, in and of themselves, will NOT ensure that your logic comes out in a consistent state.

Think of a banking system. When you pay a bill online, there's at least two accounts affected by the transaction: Your account, from which the money is taken. And the receiver's account, into which the money is transferred. And the bank's account, into which they'll happily deposit all the service fees charged on the transaction. Given (as everyone knows these days) that banks are extraordinarily stupid, let's say their system works like this:

$balance = "GET BALANCE FROM your ACCOUNT";
if ($balance < $amount_being_paid) {
    charge_huge_overdraft_fees();
}
$balance = $balance - $amount_being paid;
UPDATE your ACCOUNT SET BALANCE = $balance;

$balance = "GET BALANCE FROM receiver ACCOUNT"
charge_insane_transaction_fee();
$balance = $balance + $amount_being_paid
UPDATE receiver ACCOUNT SET BALANCE = $balance

Now, with no locks and no transactions, this system is vulnerable to various race conditions, the biggest of which is multiple payments being performed on your account, or the receiver's account in parallel. While your code has your balance retrieved and is doing the huge_overdraft_fees() and whatnot, it's entirely possible that some other payment will be running the same type of code in parallel. They'll be retrieve your balance (say, $100), do their transactions (take out the $20 you're paying, and the $30 they're screwing you over with), and now both code paths have two different balances: $80 and $70. Depending on which ones finishes last, you'll end up with either of those two balances in your account, instead of the $50 you should have ended up with ($100 - $20 - $30). In this case, "bank error in your favor".

Now, let's say you use locks. Your bill payment ($20) hits the pipe first, so it wins and locks your account record. Now you've got exclusive use, and can deduct the $20 from the balance, and write the new balance back in peace... and your account ends up with $80 as is expected. But... uhoh... You try to go update the receiver's account, and it's locked, and locked longer than the code allows, timing out your transaction... We're dealing with stupid banks, so instead of having proper error handling, the code just pulls an exit(), and your $20 vanishes into a puff of electrons. Now you're out $20, and you still owe $20 to the receiver, and your telephone gets repossessed.

So... enter transactions. You start a transaction, you debit your account $20, you try to credit the receiver with $20... and something blows up again. But this time, instead of exit(), the code can just do rollback, and poof, your $20 is magically added back to your account.

In the end, it boils down to this:

Locks keep anyone else from interfering with any database records you're dealing with. Transactions keep any "later" errors from interfering with "earlier" things you've done. Neither alone can guarantee that things work out ok in the end. But together, they do.

in tomorrow's lesson: The Joy of Deadlocks.

BEGIN...COMMIT vs LOCK TABLES

LOCK TABLES is global; it blocks all sessions until you use UNLOCK TABLES. It means that access to the tables you lock becomes serial. This is probably overkill for what you want to do.

Whereas InnoDB does row-level locking. Two concurrent sessions can both write to the table as long as they don't lock overlapping sets of rows.

Interaction between transactions and LOCK TABLES is a bit confusing. Both the InnoDB storage engine and the MySQL server lock a table when you use LOCK TABLES. When you COMMIT your transaction, InnoDB releases its table lock, but MySQL server does not do the same. You need to use UNLOCK TABLES to release it.

LOCK TABLES was a way to control access to tables and ensure repeatable reads, when MySQL's default storage engine was MyISAM, which does not support transactions. The cases where you should use LOCK TABLES in modern versions of MySQL are fewer and fewer. In fact, you will probably never need to use LOCK TABLES.

For more details, read https://dev.mysql.com/doc/refman/8.0/en/lock-tables.html

How to properly use transactions and locks to ensure database integrity?

1. Other user that tries to book same item at the same time will be handled correcly. Will his transaction T2 wait until T1 is done?

Yes. While active transaction keeps FOR UPDATE lock on a record, statements in other transactions that use any lock (SELECT ... FOR UPDATE, SELECT ... LOCK IN SHARE MODE, UPDATE, DELETE) will be suspended untill either active transaction commits or "Lock wait timeout" is exceeded.

2. Payment using PayPal or Stripe may take some time. Wouldn't this become a problem in terms of performance?

This will not be a problem, as this is exactly what is necessary. Checkout transactions should be executed sequentially, ie. latter checkout should not start before former finish.

3. Items availability will be shown correctly all the time (items should be available until checkout succeeds). Should these read-only selects use shared lock?

Repeatable reads isolation level ensures that changes made by a transaction are not visible until that transaction is commited. Therefore items availability will be displayed correctly. Nothing will be shown unavailable before it is actually paid for. No locks are necessary.

SELECT ... LOCK IN SHARE MODE would cause checkout transaction to wait until it is finished. This could slow down checkouts without giving any payoff.

4. Is it possible that MySql rollbacks transaction by itself? Is it generally better to retry automatically or display an error message and let user try again?

It is possible. Transaction may be rolled back when "Lock wait timeout" is exceeded or when deadlock happens. In that case it would be a good idea to retry it automatically.

By default suspended statements fail after 50s.

5. I guess its enough if I do SELECT ... FOR UPDATE on items table. This way both request caused by double click and other user will have to wait till transaction finishes. They'll wait because they also use FOR UPDATE. Meanwhile vanilla SELECT will just see a snapshot of db before the transaction, with no delay though, right?

Yes, SELECT ... FOR UPDATE on items table should be enough.

Yes, these selects wait, because FOR UPDATE is an exclusive lock.

Yes, simple SELECT will just grab value as it was before transaction started, this will happen immediately.

6. If I use JOIN in SELECT ... FOR UPDATE, will records in both tables be locked?

Yes, SELECT ... FOR UPDATE, SELECT ... LOCK IN SHARE MODE, UPDATE, DELETE lock all read records, so whatever we JOIN is included. See MySql Docs.

What's interesting (at least for me) everything that is scanned in the processing of the SQL statement gets locked, no matter wheter it is selected or not. For example WHERE id < 10 would lock also the record with id = 10!

If you have no indexes suitable for your statement and MySQL must scan the entire table to process the statement, every row of the table becomes locked, which in turn blocks all inserts by other users to the table. It is important to create good indexes so that your queries do not unnecessarily scan many rows.

Why do we need to lock a mysql table during SELECT when beginning a transaction causes any pending transaction to be committed?

Starting a transaction is not like a lock. That's a misunderstanding.

InnoDB implements "optimistic locking" by default. No locks are acquired by starting a transaction. When you execute a locking SQL statement, then locks are acquired as needed.

In your case, you should choose SELECT ... FOR UPDATE because that's what you are preparing to do — update the record after reading it.

Re your comments:

If you were to use SELECT LOCK IN SHARE MODE, it could lead to deadlocks.

SESSION 1                       SESSION 2

SELECT ... LOCK IN SHARE MODE
ok
                                SELECT ... LOCK IN SHARE MODE
                                ok
UPDATE
waits
                                UPDATE
                                waits

In the sequence above, you end up with two sessions waiting on each other, which is a deadlock. Neither one will give up, so MySQL has to kill one or the other.

Innodb Transaction or tables lock?

Plan A:

This assumes you can process N rows in less than, say, 2 seconds. You have N=50 -- this may be too large.

BEGIN;
SELECT ... LIMIT 50  FOR UPDATE;
... process ...
... gather a list of ids to delete ...
DELETE ... WHERE id IN (...)
COMMIT;

The more you grab, the faster it goes, but also the more likely it is to get deadlocks. When a deadlock occurs, simply start the transaction over. Also keep track of how often deadlocks occur, in order to tune the "50".

Plan B:

This is useful when the processing of an item takes "too long" for a transaction. I say 2 seconds is probably "too long".

Grab a row to process:
with autocommit=ON ...
UPDATE ... SET who_is_processing = $me,
               when_grabbed = NOW()
               id = LAST_INSERT_ID(id),
           WHERE when_grabbed IS NULL
             AND any-other-criteria
           LIMIT 1;
$id = SELECT LAST_INSERT_ID();

... process $id ...  (This may or may not involve transactions)

Release the row (or, in your case, delete it):
again, autocommit=ON suffices...
DELETE ... WHERE id = $id;

"Never" use table locks with InnoDB. (There may be use cases, but this is not one.)

Mysql: Transactions VS Locking Tables