Atomically Mark and Return a Group of Rows in Database

Atomically mark and return a group of rows in database

Your table of jobs is a queue. Writing user tables backed up queues is a notoriously error prone as it leads to deadlocks and concurency issues.

The simplest thing would be to drop the user table and use a true queue instead. This will give you deadlock free concurency free queue on system tested and validated code base. The problem is that the whole paradigm around queues changes from INSERT and DELETE/UPDATE to SEND/RECEIVE. On the other hand with built-in queue you get some very powerfull free goodies, namely Activation and correlated items locking.

If you want to continue down the path of user table backed queues then the second most important trick in writing user tables queues is to use UPDATE ... OUTPUT:

WITH cte AS (
  SELECT TOP(20) status, id, ...
  FROM table WITH (ROWLOCK, READPAST, UPDLOCK)
  WHERE status = 'new'
  ORDER BY enqueue_time)
UPDATE cte
  SET status = 'processing'
OUTPUT
  INSERTED.id, ...

The CTE syntax is just for convenience of placing the TOP and ORDER BY properly, the query can be written using derived tables just as esily. You cannot use straight UPDATE ... TOP because UPDATE does not support an ORDER BY and you require this to satisfy the 'oldest' part of your requirement. The lock hints are needed to facilitate high concurency between parallel processing threads.

I said this is the second most important trick. The most important is how you organize the table. For a queue it must be clustered by (status, enqueue_time). If you don't organize the table properly you'll end up with deadlocks. Pre-emptive comment: fragmentation is irelevant in this scenario.

how to atomically claim a row or resource using UPDATE in mysql

UPDATE cars SET user = 'bob' WHERE id = 123 AND user IS NULL;

The update query returns the number of changed rows. If it has not updated any, you know the car has already been claimed by someone else.

Alternatively, you can use SELECT ... FOR UPDATE.

How to Select a record from the database and update it in an atomic query

You can use OUTPUT clause:

UPDATE [table]
SET Status = 'InProcess'
OUTPUT deleted.*
WHERE Status = 'Pending'

Here you can use inserted table name if you want to get row with new status or deleted when old.

SQL Server Process Queue Race Condition

Edit:

I googled to check my answer: "Processing Data Queues in SQL Server with READPAST and UPDLOCK". It's been years since I read about and played with this solution.

Original:

If you use the READPAST hint, then locked rows are skipped. You've used ROWLOCK so you should avoid lock escalation. You also need UPDLOCK, as I found out.

So process 1 locks 20 rows, process 2 will take the next 20, process 3 takes rows 41 to 60, etc

The update can also be written like this:

UPDATE TOP (20)
    foo
SET
    ProcessorID = @PROCID
FROM
    OrderTable foo WITH (ROWLOCK, READPAST, UPDLOCK)
WHERE
    ProcessorID = 0

Refresh, Oct 2011

This can be done more elegantly with the OUTPUT clause if you need a SELECT and an UPDATE in one go.

Is a single SQL Server statement atomic and consistent?

I've been operating under the assumption that a single statement in SQL Server is consistent

That assumption is wrong. The following two transactions have identical locking semantics:

STATEMENT

BEGIN TRAN; STATEMENT; COMMIT

No difference at all. Single statements and auto-commits do not change anything.

So merging all logic into one statement does not help (if it does, it was by accident because the plan changed).

Let's fix the problem at hand. SERIALIZABLE will fix the inconsistency you are seeing because it guarantees that your transactions behave as if they executed single-threadedly. Equivalently, they behave as if they executed instantly.

You will be getting deadlocks. If you are ok with a retry loop, you're done at this point.

If you want to invest more time, apply locking hints to force exclusive access to the relevant data:

UPDATE Gifts  -- U-locked anyway
SET GivenAway = 1
WHERE GiftID = (
   SELECT TOP 1 GiftID
   FROM Gifts WITH (UPDLOCK, HOLDLOCK) --this normally just S-locks.
   WHERE g2.GivenAway = 0
    AND (SELECT COUNT(*) FROM Gifts g2 WITH (UPDLOCK, HOLDLOCK) WHERE g2.GivenAway = 1) < 5
   ORDER BY g2.GiftValue DESC
)

You will now see reduced concurrency. That might be totally fine depending on your load.

The very nature of your problem makes achieving concurrency hard. If you require a solution for that we'd need to apply more invasive techniques.

You can simplify the UPDATE a bit:

WITH g AS (
   SELECT TOP 1 Gifts.*
   FROM Gifts
   WHERE g2.GivenAway = 0
    AND (SELECT COUNT(*) FROM Gifts g2 WITH (UPDLOCK, HOLDLOCK) WHERE g2.GivenAway = 1) < 5
   ORDER BY g2.GiftValue DESC
)
UPDATE g  -- U-locked anyway
SET GivenAway = 1

This gets rid of one unnecessary join.

What is atomicity in dbms

"Every column should be atomic."

Chris Date says, "Please note very carefully that it is not just simple things like the integer 3 that are legitimate values. On the contrary, values can be arbitrarily complex; for example, a value might be a geometric point, or a polygon, or an X ray, or an XML document, or a fingerprint, or an array, or a stack, or a list, or a relation (and so on)."[1]

He also says, "A relvar is in 1NF if and only if, in every legal value of that relvar, every tuple contains exactly one value for each attribute."[2]

He generally discourages the use of the word atomic, because it has confusing connotations. Single value is probably a better term to use.

For example, a date like '2014-01-01' is a single value. It's not indivisible; on the contrary, it quite clearly is divisible. But the dbms does one of two things with single values that have parts. The dbms either returns those values as a whole, or the dbms provides functions to manipulate the parts. (Clients don't have to write code to manipulate the parts.)[3]

In the case of dates, SQL can

return dates as a whole (SELECT CURRENT_DATE),
return one or more parts of a date (EXTRACT(YEAR FROM CURRENT_DATE)),
add and subtract intervals (CURRENT_DATE + INTERVAL '1' DAY),
subtract one date from another (CURRENT_DATE - DATE '2014-01-01'),

and so on. In this (narrow) respect, SQL is quite relational.

An Introduction to Database Systems, 8th ed, p 113. Emphasis in the original.
Ibid, p 358.
In the case of a "user-defined" type, the "user" is presumed to be a database programmer, not a client of the database.

Is there a way to SELECT and UPDATE rows at the same time?

Consider looking at the OUTPUT clause:

USE AdventureWorks2012;  
GO  

DECLARE @MyTableVar table(  
    EmpID int NOT NULL,  
    OldVacationHours int,  
    NewVacationHours int,  
    ModifiedDate datetime);  

UPDATE TOP (10) HumanResources.Employee  
SET VacationHours = VacationHours * 1.25,  
    ModifiedDate = GETDATE()   
OUTPUT inserted.BusinessEntityID,  
       deleted.VacationHours,  
       inserted.VacationHours,  
       inserted.ModifiedDate  
INTO @MyTableVar;  

--Display the result set of the table variable.  
SELECT EmpID, OldVacationHours, NewVacationHours, ModifiedDate  
FROM @MyTableVar;  
GO  
--Display the result set of the table.  
SELECT TOP (10) BusinessEntityID, VacationHours, ModifiedDate  
FROM HumanResources.Employee;  
GO

How to atomically delete keys matching a pattern using Redis

Starting with redis 2.6.0, you can run lua scripts, which execute atomically. I have never written one, but I think it would look something like this

EVAL "return redis.call('del', unpack(redis.call('keys', ARGV[1])))" 0 prefix:[YOUR_PREFIX e.g delete_me_*]

Warning: As the Redis document says, because of performance maters, keys
command should not use for regular operations in production, this
command is intended for debugging and special operations. read
more

See the EVAL documentation.

Atomically Mark and Return a Group of Rows in Database