Are SQL queries guaranteed to execute atomically when using UNION?
Yes the statement is atomic but yes the data can change between the 2 reads.
Read Committed
only guarantees that you don't read dirty data it promises nothing else about consistency of reads for that you would need a higher isolation level.
As you said that you would accept a SQL Server Example...
Connection 1
(Assumes under pessimistic read committed isolation level)
CREATE TABLE employee
(
name VARCHAR(50),
DepartmentID INT
)
CREATE TABLE department
(
DepartmentID INT
)
INSERT INTO department VALUES (1)
INSERT INTO employee VALUES ('bob',1)
declare @employee TABLE
(
name VARCHAR(50),
DepartmentID INT
)
WHILE ((SELECT COUNT(*) FROM @employee) < 2)
BEGIN
DELETE FROM @employee
INSERT INTO @employee
SELECT employee.*
FROM employee
LEFT JOIN department
ON employee.DepartmentID = department.DepartmentID
UNION
SELECT employee.*
FROM employee
RIGHT JOIN department
ON employee.DepartmentID = department.DepartmentID
END;
SELECT * FROM @employee
Connection 2
while (1=1)
UPDATE employee SET name = CASE WHEN name = 'bob' THEN 'bill' else 'bob' END
Now go back to connection 1
name DepartmentID
-------------------------------------------------- ------------
bill 1
bob 1
(Remember to switch back to Connection 2 to kill it!)
The specific documentation covering this READ COMMITED
behaviour is here
The shared lock type determines when
it will be released. Row locks are
released before the next row is
processed. Page locks are released
when the next page is read, and table
locks are released when the statement
finishes.
Is update with nested select atomic operation?
SELECT
places shared locks on the rows read which then can be lifted in READ COMMITED
isolation mode.
UPDATE
places the update locks later promoted to exclusive locks. They are not lifted until the end of the transaction.
You should make the locks to retain as soon as they are placed.
You can do it by making the transaction isolation level REPEATABLE READ
which will retain the shared locks until the end of the transaction and will prevent UPDATE
part from locking these rows.
Alternatively, you can rewrite your query as this:
WITH q AS
(
SELECT TOP 10000 *
FROM mytable WITH (ROWLOCK, READPAST)
WHERE batch_id IS NULL
ORDER BY
date
)
UPDATE q
SET batch_id = @myid
, which will just skip the locked rows.
Are queries inside a MySQL transaction guaranteed to be executed together?
You still have a possibility of a race condition, even though you execute the SELECT followed by the UPDATE in a single transaction. SELECT by itself does not lock anything, so you could have two concurrent sessions both SELECT and get the same id. Then both would attempt to UPDATE, but only one would "win" - the other would have to wait.
To get around this, use the SELECT...FOR UPDATE clause, which creates a lock on the rows it returns.
Prepare Transaction...
$id = SELECT id
FROM companies
WHERE processing = 0
ORDER BY last_crawled ASC
LIMIT 1
FOR UPDATE;
This means that the lock is created as the row is selected. This is atomic, which means no other session can sneak in and get a lock on the same row. If they try, their transaction will block on the SELECT.
UPDATE companies
SET processing = 1
WHERE id = $id;
Commit Transaction
I changed your "execute transaction" pseudocode to "commit transaction." Statements within a transaction execute immediately, which means they create locks and so on. Then when you COMMIT, the locks are released and any changes are committed. Committed means they can't be rolled back, and they are visible to other transactions.
Here's a quick example of using mysqli to accomplish this:
$mysqli = new mysqli(...);
$mysqli->report_mode = MYSQLI_REPORT_STRICT; /* throw exception on error */
$mysqli->begin_transaction();
$sql = "SELECT id
FROM companies
WHERE processing = 0
ORDER BY last_crawled ASC
LIMIT 1
FOR UPDATE";
$result = $mysqli->query($sql);
while ($row = $result->fetch_array(MYSQLI_ASSOC)) {
$id = $row["id"];
}
$sql = "UPDATE companies
SET processing = 1
WHERE id = ?";
$stmt = $mysqli->prepare($sql);
$stmt->bind_param("i", $id);
$stmt->execute();
$mysqli->commit();
Re your comment:
I tried an experiment and created a table companies
, filled it with 512 rows, then started a transaction and issues the SELECT...FOR UPDATE
statement above. I did this in the mysql client, no need to write PHP code.
Then, before committing my transaction, I examined the locks reported:
mysql> show engine innodb status\G
=====================================
2013-12-04 16:01:28 7f6a00117700 INNODB MONITOR OUTPUT
=====================================
...
---TRANSACTION 30012, ACTIVE 2 sec
2 lock struct(s), heap size 376, 513 row lock(s)
...
Despite using LIMIT 1
, this report shows transaction appears to lock every row in the table (plus 1, for some reason).
So you're right, if you have hundreds of requests per second, it's likely that the transactions are queuing up. You should be able to verify this by watching SHOW PROCESSLIST
and seeing many processes stuck in a state of Locked
(i.e. waiting for access to rows that another thread has locked).
If you have hundreds of requests per second, you may have outgrown the ability for an RDBMS to function as a fake message queue. This isn't what an RDBMS is good at.
There are a variety of scalable message queue frameworks with good integration with PHP, like RabbitMQ, STOMP, AMQP, Gearman, Beanstalk.
Check out http://www.slideshare.net/mwillbanks/message-queues-a-primer-international-php-conference-fall-2012
T-SQL Is a sub query for an Update restriction Atomic with the update?
If I understand the motivation for the question you want to avoid the possibility that two concurrent transactions could both execute the sub query to get the top N rows to process then proceed to update the same rows?
In that case I'd use this approach.
;WITH cte As
(
SELECT TOP(@numberToProcess)
*
FROM
ToBeProcessed WITH(UPDLOCK,ROWLOCK,READPAST)
WHERE
ToBeProcessed.IsBeingProcessed = 0
ORDER BY
ToBeProcessed.Id,
ToBeProcessed.Priority DESC
)
UPDATE
cte
SET
IsBeingProcessed = 1
OUTPUT
INSERTED.Id
INTO
#IdsToProcess
I was a bit uncertain earlier whether SQL Server would take U
locks when processing your version with the sub query thus blocking two concurrent transactions from reading the same TOP N
rows. This does not appear to be the case.
Test Table
CREATE TABLE JobsToProcess
(
priority INT IDENTITY(1,1),
isprocessed BIT ,
number INT
)
INSERT INTO JobsToProcess
SELECT TOP (1000000) 0,0
FROM master..spt_values v1, master..spt_values v2
Test Script (Run in 2 concurrent SSMS sessions)
BEGIN TRY
DECLARE @FinishedMessage VARBINARY (128) = CAST('TestFinished' AS VARBINARY (128))
DECLARE @SynchMessage VARBINARY (128) = CAST('TestSynchronising' AS VARBINARY (128))
SET CONTEXT_INFO @SynchMessage
DECLARE @OtherSpid int
WHILE(@OtherSpid IS NULL)
SELECT @OtherSpid=spid
FROM sys.sysprocesses
WHERE context_info=@SynchMessage and spid<>@@SPID
SELECT @OtherSpid
DECLARE @increment INT = @@spid
DECLARE @number INT = @increment
WHILE (@number = @increment AND NOT EXISTS(SELECT * FROM sys.sysprocesses WHERE context_info=@FinishedMessage))
UPDATE JobsToProcess
SET @number=number +=@increment,isprocessed=1
WHERE priority = (SELECT TOP 1 priority
FROM JobsToProcess
WHERE isprocessed=0
ORDER BY priority DESC)
SELECT *
FROM JobsToProcess
WHERE number not in (0,@OtherSpid,@@spid)
SET CONTEXT_INFO @FinishedMessage
END TRY
BEGIN CATCH
SET CONTEXT_INFO @FinishedMessage
SELECT ERROR_MESSAGE(), ERROR_NUMBER()
END CATCH
Almost immediately execution stops as both concurrent transactions update the same row so the S
locks taken whilst identifying the TOP 1 priority
must get released before it aquires a U
lock then the 2 transactions proceed to get the row U
and X
lock in sequence.
If a CI is added ALTER TABLE JobsToProcess ADD PRIMARY KEY CLUSTERED (priority)
then deadlock occurs almost immediately instead as in this case the row S
lock doesn't get released, one transaction aquires a U
lock on the row and waits to convert it to an X
lock and the other transaction is still waiting to convert its S
lock to a U
lock.
If the query above is changed to use MIN
rather than TOP
WHERE priority = (SELECT MIN(priority)
FROM JobsToProcess
WHERE isprocessed=0
)
Then SQL Server manages to completely eliminate the sub query from the plan and takes U
locks all the way.
SQLite query problem, and atomicity of transactions
If I were you. I made a unique 'umembers' table.
I filled it with triggers. (members1 and members2 too) You can figure out the update trigger.
CREATE TRIGGER IF NOT EXISTS members1trigger
BEFORE INSERT ON members1
BEGIN
INSERT INTO umembers VALUES(new.uname, other properties);
END;
Then your update is soo
#!/usr/bin/env bash
sqlite3 test.db <<EOF
update members1 set uname=$2
where uname='$1' and not exists (select 1 from umembers where uname= $2);
EOF
Of course you should create unique constraint on umembers.
And if there are paralell update then second perhaps can throw exception.
But I think index locks it, and can't happen.Security says use case insensitive logins.
https://security.stackexchange.com/questions/241701/case-sensitive-logins
Do databases always lock non-existent rows after a query or update?
Answering my own question since there seems to be a lot of confusion around the topic. It seems that:
-- BAD! DO NOT DO THIS! --
insert customer (email, count)
select 'foo@example.com', 0
where not exists (
select 1 from customer
where email = 'foo@example.com'
)
is open to race-conditions (see Only inserting a row if it's not already there). From what I've been able to gather, the only portable solution to this problem:
- Pick a key to merge against. This could be the primary key, or another unique key, but it must have a unique constraint.
- Try to
insert
a new row. You must catch the error that will occur if the row already exists. - The hard part is over. At this point, the row is guaranteed to exist and you are protected from race-conditions by the fact that you are holding a write-lock on it (due to the
insert
from the previous step). - Go ahead and
update
if needed orselect
its primary key.
Oracle Union with order by clause issue
The underlying obstacle is that a SQL dataset is inherently UNordered. As soon as you UNION two datasets you lose any guarantee of ordering previously present.
You can often get away with the following structure, but it is STILL not guaranteed...
SELECT
*
FROM
(
(
select to_char(first_name||'('||substr(last_name,0,1)||')')
from employees
order by first_name
)
Individuals
UNION ALL
(
select to_char('There are total '||count(job_id)||' '||lower(job_id)||'s')
from employees
group by job_id
order by count(job_id),job_id
)
Totals
)
Combined
;
In practice you Often get what you want with this structure.
The brackets ensure the ordering is done before the UNION ALL and the database engine PROBABLY won't screw with the ordering.
But it MIGHT. The only way to Guarantee the order of the results is to put an ORDER BY on the outer query. Something like the following normally works fairly well...
SELECT
rowdata
FROM
(
(
select 1 AS myset, rownum AS myrow, to_char(first_name||'('||substr(last_name,0,1)||')') AS rowdata
from employees
order by first_name
)
Individuals
UNION ALL
(
select 2 AS myset, rownum AS myrow, to_char('There are total '||count(job_id)||' '||lower(job_id)||'s')
from employees
group by job_id
order by count(job_id),job_id
)
Totals
)
Combined
ORDER BY
myset, myrow
;
I'm on my phone, so there may be typos, but the sentiment is there...
Use ROWNUM or ROW_NUMBER() to generate an extra field in your data sets. Then union them. Then order by your new field(s).
Related Topics
Query for Comma-Separated Ids to Comma-Separated Values
Sql Server Delete Is Slower with Indexes
Database View Does Not Reflect The Data in The Underying Table
Sql How to Remove Duplicates Within Select Query
What Is The Meaning of Kanatype Sensitive Ks and Width Sensitive
Is It Faster to Check If Length = 0 Than to Compare It to an Empty String
Indexed View Vs Indexes on Table
Join Table Twice - on Two Different Columns of The Same Table
Sql Server Database Change Workflow Best Practices
How to Pass Schema as Parameter to a Stored Procedure in SQL Server
Replacing Certain Character in Email Addresses with '*' in an SQL Query
Cannot Connect to SQL Server Express from Ssms
Sql Two Tables and Creating a Link Table
Sql Server Reverse Order After Using Desc
Sql 2005 How to Use Keyword Like in a Case Statement
Difference Between <> and != in Sql
Sql Server - "For JSON Path" Statement Does Not Return More Than 2984 Lines of JSON String