SQL Server - Is Using @@ROWCOUNT Safe in Multithreaded Applications

SQL Server - is using @@ROWCOUNT safe in multithreaded applications?

Yes - it's safe. It always refers to the last statement executed on the current connection and in the current scope.

BUT

If you want to know the number of rows affected, save it to a variable first, because the IF statement itself resets @@ROWCOUNT:

INSERT INTO A (ID) VALUES (1)

DECLARE @rc INT = @@ROWCOUNT

IF @rc = 0
    PRINT 'NO ROWS AFFECTED'
ELSE
    SELECT @rc AS RowsAffected
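
To see the failure mode that warning describes, read @@ROWCOUNT again after the IF; per the note above, it no longer reflects the INSERT. A minimal sketch, reusing table A from the example:

INSERT INTO A (ID) VALUES (2)

IF @@ROWCOUNT = 0
    PRINT 'NO ROWS AFFECTED'

-- Per the note above, the IF evaluation has already reset @@ROWCOUNT,
-- so this does not report the INSERT's row count.
SELECT @@ROWCOUNT AS NotTheInsertCount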

Scope of @@ROWCOUNT?

Granted, the article is for SQL Server 2000, but one would hope the scope doesn't change between versions. According to the article How triggers affect ROWCOUNT and IDENTITY in SQL Server 2000, @@ROWCOUNT will not be affected by triggers.

Specifically:

It’s safe to use @@ROWCOUNT in SQL Server 2000 even when there is a trigger on the base table. The trigger will not skew your results; you’ll get what you expect. @@ROWCOUNT works correctly even when NOCOUNT is set.

So if you update three rows, and the trigger updates five rows elsewhere, you'll get a @@ROWCOUNT of 3.
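
A quick way to convince yourself is a sketch along these lines (the table and trigger names are hypothetical): the trigger touches five rows in another table, yet @@ROWCOUNT still reports the three rows of the outer UPDATE, per the article quoted above.

CREATE TABLE dbo.BaseTable  (ID int);
CREATE TABLE dbo.AuditTable (ID int);

INSERT INTO dbo.BaseTable  VALUES (1), (2), (3);
INSERT INTO dbo.AuditTable VALUES (1), (2), (3), (4), (5);
GO

CREATE TRIGGER trBaseTable ON dbo.BaseTable
AFTER UPDATE
AS
    UPDATE dbo.AuditTable SET ID = ID; -- updates five rows elsewhere
GO

UPDATE dbo.BaseTable SET ID = ID; -- updates three rows, firing the trigger

SELECT @@ROWCOUNT AS RowsAffected; -- expected: 3, per the article quoted above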

Also, from GBN's answer in SQL Server - is using @@ROWCOUNT safe in multithreaded applications?:

@@ROWCOUNT is both scope and connection safe.

In fact, it reads only the last statement row count for that connection and scope.

How does @@ROWCOUNT work if two queries are executed simultaneously in two query windows?

It will give the corresponding rows affected for each of the two queries. Two windows here are effectively two sessions, so they do not affect each other in any way whatsoever.

@@ROWCOUNT is both scope and connection safe.

In fact, it reads only the last statement row count for that connection and scope. The full rules are on MSDN (cursors, DML, EXECUTE, etc.).

To use it in subsequent statements, you need to store it in a local variable.

How to update large table with millions of rows in SQL Server?

  1. You should not be updating 10k rows in a set unless you are certain that the operation is getting Page Locks (due to multiple rows per page being part of the UPDATE operation). The issue is that Lock Escalation (from either Row or Page to Table locks) occurs at 5000 locks. So it is safest to keep it just below 5000, just in case the operation is using Row Locks.

  2. You should not be using SET ROWCOUNT to limit the number of rows that will be modified. There are two issues here:

    1. It has been deprecated since SQL Server 2005 was released (11 years ago):

      Using SET ROWCOUNT will not affect DELETE, INSERT, and UPDATE statements in a future release of SQL Server. Avoid using SET ROWCOUNT with DELETE, INSERT, and UPDATE statements in new development work, and plan to modify applications that currently use it. For a similar behavior, use the TOP syntax.

    2. It can affect more than just the statement you are dealing with:

      Setting the SET ROWCOUNT option causes most Transact-SQL statements to stop processing when they have been affected by the specified number of rows. This includes triggers. The ROWCOUNT option does not affect dynamic cursors, but it does limit the rowset of keyset and insensitive cursors. This option should be used with caution.

    Instead, use the TOP () clause.

  3. There is no purpose in having an explicit transaction here. It complicates the code and you have no handling for a ROLLBACK, which isn't even needed since each statement is its own transaction (i.e. auto-commit).

  4. Assuming you find a reason to keep the explicit transaction, then you do not have a TRY / CATCH structure. Please see my answer on DBA.StackExchange for a TRY / CATCH template that handles transactions:

    Are we required to handle Transaction in C# Code as well as in Store procedure

I suspect that the real WHERE clause is not being shown in the example code in the Question, so simply relying upon what has been shown, a better model (please see note below regarding performance) would be:

DECLARE @Rows INT,
        @BatchSize INT; -- keep below 5000 to be safe

SET @BatchSize = 2000;

SET @Rows = @BatchSize; -- initialize just to enter the loop

BEGIN TRY
    WHILE (@Rows = @BatchSize)
    BEGIN
        UPDATE TOP (@BatchSize) tab
        SET    tab.Value = 'abc1'
        FROM   TableName tab
        WHERE  tab.Parameter1 = 'abc'
        AND    tab.Parameter2 = 123
        AND    tab.Value <> 'abc1' COLLATE Latin1_General_100_BIN2;
        -- Use a binary Collation (ending in _BIN2, not _BIN) to make sure
        -- that you don't skip differences that compare the same due to
        -- insensitivity of case, accent, etc, or linguistic equivalence.

        SET @Rows = @@ROWCOUNT;
    END;
END TRY
BEGIN CATCH
    RAISERROR(stuff);
    RETURN;
END CATCH;

By testing @Rows against @BatchSize, you can avoid that final UPDATE query (in most cases) because the final set is typically some number of rows less than @BatchSize, in which case we know that there are no more to process (which is what you see in the output shown in your answer). Only in those cases where the final set of rows is equal to @BatchSize will this code run a final UPDATE affecting 0 rows.

I also added a condition to the WHERE clause to prevent rows that have already been updated from being updated again.

NOTE REGARDING PERFORMANCE

I emphasized "better" above (as in, "this is a better model") because this has several improvements over the O.P.'s original code, and works fine in many cases, but is not perfect for all cases. For tables of at least a certain size (which varies due to several factors so I can't be more specific), performance will degrade as there are fewer rows to fix if either:

  1. there is no index to support the query, or
  2. there is an index, but at least one column in the WHERE clause is a string data type that does not use a binary collation, hence a COLLATE clause is added to the query here to force the binary collation, and doing so invalidates the index (for this particular query).

This is the situation that @mikesigs encountered, thus requiring a different approach. The updated method copies the IDs for all rows to be updated into a temporary table, then uses that temp table to INNER JOIN to the table being updated on the clustered index key column(s). (It's important to capture and join on the clustered index columns, whether or not those are the primary key columns!).

Please see @mikesigs' answer below for details. The approach shown in that answer is a very effective pattern that I have used myself on many occasions. The only changes I would make are:

  1. Explicitly create the #targetIds table rather than using SELECT INTO...
  2. For the #targetIds table, declare a clustered primary key on the column(s).
  3. For the #batchIds table, declare a clustered primary key on the column(s).
  4. For inserting into #targetIds, use INSERT INTO #targetIds (column_name(s)) SELECT and remove the ORDER BY as it's unnecessary.

So, if you don't have an index that can be used for this operation, and can't temporarily create one that will actually work (a filtered index might work, depending on your WHERE clause for the UPDATE query), then try the approach shown in @mikesigs' answer (and if you use that solution, please up-vote it).
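
For reference, here is a minimal sketch of that temp-table pattern with the four changes applied; the table, column, and filter names are assumptions carried over from the earlier example, not @mikesigs' exact code:

CREATE TABLE #targetIds (Id INT NOT NULL PRIMARY KEY CLUSTERED);
CREATE TABLE #batchIds  (Id INT NOT NULL PRIMARY KEY CLUSTERED);

-- Capture the clustered index key of every row that needs fixing.
INSERT INTO #targetIds (Id)
SELECT tab.Id
FROM   TableName tab
WHERE  tab.Parameter1 = 'abc'
AND    tab.Parameter2 = 123
AND    tab.Value <> 'abc1' COLLATE Latin1_General_100_BIN2;

DECLARE @BatchSize INT = 2000; -- keep below 5000 (lock escalation)

WHILE EXISTS (SELECT 1 FROM #targetIds)
BEGIN
    -- Move one batch of keys out of the queue...
    DELETE TOP (@BatchSize) FROM #targetIds
    OUTPUT Deleted.Id INTO #batchIds (Id);

    -- ...and update only those rows via the clustered index.
    UPDATE tab
    SET    tab.Value = 'abc1'
    FROM   TableName tab
    INNER JOIN #batchIds b ON b.Id = tab.Id;

    TRUNCATE TABLE #batchIds;
END;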

Solutions for INSERT OR UPDATE on SQL Server

Don't forget about transactions. Performance is good, but the simple (IF EXISTS..) approach is very dangerous.

When multiple threads try to perform an insert-or-update, you can easily get a primary key violation.

The solutions provided by @Beau Crawford & @Esteban show the general idea but are error-prone.

To avoid deadlocks and PK violations you can use something like this:

begin tran

if exists (select * from table with (updlock, serializable) where key = @key)
begin
    update table set ...
    where key = @key
end
else
begin
    insert into table (key, ...)
    values (@key, ...)
end

commit tran

or

begin tran

update table with (serializable) set ...
where key = @key

if @@rowcount = 0
begin
    insert into table (key, ...) values (@key, ...)
end

commit tran
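
As a concrete illustration of the second pattern, here is a minimal sketch with a hypothetical dbo.Settings table (the table, column, and variable names are assumptions, not from the original answer):

CREATE TABLE dbo.Settings
(
    SettingKey   varchar(50)  NOT NULL PRIMARY KEY,
    SettingValue varchar(100) NOT NULL
);
GO

DECLARE @Key   varchar(50)  = 'retry-count',
        @Value varchar(100) = '5';

BEGIN TRAN;

-- The serializable hint holds the key-range lock until COMMIT,
-- so no other session can insert the same key in between.
UPDATE dbo.Settings WITH (SERIALIZABLE)
SET    SettingValue = @Value
WHERE  SettingKey = @Key;

IF @@ROWCOUNT = 0
BEGIN
    INSERT INTO dbo.Settings (SettingKey, SettingValue)
    VALUES (@Key, @Value);
END

COMMIT TRAN;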

Reflow INT UNSIGNED Series

You could reflow the rows like this:

UPDATE BoxPeg bp
INNER JOIN (
    SELECT *, @n := @n + 1 AS newpos
    FROM BoxPeg, (SELECT @n := -1) x
    WHERE box = '{BOXID}'
    ORDER BY position
) s
    ON  bp.box = s.box
    AND bp.peg = s.peg
SET bp.position = s.newpos
WHERE bp.position <> s.newpos
;

The subselect uses a variable to calculate new position values for a subset defined by a specific {BOXID}. The UPDATE statement then joins the subselect to BoxPeg and updates the matching rows with the new values. (Thus, when joined to BoxPeg, the subselect is also acting as a filter, limiting the updated rows to those belonging to the given box.)

A SQL Fiddle demo of this query is available.

Note that if the position column was part of a unique constraint, this method might not always work. Although the new values this query generates are unique, MySQL seems to evaluate unique constraints during the update rather than at the end of the statement's execution. Thus, depending on the order in which the rows would be updated, the query might fail if a row was receiving a value identical to that in another, not yet updated, row.

Concurrency handling of SQL transactions

First of all, the approach you describe in your question is, in my opinion, the best way for an ASP.NET application with MS SQL as the database. There is no locking in the database, which is perfect for permanently disconnected clients like web clients.

As one can read from some answers, there is a misunderstanding in the terminology. We all mean using Microsoft SQL Server 2008 or higher to hold the database. If you open the topic "rowversion (Transact-SQL)" in the MS SQL Server 2008 documentation, you will find the following:

"timestamp is the synonym for the
rowversion data type and is subject to
the behavior of data type synonym." …
"The timestamp syntax is deprecated.
This feature will be removed in a
future version of Microsoft SQL
Server. Avoid using this feature in
new development work, and plan to
modify applications that currently use
this feature."

So the timestamp data type is a synonym for the rowversion data type in MS SQL. It holds a 64-bit counter which exists internally in every database and can be read as @@DBTS. After a modification of any row in any table of the database, the counter is incremented.
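
For instance, you can inspect the database's current counter directly (cast to bigint for readability):

SELECT CAST(@@DBTS AS bigint) AS CurrentRowVersionCounter;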

As I read your question, I took "TimeStamp" to be a column name of the rowversion data type. I personally prefer the name RowUpdateTimeStamp. In AzManDB (see Microsoft Authorization Manager with the Store as DB) one can see such a name. Sometimes a ChildUpdateTimeStamp column was also used to trace hierarchical RowUpdateTimeStamp structures (with respect to triggers).

I implemented this approach in my last project and am very happy with it. Generally, you do the following:

  1. Add a RowUpdateTimeStamp column with the rowversion type to every table of your database (it will be shown in Microsoft SQL Management Studio as timestamp, which is the same).
  2. Construct all your SQL SELECT queries that send results to the client so that they send an additional RowVersion value together with the main data. If you have a SELECT with JOINs, send the maximum RowUpdateTimeStamp value of the joined tables, like
SELECT s.Id AS Id
,s.Name AS SoftwareName
,m.Name AS ManufacturerName
,CASE WHEN s.RowUpdateTimeStamp > m.RowUpdateTimeStamp
THEN s.RowUpdateTimeStamp
ELSE m.RowUpdateTimeStamp
END AS RowUpdateTimeStamp
FROM dbo.Software AS s
INNER JOIN dbo.Manufacturer AS m ON s.Manufacturer_Id=m.Id

Or cast the data like the following:

SELECT s.Id AS Id
,s.Name AS SoftwareName
,m.Name AS ManufacturerName
,CASE WHEN s.RowUpdateTimeStamp > m.RowUpdateTimeStamp
THEN CAST(s.RowUpdateTimeStamp AS bigint)
ELSE CAST(m.RowUpdateTimeStamp AS bigint)
END AS RowUpdateTimeStamp
FROM dbo.Software AS s
INNER JOIN dbo.Manufacturer AS m ON s.Manufacturer_Id=m.Id

to hold RowUpdateTimeStamp as bigint, which corresponds to the ulong data type of C#. If you use OUTER JOINs or JOINs over many tables, the MAX(RowUpdateTimeStamp) construct across all tables looks a little more complex. Because MS SQL doesn't support a function like MAX(a,b,c,d,e), the corresponding construct could look like the following:

(SELECT MAX(rv)
FROM (SELECT table1.RowUpdateTimeStamp AS rv
UNION ALL SELECT table2.RowUpdateTimeStamp
UNION ALL SELECT table3.RowUpdateTimeStamp
UNION ALL SELECT table4.RowUpdateTimeStamp
UNION ALL SELECT table5.RowUpdateTimeStamp) AS maxrv) AS RowUpdateTimeStamp
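
For example, embedded in the SELECT list of a three-table join, it could look like the sketch below; the Order/Customer/Product tables are placeholders, not from the original answer:

SELECT o.Id AS OrderId
    ,c.Name AS CustomerName
    ,p.Name AS ProductName
    ,(SELECT MAX(rv)
      FROM (SELECT CAST(o.RowUpdateTimeStamp AS bigint) AS rv
            UNION ALL SELECT CAST(c.RowUpdateTimeStamp AS bigint)
            UNION ALL SELECT CAST(p.RowUpdateTimeStamp AS bigint)) AS maxrv
     ) AS RowUpdateTimeStamp
FROM dbo.[Order] AS o
INNER JOIN dbo.Customer AS c ON o.Customer_Id = c.Id
INNER JOIN dbo.Product AS p ON o.Product_Id = p.Id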

  3. All disconnected clients (web clients) receive and hold not only some rows of data, but also the RowVersion (type ulong) of the data rows.
  4. When a disconnected client tries to modify data, it should send the RowVersion corresponding to the original data back to the server. The spSoftwareUpdate stored procedure could look like
CREATE PROCEDURE dbo.spSoftwareUpdate
    @Id int,
    @SoftwareName varchar(100),
    @originalRowUpdateTimeStamp bigint, -- used for optimistic concurrency mechanism
    @NewRowUpdateTimeStamp bigint OUTPUT
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    -- ExecuteNonQuery() returns -1, but it is not an error;
    -- one should test @NewRowUpdateTimeStamp for DBNull.
    SET NOCOUNT ON;

    UPDATE dbo.Software
    SET    Name = @SoftwareName
    WHERE  Id = @Id AND RowUpdateTimeStamp <= @originalRowUpdateTimeStamp;

    SET @NewRowUpdateTimeStamp = (SELECT RowUpdateTimeStamp
                                  FROM dbo.Software
                                  WHERE (@@ROWCOUNT > 0) AND (Id = @Id));
END
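
A minimal usage sketch (the parameter values are assumptions): call the procedure with the row version the client originally read, and treat a NULL output as a concurrency conflict.

DECLARE @NewVersion bigint;

EXEC dbo.spSoftwareUpdate
    @Id = 42,                             -- hypothetical row id
    @SoftwareName = 'New name',
    @originalRowUpdateTimeStamp = 123456, -- row version the client read earlier
    @NewRowUpdateTimeStamp = @NewVersion OUTPUT;

IF @NewVersion IS NULL
    PRINT 'Concurrency conflict: the row was changed or deleted by another session.';
ELSE
    PRINT 'Update succeeded; send the new row version back to the client.';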

The code of the dbo.spSoftwareDelete stored procedure looks much the same. If you don't switch on NOCOUNT, a DBConcurrencyException can be generated automatically in a lot of scenarios. Visual Studio gives you the possibility to use optimistic concurrency via the "Use optimistic concurrency" checkbox in the Advanced Options of the TableAdapter or DataAdapter.

If you look at the dbo.spSoftwareUpdate stored procedure carefully, you will find that I use RowUpdateTimeStamp <= @originalRowUpdateTimeStamp in the WHERE clause instead of RowUpdateTimeStamp = @originalRowUpdateTimeStamp. I do so because the value of @originalRowUpdateTimeStamp which the client has is typically constructed as MAX(RowUpdateTimeStamp) over more than one table, so it can happen that RowUpdateTimeStamp < @originalRowUpdateTimeStamp. Either you use strict equality = and reproduce here the same complex JOIN statement as in the SELECT, or you use the <= construct like me and stay exactly as safe as before.

By the way, one can construct a very good value for an ETag based on RowUpdateTimeStamp, which can be sent in an HTTP header to the client together with the data. With the ETag you can implement intelligent data caching on the client side.
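
For instance (a sketch assuming the bigint casting shown above, with the same dbo.Software table), the ETag could simply be the stringified counter:

SELECT '"' + CONVERT(varchar(20), CAST(RowUpdateTimeStamp AS bigint)) + '"' AS ETag
FROM dbo.Software
WHERE Id = @Id;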

I can't write the whole code here, but you can find a lot of examples on the Internet. I want only to repeat one more time that, in my opinion, optimistic concurrency based on rowversion is the best way for most ASP.NET scenarios.

SQL Server - Return value after INSERT

No need for a separate SELECT...

INSERT INTO table (name)
OUTPUT Inserted.ID
VALUES('bob');

This works for non-IDENTITY columns (such as GUIDs) too.
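
One caveat: if the target table has an enabled trigger, SQL Server rejects a bare OUTPUT clause, so capture the values into a table variable instead ('table' here is the same placeholder name as in the example above):

DECLARE @NewIds TABLE (ID int);

INSERT INTO table (name)
OUTPUT Inserted.ID INTO @NewIds (ID)
VALUES ('bob');

SELECT ID FROM @NewIds;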


