Why Are Relational Set-Based Queries Better Than Cursors

Cursor vs. set-based solutions for data manipulation tasks

The code above produces the error message when run in SQL Server 2008 R2, but runs successfully in SQL Server 2012.

OFFSET and FETCH were introduced in SQL Server 2012.
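The code the answer refers to is not reproduced here, so the following is only a minimal sketch of what an OFFSET/FETCH query looks like. The table variable `@Orders` and its columns are hypothetical names made up for illustration.

```sql
-- Hypothetical table variable, used only so the sketch is self-contained.
DECLARE @Orders TABLE (OrderID INT PRIMARY KEY, Amount MONEY);

INSERT INTO @Orders (OrderID, Amount)
VALUES (1, 10.00), (2, 20.00), (3, 30.00), (4, 40.00), (5, 50.00);

-- Skip the first 2 rows and return the next 2.
-- OFFSET/FETCH is part of the ORDER BY clause, so ORDER BY is required.
SELECT OrderID, Amount
FROM @Orders
ORDER BY OrderID
OFFSET 2 ROWS
FETCH NEXT 2 ROWS ONLY;
```

On SQL Server 2012 or later this returns the rows with OrderID 3 and 4. Earlier versions do not recognise the OFFSET clause and reject the statement.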

Why is it considered bad practice to use cursors in SQL Server?

Because cursors take up memory and create locks.

What you are really doing is attempting to force set-based technology into non-set-based functionality. In all fairness, I should point out that cursors do have their uses, but they are frowned upon because many people who are not used to set-based solutions reach for a cursor instead of working out the set-based solution.

But when you open a cursor, you are basically loading those rows into memory and locking them, which can block other sessions. Then, as you cycle through the cursor, you make changes to other tables while still holding all of the cursor's memory and locks.

All of which has the potential to cause performance issues for other users.

So, as a general rule, cursors are frowned upon, especially when a cursor is the first solution reached for in solving a problem.
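To make the contrast concrete, here is a small sketch of the same change written both ways. The temp table `#Product`, its columns, and the 10% price increase are assumptions invented for this example, not something taken from the question.

```sql
-- Hypothetical temp table, for illustration only.
CREATE TABLE #Product (
    ProductID    INT PRIMARY KEY,
    Price        DECIMAL(10, 2),
    Discontinued BIT
);
INSERT INTO #Product (ProductID, Price, Discontinued)
VALUES (1, 100.00, 0), (2, 200.00, 1), (3, 300.00, 0);

-- Row-by-row version: one UPDATE statement per fetched row.
DECLARE @id INT;
DECLARE price_cursor CURSOR LOCAL STATIC FOR
    SELECT ProductID FROM #Product WHERE Discontinued = 0;

OPEN price_cursor;
FETCH NEXT FROM price_cursor INTO @id;
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE #Product SET Price = Price * 1.10 WHERE ProductID = @id;
    FETCH NEXT FROM price_cursor INTO @id;
END
CLOSE price_cursor;
DEALLOCATE price_cursor;

-- Set-based version: the same change in a single statement.
-- (Running both, as this script does, applies the increase twice.)
UPDATE #Product SET Price = Price * 1.10 WHERE Discontinued = 0;

DROP TABLE #Product;
```

The cursor version keeps the cursor open for the whole loop, while the single UPDATE lets the engine plan and execute the change as one operation.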

DB stored procedure operation design - set based vs cursor based

As succinctly as I can manage:

In relational database engines, all operations (whether in stored procedures or not) will usually* scale better using set-based logic simply because these engines are optimised for performing set-based operations.

There is generally a fixed resource cost (which may be quite high) for a single atomic operation in the engine, whether it affects 1 or 1,000,000 rows.

Cursors incur even higher costs because the database engine must maintain the state of the cursor on top of the atomic operation cost.

*There are going to be a few edge cases/classes of problem (exactly which will depend on your RDBMS) where procedural logic will perform better than set-based logic.

TSQL CURSOR and SET Based Alternatives

Go for the Relational in RDBMS and you will see that MSSQL, Oracle and others are systems optimized to work with sets. Cursors work more like imperative procedural languages such as C#.

For a small example, try to implement a join! You can mimic a join by using a cursor to do a nested loop.
A silly example for sure, but you get the idea. A join will be faster than that cursor.
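A rough sketch of that idea, under assumptions: the temp tables `#Customer`, `#Order` and `#Result` and their contents are hypothetical, and the inner "loop" is written as a per-customer lookup to keep the example short.

```sql
-- Hypothetical temp tables, for illustration only.
CREATE TABLE #Customer (CustomerID INT PRIMARY KEY, Name VARCHAR(50));
CREATE TABLE #Order (OrderID INT PRIMARY KEY, CustomerID INT, Total DECIMAL(10, 2));
CREATE TABLE #Result (Name VARCHAR(50), OrderID INT, Total DECIMAL(10, 2));

INSERT INTO #Customer (CustomerID, Name) VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO #Order (OrderID, CustomerID, Total)
VALUES (10, 1, 25.00), (11, 1, 40.00), (12, 2, 15.00);

-- Cursor version: the outer loop walks the customers, and each iteration
-- looks up the matching orders (a hand-written nested loop).
DECLARE @cid INT, @name VARCHAR(50);
DECLARE customer_cursor CURSOR LOCAL STATIC FOR
    SELECT CustomerID, Name FROM #Customer;

OPEN customer_cursor;
FETCH NEXT FROM customer_cursor INTO @cid, @name;
WHILE @@FETCH_STATUS = 0
BEGIN
    INSERT INTO #Result (Name, OrderID, Total)
    SELECT @name, OrderID, Total
    FROM #Order
    WHERE CustomerID = @cid;

    FETCH NEXT FROM customer_cursor INTO @cid, @name;
END
CLOSE customer_cursor;
DEALLOCATE customer_cursor;

SELECT Name, OrderID, Total FROM #Result;

-- Set-based version: the same rows from a single join.
SELECT c.Name, o.OrderID, o.Total
FROM #Customer AS c
INNER JOIN #Order AS o ON o.CustomerID = c.CustomerID;

DROP TABLE #Result;
DROP TABLE #Order;
DROP TABLE #Customer;
```

With the join, the optimizer is free to pick a join strategy for the whole set; the cursor version is locked into one lookup per outer row.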

Also note that performance is not only about the fastest way. It's about using fewer resources. Those resources are: CPU, memory, IO, HD, and users' patience (time). Cursors can consume all of those resources.

Sometimes cursors can be optimized using FAST_FORWARD and other tricks. Occasionally a cursor can be an option, and even the best tool for the job (they exist for a reason).
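As a hedged illustration of the option mentioned above (written FAST_FORWARD, with an underscore, in T-SQL), this sketch declares a forward-only, read-only cursor. Iterating over `sys.databases` and printing each name is just an arbitrary example workload chosen so the script needs no user tables.

```sql
DECLARE @name SYSNAME;

-- LOCAL scopes the cursor to this batch; FAST_FORWARD requests a
-- forward-only, read-only cursor with performance optimizations enabled.
DECLARE db_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT name FROM sys.databases ORDER BY name;

OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @name;
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @name;  -- per-row procedural work would go here
    FETCH NEXT FROM db_cursor INTO @name;
END
CLOSE db_cursor;
DEALLOCATE db_cursor;
```

Because a FAST_FORWARD cursor can only move forward and cannot be updated through, it suits loops that simply read each row once.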

The problem with cursors is that they are overused by developers who lack set-based experience. They try to apply a C-like programming style to the relational world, with horrific results.

Edit
Here is an example borrowed from SQL Shack:

DECLARE @rowguidVar UNIQUEIDENTIFIER;  -- holds the current row's ID for the WHERE clause below

DECLARE test_cursor CURSOR FOR
    SELECT rowguid
    FROM AdventureWorks2012.Sales.SalesOrderDetail
    WHERE ModifiedDate BETWEEN '2008-07-15 00:00:00.000' AND '2008-07-31 00:00:00.000';

OPEN test_cursor;
FETCH NEXT FROM test_cursor INTO @rowguidVar;

-- This is the start of the cursor loop.
WHILE @@FETCH_STATUS = 0
BEGIN
    SELECT *
    FROM AdventureWorks2012.Sales.SalesOrderDetail
    WHERE rowguid = @rowguidVar;

    FETCH NEXT FROM test_cursor INTO @rowguidVar;
END

-- Don't forget these statements, which release the cursor's resources.
CLOSE test_cursor;
DEALLOCATE test_cursor;

This returns the same rows as the single query:

SELECT  *
FROM AdventureWorks2012.Sales.SalesOrderDetail
WHERE ModifiedDate BETWEEN '2008-07-15 00:00:00.000' AND '2008-07-31 00:00:00.000'

Query performance vs performance of the same query in cursor

You have a performance problem, and as such you need to investigate it as a performance problem. Please read How to analyse SQL Server performance.

Now, you are comparing a SELECT with a stored procedure that does an INSERT for one or more rows in that SELECT. To expect them to take a similar amount of time is naive, at best. You are comparing reads with writes. Think: reads come from cache, writes go to disk.

You did not post any performance investigation data, so I'll use my magic 8-ball roll, which tells me you are issuing each INSERT in a standalone transaction and thus waiting for commit to flush for every INSERT. You cannot expect more than ~100 commits (rows) per second like this. You need to batch commit. Or, if on SQL Server 2014 or later, use delayed durability.
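A minimal sketch of what batching the commit could look like. The temp table `#Target`, its columns, and the 1,000-row loop are hypothetical; a temp table is used only so the script is self-contained, and the commit-flush cost described above would show up on a regular table in a user database.

```sql
-- Hypothetical target table, for illustration only.
CREATE TABLE #Target (ID INT PRIMARY KEY, Payload VARCHAR(20));

DECLARE @i INT = 1;

-- One explicit transaction around the whole loop: the inserts are
-- committed together instead of once per INSERT.
BEGIN TRANSACTION;

WHILE @i <= 1000
BEGIN
    INSERT INTO #Target (ID, Payload) VALUES (@i, 'row');
    SET @i += 1;
END

COMMIT TRANSACTION;

-- On SQL Server 2014 or later, delayed durability is an alternative.
-- It has to be allowed at the database level first, for example:
--   ALTER DATABASE CURRENT SET DELAYED_DURABILITY = ALLOWED;
-- and can then be requested per commit:
--   COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON);

SELECT COUNT(*) AS InsertedRows FROM #Target;

DROP TABLE #Target;
```

Delayed durability trades a small window of possible data loss on a crash for not waiting on the log flush, so whether it is acceptable depends on the workload.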

Shnugo also gives good advice: it is (almost) always better to use one set operation instead of a cursor, when possible.


