DELETE performance in SQL Server on clustered index, large table
In addition to the fine points JNK included in their answer, one particular killer I've seen is when you're deleting rows from the referenced table for one or more foreign key constraints, and the referencing column(s) in the referencing table(s) aren't indexed - you're forcing a table scan on each of those tables to occur before the delete can be accepted.
Delete statement in SQL is very slow
Things that can cause a delete to be slow:
- deleting a lot of records
- many indexes
- missing indexes on foreign keys in child tables. (thank you to @CesarAlvaradoDiaz for mentioning this in the comments)
- deadlocks and blocking
- triggers
- cascade delete (those ten parent records you are deleting could mean millions of child records getting deleted)
- transaction log needing to grow
- many foreign keys to check
So your choices are to find out what is blocking and fix it or run the deletes in off hours when they won't be interfering with the normal production load. You can run the delete in batches (useful if you have triggers, cascade delete, or a large number of records). You can drop and recreate the indexes (best if you can do that in off hours too).
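A minimal sketch of the batching approach mentioned above. The table `dbo.OrderHistory`, the `CreatedAt` column, the cutoff date, and the batch size are all hypothetical; substitute your own table and predicate.

```sql
-- Hypothetical table, column, and cutoff: adjust to your schema.
DECLARE @BatchSize INT = 5000;  -- assumed batch size; tune for your workload
DECLARE @Rows INT = 1;

WHILE @Rows > 0
BEGIN
    -- Each DELETE runs as its own autocommit transaction, so locks are
    -- held only briefly and each transaction stays small.
    DELETE TOP (@BatchSize)
    FROM dbo.OrderHistory
    WHERE CreatedAt < '20200101';

    -- The loop stops once a batch deletes no rows.
    SET @Rows = @@ROWCOUNT;
END;
```

Smaller batches reduce blocking of other sessions at the cost of more round trips through the loop; the right size depends on the triggers, cascades, and indexes involved.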
DELETE operation in SQL Server extremely slow
Nice execution plan!!!
There are a series of cascade deletes (which in turn fire other cascade deletes).
Every delete --> spool pair is yet another cascade deletion: the deleted rows are put into a spool (a temporary structure), and the spool is then used to find the referencing rows in other tables.
E.g. delete tblAS0002 -> spool[tblAS0002]
delete history{join}spool[tblAS0002] -> spool[history]
delete history_links{join}spool[history]
"Looking into the execution plan, I can see only one operation taking over 1 minute, and it is a scan of the primary key index of the tblAS0002. But why?"
This is the operation at the bottom of the plan.
There is a self-reference in tblAS0002: tblAS0002.IdParent --> references tblAS0002.Id
There is no index on IdParent, so deleting a single Id requires a full scan of tblAS0002 to verify that the deleted Id is not referenced by any IdParent. In the plan this check appears as a "not exists" test: an Assert over a Left Semi Join.
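A minimal sketch of the fix this implies: index the self-referencing column so the check becomes a seek instead of a scan. The table and column names come from the question; the index name is hypothetical, and the table is assumed to live in the default schema.

```sql
-- Index name is an assumption; tblAS0002 and IdParent are from the question.
CREATE NONCLUSTERED INDEX IX_tblAS0002_IdParent
    ON tblAS0002 (IdParent);
```

After creating the index, it is worth re-checking the execution plan to confirm that the scan at the bottom has been replaced by an index seek.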
Delete operation is slow and rebuilding index doesn't seem to solve the issue
Right. So your problem is not the deletion of the records (84 rows), which is instantaneous. The problem is the scan of the WholesalerSale table afterwards.
My guess is that ImportSale.Id is referenced by a foreign key in WholesalerSale, and that SQL Server is simply validating that you haven't deleted a referenced key.
The solution is to index your foreign key column in WholesalerSale to speed up this check.
CREATE INDEX IX_WholesalerSale_ImportSaleId ON WholesalerSale (ImportSaleId);
Very slow DELETE query
Add a primary key to your table variables and watch them scream:
DECLARE @IdList1 TABLE (Id INT PRIMARY KEY NOT NULL);
DECLARE @IdList2 TABLE (Id INT PRIMARY KEY NOT NULL);
Without an index on these table variables, any join or subquery must examine on the order of 10,000 times 10,000 = 100,000,000 pairs of values.
SQL Server: Clustered index considerably slower than equivalent non-clustered index
Looking at your query, the first thing to consider is that the SELECT list includes a spatial column. This is a .NET/CLR data type, and such values are stored outside the IN_ROW_DATA pages, requiring a key lookup unless the spatial column is included in the index. Including it potentially also puts the spatial bounding box in the index data pages, which speeds up the filtering and saves most of the disk I/O. I would say you have uncovered an efficient trick for speeding up filtering on spatial columns without the need for a spatial index.
To prove my point, I refer you to the original SQL documentation on covering indexes, which I'm sure you already know, where it clarifies the following: "Non-key columns added to the leaf level of a nonclustered index to improve query performance. This allows the query optimizer to locate all the required information from an index scan; the table or clustered index data is not accessed." The last part is very important here, so I assume the bounding box is part of the "required information" of a spatial column that helps the query optimizer avoid accessing the IN_ROW_DATA.
Conclusion:
- Can it be true that the simple lookup operation can explain the difference in performance between the indices described in (1) and (2)? I would say so, because the spatial CLR data type is stored outside the IN_ROW_DATA pages, requiring much more disk I/O in (1).
- Why is the clustered index described in (3) considerably slower than the index described in (2)? Same reason: including the Geography data in index (2) saves the need to look it up outside the IN_ROW_DATA pages, saving most of the disk I/O. Bear in mind that index (3) still needs to look up the spatial data in the LOB_DATA.
- If neither of the two above can be answered, should we see such a big performance deficit when comparing two such indices as described in questions 1 and 2, or is it more likely that something else is wrong with our setup? N/A.
Slow DELETE statement in SQL Server need help reading execution plan and fixing
Make sure you have indexes on the FKs in the other tables.
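One way to check this is to query the catalog views for foreign key columns that no index leads with. The query below is a rough sketch, not a complete audit: it looks only at the first column of each foreign key, and it does not account for disabled or filtered indexes. In these catalog views, "parent" means the referencing (child) table that owns the constraint.

```sql
-- Lists foreign keys whose first column is not the leading key column
-- of any index on the referencing (child) table.
SELECT
    OBJECT_SCHEMA_NAME(fk.parent_object_id) AS ChildSchema,
    OBJECT_NAME(fk.parent_object_id)        AS ChildTable,
    fk.name                                 AS ForeignKeyName,
    c.name                                  AS ChildColumn,
    OBJECT_NAME(fk.referenced_object_id)    AS ReferencedTable
FROM sys.foreign_keys AS fk
JOIN sys.foreign_key_columns AS fkc
    ON fkc.constraint_object_id = fk.object_id
JOIN sys.columns AS c
    ON c.object_id = fkc.parent_object_id
   AND c.column_id = fkc.parent_column_id
WHERE fkc.constraint_column_id = 1          -- first column of the FK only
  AND NOT EXISTS (
        SELECT 1
        FROM sys.index_columns AS ic
        WHERE ic.object_id   = fkc.parent_object_id
          AND ic.column_id   = fkc.parent_column_id
          AND ic.key_ordinal = 1            -- leading key column of an index
      )
ORDER BY ChildSchema, ChildTable, ForeignKeyName;
```

Each row returned is a candidate for a new index on the listed child column, since deletes from the referenced table would otherwise have to scan the child table to validate the constraint.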