SQL Server 2008: Ordering by Datetime Is Too Slow

SQL Server 2008: Ordering by datetime is too slow

Ordering by id probably uses a clustered index scan while ordering by datetime uses either sorting or index lookup.

Both these methods are more slow than a clustered index scan.

If your table is clustered by id, basically it means it is already sorted. The records are contained in a B+Tree which has a linked list linking the pages in id order. The engine should just traverse the linked list to get the records ordered by id.

If the ids were inserted in sequential order, this means that the physical order of the rows will match the logical order and the clustered index scan will be yet faster.

If you want your records to be ordered by datetime, there are two options:

  • Take all records from the table and sort them. Slowness is obvious.
  • Use the index on datetime. The index is stored in a separate space of the disk, this means the engine needs to shuttle between the index pages and table pages in a nested loop. It is more slow too.

To improve the ordering, you can create a separate covering index on datetime:

CREATE INDEX ix_mytable_datetime ON mytable (datetime) INCLUDE (field1, field2, …)

, and include all columns you use in your query into that index.

This index is like a shadow copy of your table but with data sorted in different order.

This will allow to get rid of the key lookups (since the index contains all data) which will make ordering by datetime as fast as that on id.

Update:

A fresh blog post on this problem:

  • SQL Server: clustered index and ordering

Order By Case of Dates is Slow

[By Panagiotis Kanavos]
The solution was to create a computed column that adds the date I want to show on screen and it is created during Insert.
Then this column is indexed and the performance is good.

It might not be applicable in any case
(e.g. some tables could be huge, or can have multiple such cases which would lead to too many computed columns and indexes which might not be a good solution),
but it works well for my case.

Slow SQL Queries, Order Table by Date?

It sounds like datetime may be a text based field and subsequently an index isn't being used?

Could you try the following to see if you have any speed improvement:

select distinct N.TestName 
from [DalsaTE].[dbo].[ResultsUut] U
inner join [DalsaTE].[dbo].[ResultsNumeric] N
on N.ModeDescription = 'Mode 8: Low Gain - Green-Blue'
and N.ResultsUutId = U.ResultsUutId
where U.DeviceName = 'BO-32-3HK60-00-R'
and U.StartDateTime > cast('2011-11-25 01:10:10.001' as datetime)
order by N.TestName

It would also be worth trying changing your inner join to a left outer join as those occasionally perform faster for no conceivable reason (at least one that I'm not aware of).

SQL Query is slow when ORDER BY statement added

As far as why the query is slower, the query is required to return the rows "in order", so it either needs to do a sort, or it needs to use an index.

Using the index with a leading column of CreatedDate, SQL Server can avoid a sort. But SQL Server would also have to visit the pages in the underlying table to evaluate whether the row is to be returned, looking at the values in Status and Name columns.

If the optimizer chooses not to use the index with CreatedDate as the leading column, then it needs to first locate all of the rows that satisfy the predicates, and then perform a sort operation to get those rows in order. Then it can return the first fifty rows from the sorted set. (SQL Server wouldn't necessarily need to sort the entire set, but it would need to go through that whole set, and do sufficient sorting to guarantee that it's got the "first fifty" that need to be returned.

NOTE: I suspect you already know this, but to clarify: SQL Server honors the ORDER BY before the TOP 50. If you wanted any 50 rows that satisfied the predicates, but not necessarily the 50 rows with the lowest values of DateCreated,you could restructure/rewrite your query, to get (at most) 50 rows, and then perform the sort of just those.


A couple of ideas to improve performance

Adding a composite index (as other answers have suggested) may offer some improvement, for example:

 ON Documents (Status, DateCreated, Name)

SQL Server might be able to use that index to satisfy the equality predicate on Status, and also return the rows in DateCreated order without a sort operation. SQL server may also be able to satisfy the predicate on Name from the index, limiting the number of lookups to pages in the underlying table, which it needs to do for rows to be returned, to get "all" of the columns for the row.


For SQL Server 2008 or later, I'd consider a filtered index... dependent on the cardinality of Status='New' (that is, if rows that satisfy the predicate Status='New' is a relatively small subset of the table.

  CREATE NONCLUSTERED INDEX Documents_FIX 
ON Documents (Status, DateCreated, Name)
WHERE Status = 'New'

I would also modify the query to specify ORDER BY Status, DateCreated, Name

so that the order by clause matches the index, it doesn't really change the order that the rows are returned in.


As a more complicated alternative, I would consider adding a persisted computed column and adding a filtered index on that

  ALTER TABLE Documents
ADD new_none_date_created AS
CASE
WHEN Status = 'New' AND COALESCE(Name,'') IN ('','None') THEN DateCreated
ELSE NULL
END
PERSISTED
;

CREATE NONCLUSTERED INDEX Documents_FIXP
ON Documents (new_none_date_created)
WHERE new_none_date_created IS NOT NULL
;

Then the query could be re-written:

  SELECT TOP 50 *
FROM Documents
WHERE new_none_date_created IS NOT NULL
ORDER BY new_none_date_created
;

Why is my SQL Server ORDER BY slow despite the ordered column being indexed?

If your query does not contain an order by then it will return the data in whatever oreder it was found. There is no guarantee that the data will even be returned in the same order when you run the query again.

When you include an order by clause, the dabatase has to build a list of the rows in the correct order and then return the data in that order. This can take a lot of extra processing which translates into extra time.

It probably takes longer to sort a large number of columns, which your query might be returning. At some point you will run out of buffer space and the database will have to start swapping and perfromance will go downhill.

Try returning less columns (specify the columns you need instead of Select *) and see if the query runs faster.

Slow Performance when ORDER BY in SQL Server

Your table index definitions are not optimal. You also don't have to created the additional individual indexes because they are covered by the Non Clustered Index. You will have better performance when structuring your indexes as follows:

Table definition:

CREATE TABLE [dbo].[T_GENERIC_ARTICLE]
(
RecordIdentifier int IDENTITY(1,1) PRIMARY KEY NOT NULL,
GlnCode nvarchar(100) NOT NULL,
Description nvarchar(MAX) NOT NULL,
VendorId nvarchar(100) NOT NULL,
VendorName nvarchar(100) NOT NULL,
ItemNumber nvarchar(100) NOT NULL,
ItemUOM nvarchar(128) NOT NULL
)
GO

CREATE UNIQUE NONCLUSTERED INDEX [UniqueNonClusteredIndex-Composite2]
ON [dbo].[T_GENERIC_ARTICLE](GlnCode, ItemNumber,ItemUOM,VendorId ASC);

GO

Revised Query

SELECT TOP (51) 
[RecordIdentifier] AS [RecordIdentitifer],
[GlnCode] AS [GlnCode],
[VendorId] AS [VendorId],
[ItemNumber] AS [ItemNumber],
[ItemUOM] AS [ItemUOM],
[Description] AS [Description],
[VendorName] AS [VendorName]
FROM [dbo].[T_GENERIC_ARTICLE]
ORDER BY [GlnCode], [ItemNumber], [ItemUOM], [VendorId]

First a key lookup will be performed on the Primary Key and then a Non Clustered Index Scan. This is where you want the majority of the work to be done.

Reference:
Indexes in SQL Server

Hope This helps

Query slow on SQL Server 2008 R2

What you described is a normal SQL Server caching behavior.

More information about MSSQL Server caching mechanisms you can find here

It's impossible to tell you anything else without knowing the background (example sql queries, health status of your sql server machine and actual db data structure)

UPDATE:

Your query is probably so badly optimized, that is actually a miracle it's only 2:30

SELECT *

LIKE '%%*%%'

ORDER BY

SQL Server 2008: Very slow query performance when dealing with specific date boundary

Run the slow and fast query with SET sTATISTICS IO ON and see if there is a significant difference in the amount of logical reads/physical reads between the two.

Most likely there is a strong skew in the data. For instance the plan on the fast one does a nested loop driven by a result of 10 rows (resulting in 10 nested lookups), while the slow one suddenly sees 10000 rows from where the previous one has seen 10, resulting in 10000 lookups. Although your query has no joins, the engine may use various access indexes and join the indexes with the cluster index. The actual execution plan will always reveal exactly what's going on.

SQL Server 2008 ROW_NUMBER() order by Clustered PK slow

UPDATE STATISTICS Order WITH FULLSCAN;
UPDATE STATISTICS OrderProduct WITH FULLSCAN;
UPDATE STATISTICS OrderProductDetail WITH FULLSCAN;

Finally the query went back to normal after I executed the above commands, my DBA didn't include the FULLSCAN option in first attempt therefore it wasn't working.

Thanks @Jeroen Mostert!

Sort order of an SQL Server 2008+ clustered index

There is a difference. Inserting out of Cluster Order causes massive fragmentation.

When you run the following code the DESC clustered index is generating additional UPDATE operations at the NONLEAF level.

CREATE TABLE dbo.TEST_ASC(ID INT IDENTITY(1,1) 
,RandNo FLOAT
);
GO
CREATE CLUSTERED INDEX cidx ON dbo.TEST_ASC(ID ASC);
GO

CREATE TABLE dbo.TEST_DESC(ID INT IDENTITY(1,1)
,RandNo FLOAT
);
GO
CREATE CLUSTERED INDEX cidx ON dbo.TEST_DESC(ID DESC);
GO

INSERT INTO dbo.TEST_ASC VALUES(RAND());
GO 100000

INSERT INTO dbo.TEST_DESC VALUES(RAND());
GO 100000

The two Insert statements produce exactly the same Execution Plan but when looking at the operational stats the differences show up against [nonleaf_update_count].

SELECT 
OBJECT_NAME(object_id)
,*
FROM sys.dm_db_index_operational_stats(DB_ID(),OBJECT_ID('TEST_ASC'),null,null)
UNION
SELECT
OBJECT_NAME(object_id)
,*
FROM sys.dm_db_index_operational_stats(DB_ID(),OBJECT_ID('TEST_DESC'),null,null)

There is an extra –under the hood- operation going on when SQL is working with DESC index that runs against the IDENTITY.
This is because the DESC table is becoming fragmented (rows inserted at the start of the page) and additional updates occur to maintain the B-tree structure.

The most noticeable thing about this example is that the DESC Clustered Index becomes over 99% fragmented. This is recreating the same bad behaviour as using a random GUID for a clustered index.
The below code demonstrates the fragmentation.

SELECT 
OBJECT_NAME(object_id)
,*
FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('dbo.TEST_ASC'), NULL, NULL ,NULL)
UNION
SELECT
OBJECT_NAME(object_id)
,*
FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('dbo.TEST_DESC'), NULL, NULL ,NULL)

UPDATE:

On some test environments I'm also seeing that the DESC table is subject to more WAITS with an increase in [page_io_latch_wait_count] and [page_io_latch_wait_in_ms]

UPDATE:

Some discussion has arisen about what is the point of a Descending Index when SQL can perform Backward Scans. Please read this article about the limitations of Backward Scans.



Related Topics



Leave a reply



Submit