Does SQL Server Optimize Dateadd Calculation in Select Query

Does SQL Server optimize DATEADD calculation in select query?

Surprisingly, I've found that using GETDATE() inline seems to be more efficient than performing this type of calculation beforehand.

DECLARE @sd1 DATETIME, @sd2 DATETIME;
SET @sd1 = GETDATE();

SELECT * FROM dbo.table
WHERE datetime_column > DATEADD(MINUTE, -1440, @sd1)

SELECT * FROM dbo.table
WHERE datetime_column > DATEADD(MINUTE, -1440, GETDATE())

SET @sd2 = DATEADD(MINUTE, -1440, @sd1);

SELECT * FROM dbo.table
WHERE datetime_column > @sd2;

If you check the plans on those, the middle query will consistently come out with the lowest estimated cost (but not always the lowest elapsed time). Of course, it may depend on your indexes and data, and you should not assume, based on one query, that the same pre-emptive optimization will work on another query. My instinct would be to not perform any calculations inline and instead use the @sd2 variation above... but I've learned that I can't trust my instinct all the time, and I can't make general assumptions based on behavior I observe in particular scenarios.

How Does Dateadd Impact the Performance of a SQL Query?

Keeping the data in datetime format and using DATEADD is most likely to be quickest.

Check this question: Most efficient way in SQL Server to get date from date+time?

The accepted answer (not mine!) demonstrates DATEADD beating string conversions. I've seen another comparison, many years ago, that showed the same.

Why DATEADD slows down SQL query?

For createdate BETWEEN '2014-02-15 03:34:16' AND '2014-02-15 03:34:18' the literal values can be looked up in the column statistics to estimate the number of rows that will match.

The values of variables are not sniffed unless you use OPTION (RECOMPILE), so SQL Server will just use heuristics to guess a number.

Presumably the plan that is derived from using the first number is different from that from using the second number.

e.g. one estimates fewer rows and uses a non-covering index with lookups, while the other does a full scan because its estimated number of rows is above the tipping point at which that option is considered cheaper.
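As a sketch of the workaround (table and column names are illustrative, not from the question), OPTION (RECOMPILE) makes the optimizer compile the plan using the variable's actual value, just as it would with a literal:

;WITH nothing AS (SELECT 1 AS x) -- placeholder comment removed; real sketch below
SELECT 1;

DECLARE @from DATETIME = '2014-02-15 03:34:16';

-- Without the hint, @from is not sniffed and cardinality is guessed
-- from density heuristics. With it, the statement is recompiled per
-- execution using the actual value, so the estimate comes from the
-- column statistics, as with a literal.
SELECT *
FROM dbo.SomeTable
WHERE createdate >= @from
OPTION (RECOMPILE);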

SQL performance: does MSSQL optimise the use of date time functions or is it better to pass it as a parameter

What EF gave you for that query will work fine, even if it's a bit of unexpected syntax.

Why?

  1. WHERE timestampcolumn > DATEADD(minute, number, something_meaning_now) is a sargable filtering term: it can use an index on timestampcolumn.
  2. SYSUTCDATETIME() is a non-deterministic function. That means SQL Server knows its return value is based on something besides its input values.

So, here's what's happening: SQL Server is calculating the date "in code" before using it, just like you might do in your own code. Because SQL Server knows the computed date will change every time the query runs (the function is non-deterministic), its cached execution plan won't bind that date to a constant, so the plan cache won't bloat. The date would be bound if your filter were timestampcolumn < DATEADD(minute, number, '2021-01-23 12:34').
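A sketch of the pattern being described (table and column names are assumed for illustration; only the shape of the predicate matters):

-- Sargable, and plan-cache friendly: SYSUTCDATETIME() is evaluated at
-- execution time, so one cached plan serves every execution.
SELECT s.*
FROM dbo.Sessions AS s
WHERE s.LastActivity > DATEADD(MINUTE, -30, SYSUTCDATETIME());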

I've done a ton of this kind of thing in large-scale production and it works correctly.

You asked about scaling up. The way to do that is to put an index on your s.LastActivity column. But, to figure out what indexes you need ...

  1. Use SSMS
  2. Choose Include Actual Execution Plan.
  3. Run the query
  4. Look at the query plan. It will show a recommended index if you need one.

SQL Efficiency - Query using dateAdd Function twice; or SubQuery and DateAdd Function once; on Date BETWEEN

I don't think it matters which of the two you use. And the Execution Plans agree.

But it seems you are applying calculations to the column itsm_requiredbyx and then checking whether the result is between two external values, @DateFrom and @DateTo. This way, every datetime in that column must be processed by the functions before the WHERE conditions can be applied, and no index can be used. The second link (Ten Common SQL Programming Mistakes) in @DOK's answer provides more detail on why and when this happens.

If you don't apply any calculations to the column, but instead apply the (reversed) calculations to the external values and then check whether itsm_requiredbyx is between those two calculated values, the query can use an index on itsm_requiredbyx (and the functions will be called only twice, not once per row in the table).
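As a sketch (the column and variable names come from the question; the 8-hour offset is invented purely for illustration), the two shapes look like this:

-- Non-sargable: the function runs on every row and defeats any index.
WHERE DATEADD(HOUR, 8, itsm_requiredbyx) BETWEEN @DateFrom AND @DateTo

-- Sargable: the reversed calculation runs only twice, on the boundaries,
-- so an index on itsm_requiredbyx can seek.
WHERE itsm_requiredbyx BETWEEN DATEADD(HOUR, -8, @DateFrom)
                           AND DATEADD(HOUR, -8, @DateTo)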

Is SQL DATEDIFF(year, ..., ...) an Expensive Computation?

It depends on exactly what you are doing to be honest as to the extent of the performance hit.

For example, if you are using DATEDIFF (or indeed any other function) within a WHERE clause, then this will be a cause of poorer performance as it will prevent an index being used on that column.

e.g. basic example, finding all records in 2009

WHERE DATEDIFF(yyyy, DateColumn, '2009-01-01') = 0

would not make good use of an index on DateColumn. Whereas a better solution, providing optimal index usage would be:

WHERE DateColumn >= '2009-01-01' AND DateColumn < '2010-01-01'

I recently blogged about the difference this makes (with performance stats/execution plan comparisons), if you're interested.

That would be costlier than, say, returning DATEDIFF as a column in the result set.

I would start by identifying the individual queries that are taking the most time. Check the execution plans to see where the problem lies and tune from there.

Edit:
Based on the example query you've given, here's an approach you could try to remove the use of DATEDIFF from the WHERE clause. It's a basic example to find everyone who was 10 years old on a given date - I think the maths is right, but you get the idea anyway! I gave it a quick test and it seems fine; it should be easy enough to adapt to your scenario. If you want to find people between (e.g.) 15 and 17 years old on a given date, that's also possible with this approach.

-- Assuming @Date2 is set to the date at which you want to calculate someone's age 
DECLARE @AgeAtDate INTEGER
SET @AgeAtDate = 10

DECLARE @BornFrom DATETIME
DECLARE @BornUntil DATETIME
SELECT @BornFrom = DATEADD(yyyy, -(@AgeAtDate + 1), @Date2)
SELECT @BornUntil = DATEADD(yyyy, -@AgeAtDate , @Date2)

SELECT DOB
FROM YourTable
WHERE DOB > @BornFrom AND DOB <= @BornUntil

An important note to add: for ages calculated from DOB, this approach is also more accurate. Your current implementation only takes the year of birth into account, not the actual day (e.g. someone born on 1st Dec 2009 would show as being 1 year old on 1st Jan 2010, when they are not 1 until 1st Dec 2010).
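The boundary arithmetic is easy to sanity-check outside SQL. This Python sketch (purely to verify the logic; `date.replace` approximates DATEADD's year subtraction and, unlike DATEADD, would raise on Feb 29) mirrors the @BornFrom/@BornUntil bounds and shows the year-only calculation going wrong:

```python
from datetime import date

def age_bounds(age_at_date, as_of):
    """DOB range for someone exactly `age_at_date` years old on `as_of`:
    born AFTER as_of minus (age + 1) years, and ON OR BEFORE as_of
    minus `age` years - the same bounds as the DATEADD calls above."""
    born_after = as_of.replace(year=as_of.year - (age_at_date + 1))
    born_until = as_of.replace(year=as_of.year - age_at_date)
    return born_after, born_until

as_of = date(2010, 1, 1)

# Year-only arithmetic claims someone born 1 Dec 2009 is already 1:
dob = date(2009, 12, 1)
year_only_age = as_of.year - dob.year      # 1, but they are really 0

# The bounds approach correctly excludes them from the "exactly 1" range:
lo, hi = age_bounds(1, as_of)              # 2008-01-01 .. 2009-01-01
counted_as_one = lo < dob <= hi            # False
```

The open lower bound / closed upper bound (`>` / `<=`) matches the `DOB > @BornFrom AND DOB <= @BornUntil` predicate in the query above.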

Hope this helps.

How can I optimize this working TAT query?

Thanks to @Charlieface, I did come up with a fairly accurate solution using just a Date table. Since I only made that solution on Production data, I am not posting it here, but it is possible. That being said, I was only able to ameliorate the problems with that kind of solution to some extent. The questions that arose were:

  • What if something was completed on a Sunday?
  • How do we know what days were actually worked?
  • Did they work Saturday as well?

This solution did perform faster than my Second one above (about 5-7 seconds faster); however, given the small inaccuracies, I opted in favor of the Second one. Hopefully this post can help guide others who are looking to calculate business TAT according to their own custom business logic.

TSQL Performance issues using DATEADD in where clause

This would probably be a lot easier if you used a single SMALLDATETIME column instead of separating this data into DATE/TIME columns. Also I'm assuming you are using SQL Server 2008 and not a previous version where you're storing DATE/TIME data as strings. Please specify the version of SQL Server and the actual data types being used.

DECLARE @d TABLE ([Date] DATE, [Time] TIME(0), MaxTemp DECIMAL(6,3), MinTemp DECIMAL(6,3));

INSERT @d VALUES
('2011-09-18','12:05:00',38.15,38.099),
('2011-09-18','12:10:00',38.20,38.10),
('2011-09-18','12:15:00',38.22,38.17),
('2011-09-18','12:20:00',38.21,38.20),
('2011-09-19','11:50:00',38.17,38.10),
('2011-09-19','12:55:00',38.32,38.27),
('2011-09-19','12:00:00',38.30,38.20);

SELECT '-- before update';
SELECT * FROM @d;

;WITH d(d,t,dtr,maxt) AS
(
SELECT [Date], [Time], DATEADD(HOUR, -12, CONVERT(SMALLDATETIME, CONVERT(CHAR(8),
[Date], 112) + ' ' + CONVERT(CHAR(8), [Time], 108))), MaxTemp FROM @d
),
d2(dtr, maxt) AS
(
SELECT CONVERT(DATE, dtr), MAX(maxt) FROM d
GROUP BY CONVERT(DATE, dtr)
)
UPDATE d SET maxt = d2.maxt FROM d
INNER JOIN d2 ON d.dtr >= d2.dtr AND d.dtr < DATEADD(DAY, 1, d2.dtr);

SELECT '-- after update';
SELECT * FROM @d;

Results:

-- before update

2011-09-18 12:05:00 38.150 38.099
2011-09-18 12:10:00 38.200 38.100
2011-09-18 12:15:00 38.220 38.170
2011-09-18 12:20:00 38.210 38.200
2011-09-19 11:50:00 38.170 38.100
2011-09-19 12:55:00 38.320 38.270
2011-09-19 12:00:00 38.300 38.200

-- after update

2011-09-18 12:05:00 38.220 38.099
2011-09-18 12:10:00 38.220 38.100
2011-09-18 12:15:00 38.220 38.170
2011-09-18 12:20:00 38.220 38.200
2011-09-19 11:50:00 38.220 38.100
2011-09-19 12:55:00 38.320 38.270
2011-09-19 12:00:00 38.320 38.200

Presumably you want to update the MinTemp as well, and that would just be:

;WITH d(d,t,dtr,maxt,mint) AS
(
SELECT [Date], [Time], DATEADD(HOUR, -12,
CONVERT(SMALLDATETIME, CONVERT(CHAR(8), [Date], 112)
+ ' ' + CONVERT(CHAR(8), [Time], 108))), MaxTemp, MinTemp
FROM @d
),
d2(dtr, maxt, mint) AS
(
SELECT CONVERT(DATE, dtr), MAX(maxt), MIN(mint) FROM d
GROUP BY CONVERT(DATE, dtr)
)
UPDATE d
SET maxt = d2.maxt, mint = d2.mint
FROM d
INNER JOIN d2
ON d.dtr >= d2.dtr
AND d.dtr < DATEADD(DAY, 1, d2.dtr);

Now, this is not really better than your existing query, because it's still going to use scans to compute the aggregates and find all the rows that need to be updated. I'm not saying you should be updating the table at all, because this information can always be derived at query time, but if it is something you really want to do, I would combine the advice in these answers and consider revising the schema. For example, if the schema were:

USE [tempdb];
GO

CREATE TABLE dbo.d
(
[Date] SMALLDATETIME,
MaxTemp DECIMAL(6,3),
MinTemp DECIMAL(6,3),
RoundedDate AS (CONVERT(DATE, DATEADD(HOUR, -12, [Date]))) PERSISTED
);

CREATE INDEX rd ON dbo.d(RoundedDate);

INSERT dbo.d([Date],MaxTemp,MinTemp) VALUES
('2011-09-18 12:05:00',38.15,38.099),
('2011-09-18 12:10:00',38.20,38.10),
('2011-09-18 12:15:00',38.22,38.17),
('2011-09-18 12:20:00',38.21,38.20),
('2011-09-19 11:50:00',38.17,38.10),
('2011-09-19 12:55:00',38.32,38.27),
('2011-09-19 12:00:00',38.30,38.20);

Then your update is this simple, and the plan is much nicer:

;WITH g(RoundedDate,MaxTemp)
AS
(
SELECT RoundedDate, MAX(MaxTemp)
FROM dbo.d
GROUP BY RoundedDate
)
UPDATE d
SET MaxTemp = g.MaxTemp
FROM dbo.d AS d
INNER JOIN g
ON d.RoundedDate = g.RoundedDate;

Finally, one of the reasons your existing query is probably taking so long is that you are updating all of time, every time. Is data from last week changing? Probably not. So why not limit the WHERE clause to recent data only? I see no need to recalculate anything earlier than yesterday unless you are constantly receiving revised estimates of how warm it was last Tuesday at noon. So why are there no WHERE clauses on your current query to limit the date range over which it attempts this work? Do you really want to update the WHOLE table, EVERY time? This is probably something you should only do once a day, sometime in the afternoon, to update yesterday - so whether it takes 2 seconds or 2.5 seconds shouldn't really matter.
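A sketch of that restriction, against the revised dbo.d schema above (the one-day window is illustrative; widen it to whatever revision lag you actually receive):

;WITH g(RoundedDate, MaxTemp) AS
(
    -- Only re-aggregate from yesterday onward.
    SELECT RoundedDate, MAX(MaxTemp)
    FROM dbo.d
    WHERE RoundedDate >= DATEADD(DAY, -1, CONVERT(DATE, GETDATE()))
    GROUP BY RoundedDate
)
UPDATE d
SET MaxTemp = g.MaxTemp
FROM dbo.d AS d
INNER JOIN g
ON d.RoundedDate = g.RoundedDate;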

Performance issue with DATEADD function

I may be wrong in my understanding, so take this only as a possibility: when you use a function within a join criterion and/or WHERE clause, then to determine whether data meets the criterion, the function must be evaluated against every row in the table.

Think about your first part WHERE e.DATE <= a.joining_date - you can simply look directly at rows that are less than the e.DATE.

For your second part, AND e.DATE >= DATEADD(MONTH, -6, a.joining_date): there is no column that is "joining date minus 6 months", so to determine whether e.DATE is greater than it, that calculation must be performed on every instance of a.joining_date in the table.

Remember that WHERE clause conditions are not necessarily evaluated in the order they are written in the query, so the rows you would expect to be eliminated by the first part of your WHERE are not necessarily eliminated by it. So, as one of the comments suggested, a computed/persisted column on DATEADD(MONTH, -6, a.joining_date) would probably work well.
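A minimal sketch of that suggestion (table name dbo.a and the column name are assumed from the question, not the poster's actual DDL):

-- Precompute "joining date minus 6 months" once per row instead of
-- once per comparison; PERSISTED stores it and allows indexing.
ALTER TABLE dbo.a
    ADD joining_date_minus_6m AS DATEADD(MONTH, -6, joining_date) PERSISTED;

CREATE INDEX ix_a_jd6m ON dbo.a (joining_date_minus_6m);

-- The join predicate then needs no per-row function call:
-- AND e.[DATE] >= a.joining_date_minus_6m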


