When using GETDATE() in many places, is it better to use a variable?
[NOTE: If you are going to downvote this answer, please leave a comment explaining why. It has already been downvoted many times, and finally ypercube (thank you) explained at least one reason why. I can't remove the answer because it is accepted, so you might as well help to improve it.]
According to this exchange on Microsoft, GETDATE()
switched from being constant within a query to non-deterministic in SQL Server 2005. In retrospect, I don't think that is accurate. I think it was completely non-deterministic prior to SQL Server 2005 and then hacked into something called "non-deterministic runtime constant" since SQL Server 2005". The later phrase really seems to mean "constant within a query".
(And GETDATE()
is defined as unambiguously and proudly non-deterministic, with no qualifiers.)
Alas, in SQL Server, non-deterministic does not mean that a function is evaluated for every row. SQL Server really does make this needlessly complicated and ambiguous with very little documentation on the subject.
In practice the function call is evaluated when the query is running rather than once when the query is compiled and its value changes each time it is called. In practice, GETDATE()
is only evaluated once for each expression where it is used -- at execution time rather than compile time. However, Microsoft puts rand()
and getdate()
into a special category, called non-deterministic runtime constant functions. By contrast, Postgres doesn't jump through such hoops, it just calls functions that have a constant value when executed as "stable".
Despite Martin Smith's comment, SQL Server documentation is simply not explicit on this matter -- GETDATE()
is described as both "nondeterministic" and "non-deterministic runtime constant", but that term isn't really explained. The one place I have found the term , for instance, the very next lines in the documentation say not to use nondeterministic functions in subqueries. That would be silly advice for "nondeterministic runtime constant".
I would suggest using a variable with a constant even within a query, so you have a consistent value. This also makes the intention quite clear:
You want a single value inside the query. Within a single query, you can do something like:
select . . .
from (select getdate() as now) params cross join
. . .
Actually, this is a suggestion that should evaluate only once in the query, but there might be exceptions. Confusion arises because getdate()
returns the same value on all different rows -- but it can return different values in different columns. Each expression with getdate()
is evaluated independently.
This is obvious if you run:
select rand(), rand()
from (values (1), (2), (3)) v(x);
Within a stored procedure, you would want to have a single value in a variable. What happens if the stored procedure is run as midnight passes by, and the date changes? What impact does that have on the results?
As for performance, my guess is that the date/time lookup is minimal and for a query occurs once per expression as the query starts to run. This should not really a performance issue, but more of a code-consistency issue.
Does using a function calling GETDATE() in a view consistently give dramatically worse performance than using GETDATE() directly?
@J.Mini, you said "as if making functions with GETDATE()
is always a bad idea".
It is not about the GETDATE()
. It is about any user-defined scalar function in SQL Server prior to 2019. Any user-defined scalar function in SQL Server prior to 2019 is a bad idea because of likely poor performance. When your code runs 10x or 100x slower your users will notice.
This makes the standard programming idiom "if you see yourself doing the same thing many times, then make it a function with a good name" to be a bad idea in T-SQL.
Other RDBMSs like Postgres and Oracle may behave differently and work perfectly fine performance-wise with user-defined functions.
It is just a "feature" (or, rather, a peculiarity) of SQL Server that you need to be aware of. Especially since you use this function in multiple places. All of these places (queries) are likely much slower than they could have been.
Here is a good article by Aaron Bertrand on this topic:
Encapsulating Common Code Into Scalar UDFs
When using GETDATE() in many places, is it better to use a variable?
[NOTE: If you are going to downvote this answer, please leave a comment explaining why. It has already been downvoted many times, and finally ypercube (thank you) explained at least one reason why. I can't remove the answer because it is accepted, so you might as well help to improve it.]
According to this exchange on Microsoft, GETDATE()
switched from being constant within a query to non-deterministic in SQL Server 2005. In retrospect, I don't think that is accurate. I think it was completely non-deterministic prior to SQL Server 2005 and then hacked into something called "non-deterministic runtime constant" since SQL Server 2005". The later phrase really seems to mean "constant within a query".
(And GETDATE()
is defined as unambiguously and proudly non-deterministic, with no qualifiers.)
Alas, in SQL Server, non-deterministic does not mean that a function is evaluated for every row. SQL Server really does make this needlessly complicated and ambiguous with very little documentation on the subject.
In practice the function call is evaluated when the query is running rather than once when the query is compiled and its value changes each time it is called. In practice, GETDATE()
is only evaluated once for each expression where it is used -- at execution time rather than compile time. However, Microsoft puts rand()
and getdate()
into a special category, called non-deterministic runtime constant functions. By contrast, Postgres doesn't jump through such hoops, it just calls functions that have a constant value when executed as "stable".
Despite Martin Smith's comment, SQL Server documentation is simply not explicit on this matter -- GETDATE()
is described as both "nondeterministic" and "non-deterministic runtime constant", but that term isn't really explained. The one place I have found the term , for instance, the very next lines in the documentation say not to use nondeterministic functions in subqueries. That would be silly advice for "nondeterministic runtime constant".
I would suggest using a variable with a constant even within a query, so you have a consistent value. This also makes the intention quite clear:
You want a single value inside the query. Within a single query, you can do something like:
select . . .
from (select getdate() as now) params cross join
. . .
Actually, this is a suggestion that should evaluate only once in the query, but there might be exceptions. Confusion arises because getdate()
returns the same value on all different rows -- but it can return different values in different columns. Each expression with getdate()
is evaluated independently.
This is obvious if you run:
select rand(), rand()
from (values (1), (2), (3)) v(x);
Within a stored procedure, you would want to have a single value in a variable. What happens if the stored procedure is run as midnight passes by, and the date changes? What impact does that have on the results?
As for performance, my guess is that the date/time lookup is minimal and for a query occurs once per expression as the query starts to run. This should not really a performance issue, but more of a code-consistency issue.
is there any benefit to using a created FUNCTION over declaring a variable?
The UDF (user defined function) can have a negative impact on your query-speed. If you haven't set your MAXDOP (max degree of parallelism) to 1 the query with your UDF will prevent parallelism.
You can see it in your explain-plan and try what solution is the faster one.
On the other hand the re-use of the code and the easier debugging is nice in the UDF.
If you want to have both, you could write your own CLR wich contains the logic of the UDF.
The Test I made:
CREATE FUNCTION [dbo].[GetLW]()
RETURNS datetime
AS
BEGIN
RETURN CONVERT(DATE,GETDATE())
END
GO
SELECT *
FROM [TGR].[dbo].[LargeTable]
WHERE BookingDate = [dbo].[GetLW]()
GO
DECLARE @LW datetime = CONVERT(DATE,GETDATE())
SELECT *
FROM [TGR].[dbo].[LargeTable]
WHERE BookingDate = @LW
GO
Explain Plan:
So if you are in a DWH/BI enviroment, maybe the variable is the better solution.
Regards
Tgr
Does using a function calling GETDATE() in a view consistently give dramatically worse performance than using GETDATE() directly?
@J.Mini, you said "as if making functions with GETDATE()
is always a bad idea".
It is not about the GETDATE()
. It is about any user-defined scalar function in SQL Server prior to 2019. Any user-defined scalar function in SQL Server prior to 2019 is a bad idea because of likely poor performance. When your code runs 10x or 100x slower your users will notice.
This makes the standard programming idiom "if you see yourself doing the same thing many times, then make it a function with a good name" to be a bad idea in T-SQL.
Other RDBMSs like Postgres and Oracle may behave differently and work perfectly fine performance-wise with user-defined functions.
It is just a "feature" (or, rather, a peculiarity) of SQL Server that you need to be aware of. Especially since you use this function in multiple places. All of these places (queries) are likely much slower than they could have been.
Here is a good article by Aaron Bertrand on this topic:
Encapsulating Common Code Into Scalar UDFs
SSIS: variable expression using time in two places
Don't use GETDATE()
as it will be evaluated each time you inspect it.
Instead, pick a value that will be constant for the duration of the SSIS package but is updated when the package begins. My preference is @[System::StartTime]
but some may prefer @[System::ContainerStartTime]
The former is set when the package begins execution, the other is reset inside each container (ForEach Loop, etc)
How to get date to be the same day and month from one variable but in the year based on another variable?
If the question is "given I have a particular datetime value in one variable, can I set another variable to be for the same day and month but in the current year" then the answer would be:
declare @DueDate datetime
declare @NewDate datetime
set @DueDate = '20141130'
--Need to set @NewDate to the same month and day in the current year
set @NewDate = DATEADD(year,
--Here's how you work out the offset
DATEPART(year,CURRENT_TIMESTAMP) - DATEPART(year,@DueDate),
@DueDate)
select @DueDate,@NewDate
I want to get the a new date variable having the date and the month same as @DueDate variable but the year as given in the @Currentdate variable.
Well, that's simply the above query with a single tweak:
set @NewDate = DATEADD(year,
--Here's how you work out the offset
DATEPART(year,@Currentdate) - DATEPART(year,@DueDate),
@DueDate)
Is there a simpler method for getting the two digit year?
Honestly, just don't work with 2 digit years any more; learn the lessons of last century's mistake and use 4 digits.
If you "have" to, then you could use CONVERT
with a style code, and then just replace the characters with an empty string:
DECLARE @TestVariable varchar(100) = '1234',
@Date datetime = GETDATE();
SELECT @TestVariable + REPLACE(CONVERT(varchar(8),@Date, 1),'/','');
TSQL - Trying to use variable with IF EXISTS() Function for simple SP
You need to use dynamic sql to achieve this
http://www.mssqltips.com/sqlservertip/1160/execute-dynamic-sql-commands-in-sql-server/
e.g.:
DECLARE @DynamicSQl NVARCHAR(MAX), @retVal INT
SET @DynamicSQl = 'select @retVal = 1 from (' + @selectCommand + ') t'
EXEC sp_executesql @DynamicSQl, N'@retVal INT OUTPUT', @retVal output
IF (@retVal = 1)
RETURN 1
ELSE
RETURN 0
How to Speed up SQL query for Date
declare @startDate DateTime = DATEADD(MONTH, DATEDIFF(MONTH,0, GETDATE())-1,0 )
declare @endDate DateTime = DATEADD(MS, -3,DATEADD(MM, DATEDIFF(M,-1, GETDATE()) -1, 0))
select
custKey,
sum(salesAmt) as Sales,
sum(returnAmt) as Credit,
(sum(salesAmt) - sum(returnAmt)) as CONNET
from
[SpotFireStaging].[dbo].[tsoSalesAnalysis]
inner join
[SpotFireStaging].[dbo].OOGPLensDesc as o on tsoSalesAnalysis.ItemKey = O.ItemKey
where
PostDate between @startDate AND @endDate
group by
custkey
another alternative, check out the selected answer here:
When using GETDATE() in many places, is it better to use a variable?
GetDate()
is calculated separately for each row, so we gotta belive so is DateDiff()
and DateAdd()
. So we are better off moving it into a local variable.
Related Topics
What Is the Most Appropriate Data Type for Storing an Ip Address in SQL Server
SQL Join on Multiple Columns in Same Tables
When to Use SQL Sub-Queries Versus a Standard Join
How to Batch SQL Statements with Package Database/Sql
Psql Invalid Command \N While Restore SQL
SQL Server 2005 - Export Table Programmatically (Run a .SQL File to Rebuild It)
What Is the Meaning of Select ... for Xml Path(' '),1,1)
SQL Insert into from Multiple Tables
T-Sql: Using a Case in an Update Statement to Update Certain Columns Depending on a Condition
Select Rows with Same Id But Different Value in Another Column
Do I Need to Create Indexes on Foreign Keys on Oracle
Why Do SQL Server Scalar-Valued Functions Get Slower
How to Automatically Generate Unique Id in SQL Like Uid12345678
Flattening of a 1 Row Table into a Key-Value Pair Table
How to Make a View Column Not Null
How to Check If a Column Exists Before Adding It to an Existing Table in Pl/Sql