SQL Server Deterministic User-Defined Function

Is there any way to make this UDF deterministic?

Sure, I can think of one way to make it deterministic. Deploy this function on your production database:

ALTER FUNCTION [TheSchema].[udf_IS_PRODUCTION] ()
RETURNS bit
WITH SCHEMABINDING
AS
BEGIN
RETURN CONVERT(bit, 1)
END

And deploy this one to your test database:

ALTER FUNCTION [TheSchema].[udf_IS_PRODUCTION] ()
RETURNS bit
WITH SCHEMABINDING
AS
BEGIN
RETURN CONVERT(bit, 0)
END

This may seem silly but IMO the database name should not be "hard coded" any more so than the return value of some UDF.

Better yet, just put this information in a configuration table somewhere.

user defined function (TSQL) non deterministic and computed persisted column

The value is nondeterministic because it's dependent on the current date. You can't persist computed columns whose values can change over time.

See this article on MSDN for more information on Deterministic Functions.

Why does SQL Server say this function is nondeterministic?

One of the key points for SQL Server to mark a function as Deterministic is the SchemaBinding feature. For your function to be deterministic you need to define the function using With SchemaBinding.

In Your example if you remove the With SchemaBinding, the ObjectProperty function will return 0 for the IsDeterministic attribute, so by adding the With SchemaBinding the problem will be resolved for you

@Paul has detailed explanation around this issue, here

Deterministic/Volatile function in SQL

There are different interpretations for determinism, even when restricted to the SQL functions domain. It depends on what determinism consumer needs and assumes.

The usual definition of determinism is that a deterministic function always return the same value when confronted with same input argument values for its parameters.

If the function consumes state, it would implictly consider it as an extra input paramenter. The original function(p1,...pn) would become function(p1,...pn,state). But in this case if two different states are compared, then the inputs would not be the same, so we couldn't talk about determinism anymore. Knowing this, we will use the terms state-sensitive-determinism and state-insensitive-determinism to differentiate those cases.

Our state-insensitive-determinism is equivalent of PostgreSQL's IMMUTABLE (PostgreSQL is a good comparinson as it avoids using the term determinism to avoid confusion, as it is possible to see in postgresql docs). In this case, the function always returns the same value no matter the state (example select 1+2). It is the most strict form of determinism and consumers usually take it for granted - query optimizers for example can substitute them by their result (select 1+2 would become select 3). In those cases, the state does not influence the result. So, even if we put state as an extra parameter, the function remains resulting the same.

When the result does not change facing the same state but risk changing otherwise we have our state-sensitive-determinism or PostgreSQL's STABLE (example select v,sum(v) over () from tbl where v>1000;). Determinism here is on a gray area. A query optimizer consumer sees it as deterministic because since query lives a well defined state, at least in transactionable databases, it is fine to calculate it only once instead of many times because future calculations would result the same. But a materialized calculated column or index can't accept this same function as deterministic because a little change in the state would turn invalid all its pre-calculated and stored values. In this scenario resides the OP's getTableCount(otherTbl). For a query optimizer its deterministism is enough to avoid extra calculations, for materialized calculated values it is not enough and can't be accepted as a source of value for being written. If we use the state as an extra parameter, the result may change between different states.

If we consume a value that is generated based on some uncontrolled state like random() (at least when we don't choose seed and pseudorandom function), then we can't achieve determinism. In PostgreSQL's terms, this would be VOLATILE. A VOLATILE is undeterministic by nature because it can have different values even in the same table scan, as it is the case of random() (For time related functions see Postgres now() timestamp doesn't change, when script works, the time may be the transaction time or can be the query time, what would impact your view of what is deterministic).

MySQL have different keywords, NOT DETERMINISTIC DETERMINISTIC, READS SQL DATA MODIFIES SQL DATA (similiar to PostgreSQL's LEAKPROOF), NO SQL CONTAINS SQL as seen on mysql docs, with the same objective of PostgreSQL - giving hints to the specific consumer, be it a query optimizer or a materialized value, of whether it would adapt its behaviour depending on its interpretation of determinism. The database vendors probably leave this responsibility to the users because leaving them the responsibility of determining the causal graph what influences what would be complex and problematic.

When vendors talk about determinsim they will probably be talking about one of those that we said. In sqlserver docs microsoft says that state must be the same, so they are probably talking about our state-sensitive-determinism. In sqlite docs otherwise it is taken the state-insensitive-determinism approach, where functions that must result equally even in different states to be considered deterministic, because they would follow stricter rules. Oracle implicitly follows the same sqlite flavor in their docs.

Our transactionable databases will eventually use some mechanism like MVCC to hold state in a transaction. In this case we could think the transactionTimestamp as a input to our functions. But if we take more complex cases like distributed databases, then our determinism can be harder to achieve and eventualy it would have to consider consensus algorithms.

SQL server using computed column and user defined function to grab datetime based on change in another column

You shouldn't try to compute a column from itself.

Instead, I'd use a trigger...

CREATE TRIGGER dbo.log__set_active_date
ON dbo.log
AFTER INSERT, UPDATE
AS
BEGIN
SET NOCOUNT ON;

UPDATE
log
SET
active_date = INSERTED.last_updated
FROM
dbo.log
INNER JOIN
INSERTED
ON log.id = INSERTED.id
WHERE
INSERTED.code = 10
AND log.active_date IS NULL -- Added to ensure the value is only ever copied ONCE
END

db<>fiddle demo

Deterministic Scalar Functions

Since you say you're using a number of user-defined functions, have you tried creating a function-based index on each one separately? That should at least narrow it down.

SQL Server - DATE conversion from DATETIME is non-deterministic but only in user-defined function

As mentioned at the beginning of this somewhat related answer, not specifying WITH SCHEMABINDING means SQL Server skips checks on such things as determinism and data access.

Since PERSISTED in a computer column requires the "computed column expression" to be deterministic and SQL Server skips any checks on whether or not it actually is deterministic, it won't be allowed. The same error would occur even if you had something as simple as i AS 1 PERSISTED.

(This is unrelated to whether everything in the function itself is deterministic.)

All that said, using PERSISTED in a TVF doesn't actually add anything to the function, as far as I know.

What does it mean by Non-deterministic User-Defined functions can be used in a deterministic manner ?

the BOL actually states:

The following functions are not
always deterministic
, but can be
used in indexed views or indexes on
computed columns when they are
specified in a deterministic manner.

and then below it states what conditions must be met to make them deterministic.

E.g.

CAST - Deterministic unless used with
datetime, smalldatetime, or
sql_variant

In other words you need to meet those condition to use them in deterministic manner

For example when you create a table

CREATE TABLE [dbo].[deterministicTest](
[intDate] [int] NULL,
[dateDateTime] [datetime] NULL,
[castIntToDateTime] AS (CONVERT([datetime],[intDate],0)),
[castDateTimeToInt] AS (CONVERT([int],[dateDateTime],0)),
[castIntToVarchar] AS (CONVERT([varchar],[intDate],0))
) ON [PRIMARY]

you can apply index on castIntToVarchar but if you try to add index to castDateTimeToInt or castIntToDateTime you will get the following error:

Column 'castDateTimeToInt'(castIntToDateTime) in table 'dbo.deterministicTest' cannot be used in an index or statistics or as a partition key because it is non-deterministic.

So the dateTime cannot be used neither as a source nor the target format of the CONVERT function if you want to stay deterministic

Creating Nondeterministic functions in SQL Server using RAND()

Because it has side effects.

Constructs with side effects are not allowed in a function. The side effect that it has is to change some internal state that keeps track of the last rand() value issued.

I think you can get around it by including it in a View definition then selecting from the View.



Related Topics



Leave a reply



Submit