Odd SQL Server 2012 Identity Issue

Odd SQL Server 2012 IDENTITY issue

This blog post has some additional details. It looks like in SQL Server 2012, identity is implemented on top of a sequence, and by default a sequence has a cache. If the cache is lost (for example after an unexpected shutdown), the cached sequence values are lost, which produces gaps.

The proposed solution is to create a sequence with no cache:

CREATE SEQUENCE TEST_Sequence
AS INT
START WITH 1
INCREMENT BY 1
NO CACHE

As far as I can see, the sequence behind an identity column is invisible: you can't change its properties to disable caching.
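
You can at least inspect the identity metadata (a query I'm adding here, not part of the original post):

-- Seed, increment and last generated value for every identity column
-- in the current database. There is no visible sys.sequences row to ALTER.
SELECT OBJECT_NAME(object_id) AS table_name,
       name                   AS column_name,
       seed_value,
       increment_value,
       last_value
FROM sys.identity_columns;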

To use this with Entity Framework, you could set the primary key's StoreGeneratedPattern to Computed. Then you could generate the key server-side in an INSTEAD OF INSERT trigger:

-- Clean up any previous run.
if exists (select * from sys.sequences where name = 'Sequence1')
    drop sequence Sequence1
if exists (select * from sys.tables where name = 'Table1')
    drop table Table1
if exists (select * from sys.triggers where name = 'Trigger1')
    drop trigger Trigger1
go
-- Sequence with caching disabled.
create sequence Sequence1
    as int
    start with 1
    increment by 1
    no cache
go
create table Table1
(
    id int primary key,
    col1 varchar(50)
)
go
-- Generate the key server-side instead of relying on IDENTITY.
create trigger Trigger1
on Table1
instead of insert
as
    insert Table1 (id, col1)
    select next value for Sequence1, col1
    from inserted
go
insert Table1 (col1) values ('row1');
insert Table1 (col1) values ('row2');
insert Table1 (col1) values ('row3');

select *
from Table1
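
Assuming the script is run from scratch (the sequence starts at 1 with no cache), this returns:

id   col1
1    row1
2    row2
3    row3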

If you find a better solution, let me know :)

SQL Server identity column error

This happens when rows have been deleted: the IDENTITY column simply continues from the last seed value instead of reusing the deleted values. Yes, you can reseed the identity column using the DBCC command below, but that's unnecessary. Let the identity column increase in its normal way; the gaps are nothing to worry about.

DBCC CHECKIDENT ('table_name', RESEED, 25);
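
If you just want to inspect the current identity value without changing anything, the NORESEED option does that:

-- Reports the current identity value and the current maximum column value,
-- without reseeding.
DBCC CHECKIDENT ('table_name', NORESEED);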

Odd SQL Server (TSQL) query results with NEWID() in the WHERE clause

Let's have a look at the execution plan.


In this particular run of the query, the Seek returned 51 rows instead of the estimated 1.


The following query produces a plan with the same shape, but it is easier to analyse, because there are two variables, @ID1 and @ID2, which you can track in the plan.

CREATE TABLE #wtfTable (WtfId UNIQUEIDENTIFIER PRIMARY KEY);

-- Fill the table with 500 random GUIDs.
INSERT INTO #wtfTable
SELECT TOP(500) NEWID()
FROM master.sys.all_objects o1 (NOLOCK)
CROSS JOIN master.sys.all_objects o2 (NOLOCK);

DECLARE @ID1 UNIQUEIDENTIFIER;
DECLARE @ID2 UNIQUEIDENTIFIER;

-- Grab the smallest and the largest GUID so that both variables hit existing rows.
SELECT TOP(1) @ID1 = WtfId
FROM #wtfTable
ORDER BY WtfId;

SELECT TOP(1) @ID2 = WtfId
FROM #wtfTable
ORDER BY WtfId DESC;

-- ACTUAL QUERY:
SELECT *
FROM #wtfTable
WHERE WtfId IN (@ID1, @ID2);

DROP TABLE #wtfTable;

If you examine the operators in this plan closely, you'll see that the IN part of the query is converted into a table with two rows and three columns. The Concatenation operator returns this table. Each row in this helper table defines a range for seeking in the index.

ExpFrom  ExpTo  ExpFlags
@ID1     @ID1   62
@ID2     @ID2   62

The internal ExpFlags value specifies what kind of range seek is needed (<, <=, >, >=). If you add more variables to the IN clause, you'll see them in the plan concatenated into this helper table.

The Sort and Merge Interval operators make sure that any possible overlapping ranges are merged. See the detailed post about the Merge Interval operator by Fabiano Amorim, which examines plans with this shape. Here is another good post about this plan shape by Paul White.

In the end, the helper table with two rows is joined with the main table, and for each row in the helper table there is a range seek in the clustered index from ExpFrom to ExpTo, which is shown in the Index Seek operator. The Seek operator shows < and >, but that is misleading: the actual comparison is defined internally by the Flags value.

If you had a different set of ranges, for example:

WHERE 
([WtfId] >= @ID1 AND [WtfId] < @ID2)
OR [WtfId] = @ID3

you would still see the same shape of the plan with the same seek predicate, but with different Flags values.

So, there are two seeks:

from @ID1 to @ID1, which returns one row
from @ID2 to @ID2, which returns one row

In the query with variables, the internal expressions read the values from the variables when needed. The value of a variable doesn't change during query execution, so everything behaves correctly, as expected.

How NEWID() affects it

When we use NEWID(), as in your example:

SELECT *
FROM #wtfTable
WHERE WtfId IN ('00000000-0000-0000-0000-000000000000', NEWID());

the plan and all the internal processing are the same as for the variables.

The difference is that this internal table effectively becomes:

ExpFrom  ExpTo    ExpFlags
0...0    0...0    62
NEWID()  NEWID()  62

NEWID() is called twice. Naturally, each call produces a different value, which by chance results in a range that covers some existing values in the table.

There are two range seeks of the clustered index, with the ranges

from `0...0`     to `0...0`
from `some_id_1` to `some_id_2`

Now it is easy to see how such a query can return some rows, even though the chance of a NEWID() collision is very small.


Apparently, the optimiser thinks that it can call NEWID() twice instead of remembering the first generated random value and using it later in the query. There have been other cases where the optimiser called NEWID() more times than expected, producing similarly impossible-looking results.

For example:

Is it legal for SQL Server to fill PERSISTED columns with data that does not match the definition?

Inconsistent results with NEWID() and PERSISTED computed column

The optimiser should know that NEWID() is non-deterministic. Overall, it feels like a bug.

I don't know anything about SQL Server internals, but my wild guess looks like this: there are runtime constant functions like RAND(). NEWID() was put into this category by mistake. Then somebody noticed that people do not expect it to return the same ID in the same fashion as RAND() returns the same random number for each invocation, and they patched it by actually regenerating a new ID each time NEWID() appears in an expression. But the overall rules for the optimiser remained the same as for RAND(), so the higher-level optimiser thinks that all invocations of NEWID() return the same value and freely rearranges expressions with NEWID(), which leads to unexpected results.


There is another question about a similar strange behaviour of NEWID():

NEWID() In Joined Virtual Table Causes Unintended Cross Apply Behavior

The answer says that there is a Connect bug report, and it is closed as "Won't fix". The comments from Microsoft essentially say that this behaviour is by design:

The optimizer does not guarantee timing or number of executions of scalar functions. This is a long-established tenet. It's the fundamental 'leeway' that allows the optimizer enough freedom to gain significant improvements in query-plan execution.
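
If you need NEWID() to be evaluated exactly once in a query like this, one workaround (my own sketch, not part of the original answer) is to capture it in a variable first:

-- Evaluate NEWID() once, up front, so the predicate compares against a single fixed value.
DECLARE @RandomId UNIQUEIDENTIFIER = NEWID();

SELECT *
FROM #wtfTable
WHERE WtfId IN ('00000000-0000-0000-0000-000000000000', @RandomId);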

Identity column increment jump

It happens when SQL Server 2012 loses its pre-allocated identity values (the cache discussed above).

If you want to get rid of that, one option is to use trace flag 272 (usually registered as a startup parameter, -T272):

DBCC TRACEON (272)

Another option is to use a sequence (with no caching) instead of identity:

CREATE SEQUENCE MySeq AS int
START WITH 1
INCREMENT BY 1
NO CACHE;
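
You would then use the sequence for the key column, for example (a sketch with assumed table and column names):

-- Use the sequence as the default for the key column instead of IDENTITY.
CREATE TABLE dbo.MyTable
(
    Id   int NOT NULL
             CONSTRAINT DF_MyTable_Id DEFAULT (NEXT VALUE FOR MySeq)
             CONSTRAINT PK_MyTable PRIMARY KEY,
    Col1 varchar(50) NULL
);

-- Id is generated from MySeq automatically.
INSERT INTO dbo.MyTable (Col1) VALUES ('row1');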

See this: http://www.big.info/2013/01/how-to-solve-sql-server-2012-identity.html

How to set a value for even and odd rows?

Just another option ...

Example

;WITH cte AS (
    SELECT *
          ,RN = ROW_NUMBER() OVER (ORDER BY db_name, server_name) % 2
    FROM YourTable
)
UPDATE cte
SET approver = CASE WHEN RN = 1 THEN 'ApproverX' ELSE 'ApproverY' END

The Updated Table

db_name  server_name  approver
cube1    server1      ApproverX
cube1    server2      ApproverY
cube2    server3      ApproverX
cube2    server4      ApproverY

LAST_VALUE in SQL Server 2012 is returning weird results

SQL Server doesn't know or care about the order in which rows were inserted into the table. If you need a specific order, always use ORDER BY. In your example the ORDER BY is ambiguous unless you include the PK in it. Besides, the LAST_VALUE function can return odd results if you are not careful; see below.

You can get your expected result using MAX or LAST_VALUE (SQLFiddle). They are equivalent in this case:

SELECT
    PK, Id1, Id2
    ,MAX(PK) OVER (PARTITION BY Id1, Id2) AS MaxValue
    ,LAST_VALUE(PK) OVER (PARTITION BY Id1, Id2 ORDER BY PK
                          ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS LastValue
FROM Data
ORDER BY Id1, Id2, PK

The result of this query will be the same regardless of the order in which rows were originally inserted into the table. You can try putting the INSERT statements in a different order in the fiddle; it doesn't affect the result.

Also, LAST_VALUE does not behave quite as you'd intuitively expect with the default window (when you have just ORDER BY in the OVER clause). The default window is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, while you'd expect it to be ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. Here is a SO answer with a good explanation; the link to that answer is on the MSDN page for LAST_VALUE. So, once the row window is specified explicitly in the query, it returns what is needed.
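
Here is a small sketch against the same Data table (the column aliases are mine) showing the difference:

SELECT
    PK, Id1, Id2
    ,LAST_VALUE(PK) OVER (PARTITION BY Id1, Id2 ORDER BY PK) AS DefaultFrame  -- just the current row's PK
    ,LAST_VALUE(PK) OVER (PARTITION BY Id1, Id2 ORDER BY PK
                          ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS FullFrame  -- last PK in the partition
FROM Data
ORDER BY Id1, Id2, PK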


If you want to know the order in which rows were inserted into the table, I think the simplest way is to use IDENTITY. So, the definition of your table would change to this:

CREATE TABLE Data 
(PK INT IDENTITY(1,1) PRIMARY KEY,
Id1 INT,
Id2 INT)

When you INSERT into this table, you don't need to specify a value for PK; the server will generate it automatically. It guarantees that the generated values are unique and increasing (with a positive increment parameter), even if many clients are inserting into the table simultaneously. There may be gaps between the generated values, but the relative order of the generated values will tell you which row was inserted after which.
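
For example (the sample values are mine):

-- PK is generated by the server; only Id1 and Id2 are supplied.
INSERT INTO Data (Id1, Id2) VALUES (1, 10);
INSERT INTO Data (Id1, Id2) VALUES (1, 20);

-- Ordering by PK reflects the order in which the rows were inserted.
SELECT PK, Id1, Id2
FROM Data
ORDER BY PK;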

Odd Error Using HierarchyID with VB

I found the answer. I added the following to my app.config file:

<runtime>
  <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
    <dependentAssembly>
      <assemblyIdentity name="Microsoft.SqlServer.Types" publicKeyToken="89845dcd8080cc91" culture="neutral" />
      <bindingRedirect oldVersion="10.0.0.0" newVersion="13.0.0.0" />
    </dependentAssembly>
  </assemblyBinding>
</runtime>

Note the "newVersion" attribute. That is what tells Visual Studio which version of Microsoft.SqlServer.Types to use.

SQL Server 2012 Express Edition Database Size

I can see that the first query gets the overall size of the database for the data and log files, and the second one gives the size per table. So I would say yes to both.

Based upon my experience with databases over 40 GB, and this link on maximum DB size limits, the limit on SQL Server Express is based upon the mdf and ndf files, not the ldf.
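
To see where you stand, you can check the file sizes yourself (a query I'm adding; the Express limit applies per database to the data files):

-- Data (mdf/ndf) vs log (ldf) file sizes for the current database, in MB.
-- size is reported in 8 KB pages.
SELECT name, type_desc, size * 8 / 1024 AS size_mb
FROM sys.database_files;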

You might be safer, however, just to go with SQL Server Standard and use CAL licensing in case your database starts growing.

Good Luck!


