Is Order Guaranteed When Inserting Multiple Rows with Identity

A very similar question has been asked before.

You can specify an ORDER BY in the INSERT.

If you do that, the order in which the IDENTITY values are generated is guaranteed to match the specified ORDER BY in the INSERT.

Using your example:

DECLARE @blah TABLE
(
    ID INT IDENTITY(1, 1) NOT NULL,
    Name VARCHAR(100) NOT NULL
);

INSERT INTO @blah (Name)
SELECT T.Name
FROM
(
    VALUES
    ('Timmy'),
    ('Jonny'),
    ('Sally')
) AS T(Name)
ORDER BY T.Name;

SELECT
    T.ID
    ,T.Name
FROM @blah AS T
ORDER BY T.ID;

The result is:

+----+-------+
| ID | Name  |
+----+-------+
|  1 | Jonny |
|  2 | Sally |
|  3 | Timmy |
+----+-------+

That is, the Name values have been sorted and the IDs have been generated in that order. It is guaranteed that Jonny will have the lowest ID, Timmy will have the highest ID, and Sally will have an ID between them. There may be gaps between the generated ID values, but their relative order is guaranteed.

If you don't specify ORDER BY in INSERT, then resulting IDENTITY IDs can be generated in a different order.

Mind you, there is no guarantee about the actual physical order of rows in the table, even with ORDER BY in the INSERT; the only guarantee is the order of the generated IDs.

In the question "INSERT INTO as SELECT with ORDER BY", Umachandar Jayachandran from Microsoft said:

The only guarantee is that the identity values will be generated based
on the ORDER BY clause. But there is no guarantee for the order of
insertion of the rows into the table.

And he gave a link to Ordering guarantees in SQL Server, where Conor Cunningham from SQL Server Engine Team says:


  4. INSERT queries that use SELECT with ORDER BY to populate rows guarantees how identity values are computed but not the order in which the rows are inserted

The comments on that post link to a Microsoft knowledge base article, "The behavior of the IDENTITY function when used with SELECT INTO or INSERT .. SELECT queries that contain an ORDER BY clause", which explains it in more detail. It says:

If you want the IDENTITY values to be assigned in a sequential fashion
that follows the ordering in the ORDER BY clause, create a table that
contains a column with the IDENTITY property and then run an INSERT ... SELECT ... ORDER BY query to populate this table.

I would consider this KB article official documentation, and this behaviour guaranteed.

Order guarantee for identity assignment in multi-row insert in SQL Server

Piggybacking on my comment above: an INSERT ... SELECT with ORDER BY guarantees the order in which identity values are generated (#4 in this blog post).

So you can use the table value constructor as follows to accomplish your goal (I'm not sure whether this satisfies your other constraints), assuming you want the identity values to be generated in CategoryId order.

insert into thetable(CategoryId, CategoryName)
select *
from
(values
(101, 'Bikes'),
(103, 'Clothes'),
(102, 'Accessories')
) AS Category(CategoryID, CategoryName)
order by CategoryId

Does SQL Server guarantee sequential inserting of an identity column?

Guaranteed as in absolutely under no circumstances whatsoever could you possibly get a value that might be less than or equal to the current maximum value? No, there is no such guarantee. That said, the circumstances under which that scenario could happen are limited:

  1. Someone turns on IDENTITY_INSERT for the table (which disables automatic generation) and inserts an explicit value.
  2. Someone reseeds the identity column.
  3. Someone changes the sign of the increment value (i.e. instead of +1 it is changed to -1)
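
The first two cases can be reproduced with a short sketch. The table dbo.IdentityGapDemo and its columns are hypothetical names made up for this illustration, and the table deliberately has no unique constraint on ID so that a repeated value can be inserted:

```sql
-- Hypothetical demo table (not from the original question).
CREATE TABLE dbo.IdentityGapDemo
(
    ID      INT IDENTITY(1, 1) NOT NULL,
    Payload VARCHAR(20) NOT NULL
);

INSERT INTO dbo.IdentityGapDemo (Payload)
VALUES ('a'), ('b'), ('c');            -- identity generates 1, 2, 3

-- Case 1: an explicit value supplied while IDENTITY_INSERT is ON.
SET IDENTITY_INSERT dbo.IdentityGapDemo ON;
INSERT INTO dbo.IdentityGapDemo (ID, Payload)
VALUES (-5, 'explicit');               -- lower than every existing ID
SET IDENTITY_INSERT dbo.IdentityGapDemo OFF;

-- Case 2: the identity is reseeded below the current maximum.
DBCC CHECKIDENT ('dbo.IdentityGapDemo', RESEED, 0);
INSERT INTO dbo.IdentityGapDemo (Payload)
VALUES ('after reseed');               -- should receive ID 1 again (seed 0 + increment 1)

SELECT ID, Payload
FROM dbo.IdentityGapDemo
ORDER BY ID;

DROP TABLE dbo.IdentityGapDemo;
```

In both cases the newest row ends up with an ID that is not greater than the previous maximum, which is exactly the situation described above.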

Assuming none of these circumstances, you are safe from race conditions creating a situation where the next value is lower than an existing value. That said, there is no guarantee that the rows will be committed in the order of their identity values. For example:

  1. Open a transaction, insert into your table with an identity column. Let's say it gets the value 42.
  2. Insert and commit into the same table another value. Let's say it gets value 43.

Until the first transaction is committed, 43 exists but 42 does not. The identity column simply reserves a value; it does not dictate the order of commits.
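
The two steps can be sketched with two connections. This is only an illustration under assumptions: dbo.CommitOrderDemo is a hypothetical table, the default READ COMMITTED isolation level is assumed, and the READPAST hint is used so that the reader skips the row that is still locked instead of waiting for it. The actual identity values will differ from 42 and 43.

```sql
-- Hypothetical demo table (not from the original question).
CREATE TABLE dbo.CommitOrderDemo
(
    ID      INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    Payload VARCHAR(20) NOT NULL
);

-- Session A: reserves the lower identity value but does not commit yet.
BEGIN TRANSACTION;
INSERT INTO dbo.CommitOrderDemo (Payload) VALUES ('first');
-- ... transaction intentionally left open ...

-- Session B (a separate connection): inserts and commits immediately,
-- receiving the next, higher identity value.
INSERT INTO dbo.CommitOrderDemo (Payload) VALUES ('second');

-- Session B: READPAST skips the row still locked by session A,
-- so only the higher ID is visible at this point.
SELECT ID, Payload
FROM dbo.CommitOrderDemo WITH (READPAST)
ORDER BY ID;

-- Session A: only now does the row with the lower ID become visible.
COMMIT TRANSACTION;
```

A process that polls for "rows with an ID greater than the last one I saw" can therefore miss the row inserted by session A.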

How can I insert multiple rows into a table and get all new identity values in order?

Yes, the insert will always work. Once you include the ORDER BY, the identity values will be generated in that order.

Here I changed the order of the staging rows. By the way, you don't need OUTPUT.

SQL DEMO

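-- Staging rows are deliberately inserted out of TrackingId order: 201, 204, 203, 202
insert #Staging (TrackingId, Value) values (201,1000),(204,2000),(203,2000),(202,1000);

-- TrackingID is written to the Value column, so it can be compared with the generated identity
INSERT INTO #Target (Value /* , other fields */)
SELECT TrackingID /* , other fields */
FROM #Staging
ORDER BY TrackingID
;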

SELECT *
FROM #Target;
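
The snippet above assumes that #Staging and #Target already exist. A minimal self-contained sketch follows; the column definitions, and in particular the name TargetId for the identity column, are assumptions, since the original table definitions are not shown here.

```sql
-- Assumed definitions; the original tables are not shown in this answer.
CREATE TABLE #Staging
(
    TrackingId INT NOT NULL,
    Value      INT NOT NULL
);

CREATE TABLE #Target
(
    TargetId INT IDENTITY(1, 1) NOT NULL,  -- hypothetical identity column name
    Value    INT NOT NULL
);

-- Rows arrive out of TrackingId order.
INSERT INTO #Staging (TrackingId, Value)
VALUES (201, 1000), (204, 2000), (203, 2000), (202, 1000);

-- Identity values are generated in TrackingId order because of the ORDER BY.
INSERT INTO #Target (Value)
SELECT TrackingId
FROM #Staging
ORDER BY TrackingId;

-- Read back with an explicit ORDER BY; a SELECT without one
-- makes no promise about the order of the returned rows.
SELECT TargetId, Value
FROM #Target
ORDER BY TargetId;

DROP TABLE #Staging;
DROP TABLE #Target;
```

Given the guarantee discussed here, the TrackingId values stored in Value should ascend together with TargetId.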

Please read the comments below that article, in particular this answer from the author:

  • Could you elaborate on statement #4.

Yes, the identity values will be generated in the sequence established by the ORDER BY. If a clustered index exists on the identity column, then the values will be in the logical order of the index keys. This still doesn’t guarantee physical order of insertion. Index maintenance is a different step and that could also be done in parallel for example. So you could end up generating the identity values based on ORDER BY clause and then feeding those rows to the clustered index insert operator which will perform the maintenance task. You can see this in the query plan. You should really NOT think about physical operations or order but instead think of a table as a unordered set of rows. The index can be used to sort rows in logical manner (using ORDER BY clause) efficiently.

Is an IDENTITY column auto-incremented before or after an order by clause is applied to it?

The SQL Server Engine Team have made this blog post:

INSERT queries that use SELECT with ORDER BY to populate rows guarantees how identity values are computed but not the order in which the rows are inserted

They clarify what "the order in which the rows are inserted" means in the comments:

Yes, the identity values will be generated in the sequence established by the ORDER BY. If a clustered index exists on the identity column, then the values will be in the logical order of the index keys. This still doesn't guarantee physical order of insertion. Index maintenance is a different step and that could also be done in parallel for example. So you could end up generating the identity values based on ORDER BY clause and then feeding those rows to the clustered index insert operator which will perform the maintenance task.

Can I use a SQL Server identity column to determine the inserted order of rows?

Largely yes, as long as you don't ever reset it or insert rows with bulk copy, or use IDENTITY_INSERT. And of course assuming that you don't overflow the data-type (which could be impressive).

SQL Insert multiple rows for every id returned from another table if the row does not exist for that id

I would suggest an INSERT... SELECT like so:

INSERT into table2 (table1_id, column3, column4)
SELECT t1.id, s.str, s.str
FROM table1 AS t1
CROSS JOIN (SELECT "fizz" AS str UNION SELECT "buzz" UNION SELECT "hello world") AS s
LEFT JOIN table2 AS t2 ON t1.id = t2.table1_id AND s.str = t2.column3
WHERE t1.thing1 = true
AND t2.id IS NULL -- Only insert when they are not already present
;

However, this will not guarantee the strings are inserted in the order you've shown.

I have not had much call to use CROSS JOINs, so I am not sure how well they play with LEFT JOINs, so if the above does not work out quite right, here are some alternatives below:

INSERT into table2 (table1_id, column3, column4)
SELECT t1.id, s.str, s.str
FROM table1 AS t1
CROSS JOIN (SELECT "fizz" AS str UNION SELECT "buzz" UNION SELECT "hello world") AS s
WHERE t1.thing1 = true
AND (t1.id, s.str) NOT IN (SELECT table1_id, column3 FROM table2 )
;

or

INSERT into table2 (table1_id, column3, column4)
SELECT t1.id, s.str, s.str
FROM table1 AS t1
CROSS JOIN (SELECT "fizz" AS str UNION SELECT "buzz" UNION SELECT "hello world") AS s
WHERE t1.thing1 = true
AND NOT EXISTS (
SELECT *
FROM table2 AS t2
WHERE t2.table1_id = t1.id AND t2.column3 = s.str
)
;

In SQL Server, the union subquery (including its surrounding parentheses and alias) can be replaced with (VALUES ('fizz'), ('buzz'), ('hello world')) AS s(str).
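
For SQL Server specifically, the NOT EXISTS variant could look like the sketch below. The temp table definitions, the sample rows, and the extra ord column are assumptions added for illustration. The ord column combined with ORDER BY ties this back to the identity discussion above: it asks for the identity values of the new rows to be generated in the listed string order.

```sql
-- Hypothetical schemas and sample data (not from the original question).
CREATE TABLE #table1
(
    id     INT NOT NULL PRIMARY KEY,
    thing1 BIT NOT NULL
);

CREATE TABLE #table2
(
    id        INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    table1_id INT NOT NULL,
    column3   VARCHAR(20) NOT NULL,
    column4   VARCHAR(20) NOT NULL
);

INSERT INTO #table1 (id, thing1) VALUES (1, 1), (2, 0), (3, 1);
INSERT INTO #table2 (table1_id, column3, column4) VALUES (1, 'fizz', 'fizz');

-- SQL Server has no boolean literal, so the BIT column is compared with 1.
INSERT INTO #table2 (table1_id, column3, column4)
SELECT t1.id, s.str, s.str
FROM #table1 AS t1
CROSS JOIN (VALUES (1, 'fizz'), (2, 'buzz'), (3, 'hello world')) AS s(ord, str)
WHERE t1.thing1 = 1
  AND NOT EXISTS (
      SELECT 1
      FROM #table2 AS t2
      WHERE t2.table1_id = t1.id AND t2.column3 = s.str
  )
ORDER BY t1.id, s.ord;   -- identity values follow this order

SELECT id, table1_id, column3, column4
FROM #table2
ORDER BY id;

DROP TABLE #table1;
DROP TABLE #table2;
```

The existing ('fizz') row for id 1 is skipped, and the row with thing1 = 0 contributes nothing.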


