How SQL Server creates uniqueidentifier using NEWID()
Yes, there is no chance of a duplicate between machines.
NEWID()
is based on a combination of a pseudo random number (from the clock) and the MAC address of the primary NIC.
However, inserting random numbers like this as the clustered key on a table is terrible for performance. You should consider either NEWSEQUENTIALID()
or a COMB-type function for generating GUIDs that still offer the collision-avoidance benefits of NEWID()
while still maintaining acceptable INSERT
performance.
Does manipulating NEWID() still make it unique?
The NEWID()
will be unique, but the result will not be. The NEWID()
is 36 characters. This is using the checksum, which is a value up to about 4 billion and then converting it back to a sequence of numbers.
There are way more NEWID()
values than possible values for this expression. By something called the pigeonhole principle, there will be duplicated values.
NEWID() is generating the same value for all records in a relation
Because you bound value to variable you get always the same value, you need to generate it per row. Use:
INSERT INTO [MetaData Dummy].dbo.xxxx (someid, somename, LastUpdated, RecordSource)
SELECT NEWID(), TempCol, GETDATE(), xxx
FROM AP_DevIDs
MS SQL table with GUID All Zeros
uniqueidentifier data type does not mean that it will be unique. If you generate NEWID()
then it generates unique id but again there is always a probability that same id may be generated.
For 0's
insert into t values ('00000000-0000-0000-0000-000000000000');
insert into t values ('00000000-0000-0000-0000-000000000000');
insert into t values (newid());
statements are valid. If your uid column is not primary key or has unique index on it, dublicate keys can be added to the table.
If you add check constraint to your table you can restrict and also identify the root cause of the problem
create table t (
id uniqueidentifier unique
CONSTRAINT CHK_uid CHECK (id != '00000000-0000-0000-0000-000000000000')
);
GO
✓
insert into t values ('00000000-0000-0000-0000-000000000000');
insert into t values ('00000000-0000-0000-0000-000000000000');
insert into t values ('00000000-0000-0000-0000-000000000000');
insert into t values (newid());
GO
Msg 547 Level 16 State 0 Line 1
The INSERT statement conflicted with the CHECK constraint "CHK_uid". The conflict occurred in database "fiddle_80c5a5fe96ab4e73ac5dafbb2256025d", table "dbo.t", column 'id'.
Msg 547 Level 16 State 0 Line 2
The INSERT statement conflicted with the CHECK constraint "CHK_uid". The conflict occurred in database "fiddle_80c5a5fe96ab4e73ac5dafbb2256025d", table "dbo.t", column 'id'.
Msg 547 Level 16 State 0 Line 3
The INSERT statement conflicted with the CHECK constraint "CHK_uid". The conflict occurred in database "fiddle_80c5a5fe96ab4e73ac5dafbb2256025d", table "dbo.t", column 'id'.
Msg 3621 Level 0 State 0 Line 1
The statement has been terminated.
Msg 3621 Level 0 State 0 Line 2
The statement has been terminated.
Msg 3621 Level 0 State 0 Line 3
The statement has been terminated.
select * from t
GO
| id |
| :----------------------------------- |
| ddeb79f6-dc0f-4c6a-a065-2083d39a78c1 |
db<>fiddle here
Why does using NEWID() use more space than NEWSEQUENTIALID()?
This is related to partitioning. Basically, newId()
will create GUIDs in random order, which means that you're inserting into the middle of the table all the time. Sequential IDs, on the other hand, will always append to the end of the table, which is much simpler.
If you want to know more, look at some materials on paging. A good start might be the official MSDN page on MS SQL paging - http://technet.microsoft.com/en-us/library/ms190969(v=sql.105).aspx
You also have to understand that rows are inherently organized by ID in the physical files that store the database data. A file with no spaces between IDs (such as when using identity columns and no deletion) can take less space to store the same amount of data.
I'd expect that a full shrink of the database will significantly reduce the amount of space lost to fragmentation in MyGuid table, while it will do very little to MyGuidSeq size.
If you can use sequential GUIDs, do so - they improve INSERT efficiency a lot, and by extension, indices can also be less fragmented and smaller overall.
You're not showing the "time taken" debug outputs, but I expect that those are significantly different as well (even though this can be very much offset by the memory available to the database - it doesn't need to change the data files immediately; if you want to know more about this, look up something about transaction logs).
Is uniqueidentifier unique across databases?
If you use the built-in methods of uniqueidentifier
generation (like newId()
or C#'s Guid.NewGuid()
), yes, it will be unique across databases, servers, countries, whatever.
In fact, that's one of the big uses of GUIDs - replication. If you have the same GUID in two databases, it's guaranteed that it was put there on purpose.
However, do note that GUIDs do have their shortcomings - they might make your indices perform worse (or at least require more maintenance), and they are bigger in general.
Also, GUIDs aren't entirely random - there's a few different GUID generating algorithms, some of which are inherently unique (e.g. using a MAC address as part of the GUID - MACs are unique by default, although you can override them manually) - so one part is unique per physical server, and the machine makes sure it doesn't use the same timestamp for two GUIDs. There's also sequential GUIDs (newSequentialId()
), which are handy in avoiding index fragmentation (very useful for clustered indices of course) - do note that those depend on the MAC address of the computer, and they are predictable, so if you're making the GUID public, and you depend on them being "secret", you might not want to use those. Some GUID algorithms are more predictable than others.
How to automatically generate unique id in SQL like UID12345678?
The only viable solution in my opinion is to use
- an
ID INT IDENTITY(1,1)
column to get SQL Server to handle the automatic increment of your numeric value - a computed, persisted column to convert that numeric value to the value you need
So try this:
CREATE TABLE dbo.tblUsers
(ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
UserID AS 'UID' + RIGHT('00000000' + CAST(ID AS VARCHAR(8)), 8) PERSISTED,
.... your other columns here....
)
Now, every time you insert a row into tblUsers
without specifying values for ID
or UserID
:
INSERT INTO dbo.tblUsersCol1, Col2, ..., ColN)
VALUES (Val1, Val2, ....., ValN)
then SQL Server will automatically and safely increase your ID
value, and UserID
will contain values like UID00000001
, UID00000002
,...... and so on - automatically, safely, reliably, no duplicates.
Update: the column UserID
is computed - but it still OF COURSE has a data type, as a quick peek into the Object Explorer reveals:
Issues with getting uniqueidentifier auto generate in SQL on insert
In case Table2 is not yet created you can use this query:
SELECT
NEWID() AS [ID]
,Description AS [Item Description]
,Cost AS [Item Cost]
INTO Table2
FROM Table1
WHERE Id = '1'
But, in case the schema for Table 2 has already been created, and it has a column for the identifier then you can use:
INSERT INTO Table2 ([ID], [Item Description], [Item Cost])
SELECT
NEWID()
, Description
, Cost
FROM Table1
WHERE Id = '1'
But if you haven't created the schema for Table2 yet, then I recommend using the following code to create both the schema for the table and populate the data in it, using data from Table1
.
CREATE TABLE Table2 (
[ID] INT Identity
,[Item Description] AS NVARCHAR(MAX)
,[Item Cost] AS NUMERIC(18, 2))
INSERT INTO Table2([Item Description] , [Item Cost])
SELECT
Description
, Cost
FROM Table1
WHERE Id = '1'
Setting the ID column as Identity will auto-generate a UNIQUE identifier number for each row which you don't have to worry about populating. The number is incremental and it increments by default by 1 for each row, starting from 1.
You can also define a starting number and a incremental value by defining the [ID] column as [ID] INTEGER IDENTITY(1000, 2)
and this will make the starting value 1000 and the increment will be 2 (but this is just FYI).
Related Topics
The Version of SQL Server in Use Does Not Support Datatype Datetime2
Anonymizing Customer Data for Development or Testing
Using Hibernate's Criteria and Projections to Select Multiple Distinct Columns
How to Store a String Var Greater Than Varchar(Max)
How to Escape Ampersand in Toad
Custom Order by to Ignore 'The'
How to Add Offset in a "Select" Query in Oracle 11G
Inventory Average Cost Calculation in SQL
Maintaining Subclass Integrity in a Relational Database
Multiple Constraints in Table: How to Get All Violations
What SQLite Column Name Can Be/Cannot Be
Can SQL Server SQL_Latin1_General_Cp1_Ci_As Be Safely Converted to Latin1_General_Ci_As