How to Create Deterministic Guids

As mentioned by @bacar, RFC 4122 §4.3 defines a way to create a name-based UUID. The advantage of doing this (over just using an MD5 hash) is that these are guaranteed not to collide with non-name-based UUIDs, and have a very (very) small possibility of collision with other name-based UUIDs.

There's no native support in the .NET Framework for creating these, but I posted code on GitHub that implements the algorithm. It can be used as follows:

Guid guid = GuidUtility.Create(GuidUtility.UrlNamespace, filePath);

To reduce the risk of collisions with other GUIDs even further, you could create a private GUID to use as the namespace ID (instead of using the URL namespace ID defined in the RFC).
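
As a rough illustration of the same idea outside .NET, here is a minimal Python sketch using the standard library's `uuid.uuid5`, which produces a version 5 (SHA-1) name-based UUID. The file path and the private namespace value are made-up placeholders, and whether the output matches `GuidUtility.Create` depends on the version and string encoding that library uses, which is not verified here.

```python
import uuid

# Hypothetical file path used purely for illustration.
file_path = "/data/reports/2024/summary.txt"

# Name-based (version 5, SHA-1) UUID in the RFC 4122 URL namespace.
guid = uuid.uuid5(uuid.NAMESPACE_URL, file_path)

# A private namespace: generate one random UUID once, then hard-code it.
# The value below is an arbitrary example, not a registered namespace.
PRIVATE_NAMESPACE = uuid.UUID("0f3c2a9e-5b7d-4c1e-9a64-2d8e7b1f6c30")
private_guid = uuid.uuid5(PRIVATE_NAMESPACE, file_path)

print(guid.version)          # 5
print(guid != private_guid)  # True
```

The same namespace and name always give the same UUID, while switching to the private namespace gives an unrelated one for the same name.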

How to create a deterministic uniqueidentifier (GUID) from an integer value

Stop thinking about the problem from a "string" perspective. An int is made up of 4 bytes. A uniqueidentifier is made up of 16 bytes. You can easily take 12 fixed bytes, append the four bytes from an int to the end of those, and get a solution that works for all int values:

DECLARE @Unit TABLE
(
    UniqueColumn UNIQUEIDENTIFIER DEFAULT NEWID(),
    Characters VARCHAR(10),
    IntegerId INT
)

-- Add *3* data rows
INSERT INTO @Unit(Characters, IntegerId) VALUES ('abc', 1111),('def', 2222),('ghi',-17)

-- Deterministically creates a uniqueidentifier value out of an integer value.
DECLARE @GuidPrefix binary(12) = 0xefbeadde0000000000000000
UPDATE @Unit
SET UniqueColumn = CONVERT(uniqueidentifier,@GuidPrefix + CONVERT(binary(4),IntegerId))

-- Check the result
SELECT * FROM @Unit

Result:

UniqueColumn                         Characters IntegerId
------------------------------------ ---------- -----------
DEADBEEF-0000-0000-0000-000000000457 abc        1111
DEADBEEF-0000-0000-0000-0000000008AE def        2222
DEADBEEF-0000-0000-0000-0000FFFFFFEF ghi        -17

(SQL Server stores the first four bytes of a uniqueidentifier in the reverse of the order in which they are displayed as a string, which is why the binary has to start with efbeadde if we want the result to display as DEADBEEF.)

Also, of course, insert usual warnings that if you're creating guids/uniqueidentifiers but not using one of the prescribed methods for generating them, then you cannot assume any of the usual guarantees about uniqueness.
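
To make the byte layout concrete, here is a small Python sketch of the same construction. It assumes that `uuid.UUID(bytes_le=...)` reads the first three groups with the same little-endian layout SQL Server uses for uniqueidentifier, and that `CONVERT(binary(4), IntegerId)` yields big-endian bytes; the function name `guid_from_int` is invented for this example.

```python
import struct
import uuid

# Same 12 fixed bytes as @GuidPrefix in the T-SQL above.
GUID_PREFIX = bytes.fromhex("efbeadde0000000000000000")

def guid_from_int(value: int) -> uuid.UUID:
    """Append a signed 32-bit big-endian integer to the fixed prefix."""
    raw = GUID_PREFIX + struct.pack(">i", value)
    # bytes_le treats the first three groups as little-endian, which is
    # assumed here to match SQL Server's uniqueidentifier layout.
    return uuid.UUID(bytes_le=raw)

for n in (1111, 2222, -17):
    print(str(guid_from_int(n)).upper())
# DEADBEEF-0000-0000-0000-000000000457
# DEADBEEF-0000-0000-0000-0000000008AE
# DEADBEEF-0000-0000-0000-0000FFFFFFEF
```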

Generating UUID based on strings

No, what you propose is not valid because it fundamentally breaks how UUIDs work. Use a real UUID for your namespace.

A convenient (and valid) way to accomplish this is hierarchical namespaces. First, use the standard DNS namespace UUID plus your domain name to generate your root namespace:

Guid nsDNS = new Guid("6ba7b810-9dad-11d1-80b4-00c04fd430c8");
Guid nsRoot = Guid.Create(nsDNS, "myapp.example.com", 5);

Then create a namespace UUID for your string:

Guid nsFoo = Guid.Create(nsRoot, "Foo", 5);

Now you're ready to use your new Foo namespace UUID with individual names:

Guid bar = Guid.Create(nsFoo, "Bar", 5);

The benefit of this is that anyone else will get completely different UUIDs than you, even if their strings (other than the domain, obviously) are identical to yours, preventing collisions if your data sets are ever merged, yet it's completely deterministic, logical and self-documenting.

(Note: I've never actually used C#, so if I got the syntax slightly wrong, feel free to edit. I think the pattern is clear regardless.)
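
For readers who want something they can run, the hierarchical pattern can be sketched in Python with the standard library's `uuid.uuid5` (version 5, as in the calls above). The second domain, `other.example.org`, is a placeholder added only to show that a different root yields different UUIDs.

```python
import uuid

# Root namespace: standard DNS namespace UUID plus a domain you control.
ns_root = uuid.uuid5(uuid.NAMESPACE_DNS, "myapp.example.com")

# Namespace for the string "Foo", then an individual name inside it.
ns_foo = uuid.uuid5(ns_root, "Foo")
bar = uuid.uuid5(ns_foo, "Bar")

# A different root domain gives unrelated UUIDs for the same names.
other_root = uuid.uuid5(uuid.NAMESPACE_DNS, "other.example.org")
other_bar = uuid.uuid5(uuid.uuid5(other_root, "Foo"), "Bar")

print(uuid.NAMESPACE_DNS)  # 6ba7b810-9dad-11d1-80b4-00c04fd430c8
print(bar != other_bar)    # True
```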

How deterministic Are .Net GUIDs?

It's not a complete answer, but I can tell you that the 13th hex digit is always 4 because it denotes the version of the algorithm used to generate the GUID (i.e., v4). Also, to quote Wikipedia:

"Cryptanalysis of the WinAPI GUID generator shows that, since the sequence of V4 GUIDs is pseudo-random, given the initial state one can predict up to the next 250 000 GUIDs returned by the function UuidCreate. This is why GUIDs should not be used in cryptography, e.g., as random keys."

The rest of the article, and its references: http://en.wikipedia.org/wiki/Guid
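
The version digit is easy to check for yourself. This is a small Python sketch rather than .NET code, on the assumption that both follow the same RFC 4122 layout for version 4 GUIDs; `version_digit` is a helper name invented for the example.

```python
import uuid

def version_digit(u: uuid.UUID) -> str:
    """Return the 13th hex digit, which encodes the UUID version."""
    return u.hex[12]  # u.hex is the 32 hex digits without hyphens

for _ in range(5):
    u = uuid.uuid4()
    print(u, "->", version_digit(u))  # the digit is 4 for every random UUID
```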

--Edit--

From a security standpoint, I'd suggest that you generate your session ID however you like and then cryptographically sign it. That way you can pack in whatever information you want and just append a signature at the end; the possible issue is the tradeoff between the size/strength of your key and the resulting size of the cookie. GUIDs are useful as IDs, but I'd only rely on a dedicated cryptographic technique for security.
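
One way to read "sign it" is to append a keyed MAC to the session ID. The Python sketch below uses HMAC-SHA256 as a stand-in for whatever signing scheme you prefer; the key handling, the `sign`/`verify` names, and the `id.signature` token format are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

# Assumption: a real system would load this key from secure configuration.
SECRET_KEY = secrets.token_bytes(32)

def sign(session_id: str) -> str:
    """Return 'session_id.signature' using HMAC-SHA256."""
    mac = hmac.new(SECRET_KEY, session_id.encode("utf-8"), hashlib.sha256)
    return f"{session_id}.{mac.hexdigest()}"

def verify(token: str) -> bool:
    """Check that the signature matches the session ID it is attached to."""
    session_id, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET_KEY, session_id.encode("utf-8"), hashlib.sha256)
    return hmac.compare_digest(signature, expected.hexdigest())

token = sign(secrets.token_urlsafe(16))
print(verify(token))        # True
print(verify(token + "0"))  # False
```

The hex signature adds 64 characters to the cookie, which is the size tradeoff mentioned above.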

Create a consistent GUID or unique identifier from a string

Creating a Guid from a SHA256 hash seems like an easy option:

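// Requires: using System; using System.Linq; using System.Text;
var guid = new Guid(
    System.Security.Cryptography.SHA256.Create()
        .ComputeHash(Encoding.UTF8.GetBytes("70c3bdc5ceeac673"))
        .Take(16) // a Guid needs exactly 16 bytes; SHA256 produces 32
        .ToArray());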

The code discards half of the hash result, but that does not change the fact that the same string is always transformed to the same Guid.

Alternatively, depending on your requirements, just converting the string to a byte array and padding it with zeros (or removing the extra bytes) may be enough.
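
Both options can be sketched in a few lines of Python. The function names are invented for this example, and `bytes_le` is used on the assumption that it mirrors the byte order of .NET's `Guid(byte[])` constructor; neither result carries valid RFC 4122 version bits.

```python
import hashlib
import uuid

def guid_from_string(text: str) -> uuid.UUID:
    """Build a UUID from the first 16 bytes of the SHA-256 hash of text."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # bytes_le is assumed to mirror the layout of .NET's Guid(byte[]).
    return uuid.UUID(bytes_le=digest[:16])

def guid_from_padded(text: str) -> uuid.UUID:
    """Alternative: zero-pad or truncate the UTF-8 bytes to 16 bytes."""
    raw = text.encode("utf-8")[:16].ljust(16, b"\x00")
    return uuid.UUID(bytes_le=raw)

print(guid_from_string("70c3bdc5ceeac673"))
print(guid_from_padded("70c3bdc5ceeac673"))
```

Note that the padding variant maps any two strings sharing their first 16 bytes to the same value, so it only suits short inputs.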

How do I deterministically generate n unique numbers within a range from a GUID?

A guid is effectively a 128-bit number. So you can easily do this provided that the number of bits required to represent your numbers is smaller than the number of bits in the guid (128). You don't need to hash the guid or anything like that.

EDIT:

Now that I know what you need (i.e. a unique seed to be derived from a guid), you could do it this way, but you could equally hand out a 32-bit number and avoid the guid-to-int conversion.

EDIT2: Using GetHashCode as per suggestion from comments above.

EDIT 3: Producing unique numbers.

// Requires: using System; using System.Collections.Generic; using System.Linq;
static void Main(string[] args)
{
    var guid = new Guid("bdc39e63-5947-4704-9e12-ec66c8773742");
    Console.WriteLine(guid);
    var numbers = FindNumbersFromGuid(guid, 16, 8);

    Console.WriteLine("Numbers: ");
    foreach (var elem in numbers)
    {
        Console.WriteLine(elem);
    }
    Console.ReadKey();
}

// Returns numberCount distinct values in [0, maxNumber), seeded from the guid.
private static int[] FindNumbersFromGuid(Guid input,
    int maxNumber, int numberCount)
{
    // Limit the request so the retry loop below does not spin for long.
    if (numberCount > maxNumber / 2) throw new ArgumentException("Choosing too many numbers.");
    var seed = input.GetHashCode();
    var random = new Random(seed);
    var chosenSoFar = new HashSet<int>();
    return Enumerable.Range(0, numberCount)
        .Select(e =>
        {
            // Redraw until we get a value that has not been used yet.
            var ret = random.Next(0, maxNumber);
            while (chosenSoFar.Contains(ret))
            {
                ret = random.Next(0, maxNumber);
            }
            chosenSoFar.Add(ret);
            return ret;
        }).ToArray();
}
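
The same approach is compact in Python, shown here as a sketch rather than a port of the C# above. It seeds the generator with the full 128-bit value instead of a 32-bit hash code and lets `random.Random.sample` handle uniqueness; `numbers_from_guid` is a name invented for the example, and the numbers it yields will not match the C# output.

```python
import random
import uuid

def numbers_from_guid(guid: uuid.UUID, max_number: int, count: int) -> list:
    """Pick `count` distinct integers in [0, max_number), seeded by the GUID."""
    rng = random.Random(guid.int)  # the full 128-bit value is the seed
    return rng.sample(range(max_number), count)

guid = uuid.UUID("bdc39e63-5947-4704-9e12-ec66c8773742")
print(numbers_from_guid(guid, 16, 8))
```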

Deterministic GUID generation in T-SQL

Your best option here is to generate a script that is a list of insert statements.
This script can be generated by another script that builds the dynamic SQL string.
You will probably do this in a loop that calls your guid function once per iteration.

This way you end up with a list of inserts that can be run multiple times and always produces the same guids.

But one of the inputs to guid generation is time, so you will never be able to reproduce a guid by calling the generator again (which is kind of the point, really).
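
A minimal sketch of such a generator, written in Python instead of dynamic SQL to keep it short. The table name `dbo.Widget` and the columns `Id` and `Name` are hypothetical placeholders, and `uuid.uuid4` stands in for whatever guid function you use.

```python
import uuid

def build_insert_script(names, table="dbo.Widget"):
    """Return INSERT statements with GUID literals fixed at generation time.

    The table and column names are hypothetical placeholders.
    """
    lines = []
    for name in names:
        guid = uuid.uuid4()  # generated once, then frozen into the script
        escaped = name.replace("'", "''")  # basic T-SQL quote escaping
        lines.append(
            f"INSERT INTO {table} (Id, Name) VALUES ('{guid}', '{escaped}');"
        )
    return "\n".join(lines)

script = build_insert_script(["abc", "def", "O'Brien"])
print(script)
```

Save the printed script once; rerunning the saved file inserts the same guids every time, whereas rerunning the generator produces new ones.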


