Simple Proof That Guid Is Not Unique

Simple proof that GUID is not unique

Kai, I have provided a program that will do what you want using threads. It is licensed under the following terms: you must pay me $0.0001 per hour per CPU core you run it on. Fees are payable at the end of each calendar month. Please contact me for my paypal account details at your earliest convenience.

using System;
using System.Collections.Generic;
using System.Linq;

namespace GuidCollisionDetector
{
class Program
{
static void Main(string[] args)
{
//var reserveSomeRam = new byte[1024 * 1024 * 100]; // This indeed has no effect.

Console.WriteLine("{0:u} - Building a bigHeapOGuids.", DateTime.Now);
// Fill up memory with guids.
var bigHeapOGuids = new HashSet<Guid>();
try
{
do
{
bigHeapOGuids.Add(Guid.NewGuid());
} while (true);
}
catch (OutOfMemoryException)
{
// Release the ram we allocated up front.
// Actually, these are pointless too.
//GC.KeepAlive(reserveSomeRam);
//GC.Collect();
}
Console.WriteLine("{0:u} - Built bigHeapOGuids, contains {1} of them.", DateTime.Now, bigHeapOGuids.LongCount());


// Spool up some threads to keep checking if there's a match.
// Keep running until the heat death of the universe.
for (long k = 0; k < Int64.MaxValue; k++)
{
for (long j = 0; j < Int64.MaxValue; j++)
{
Console.WriteLine("{0:u} - Looking for collisions with {1} thread(s)....", DateTime.Now, Environment.ProcessorCount);
System.Threading.Tasks.Parallel.For(0, Int32.MaxValue, (i) =>
{
if (bigHeapOGuids.Contains(Guid.NewGuid()))
throw new ApplicationException("Guids collided! Oh my gosh!");
}
);
Console.WriteLine("{0:u} - That was another {1} attempts without a collision.", DateTime.Now, ((long)Int32.MaxValue) * Environment.ProcessorCount);
}
}
Console.WriteLine("Umm... why hasn't the universe ended yet?");
}
}
}

PS: I wanted to try out the Parallel extensions library. That was easy.

And using OutOfMemoryException as control flow just feels wrong.

EDIT

Well, it seems this still attracts votes. So I've fixed the GC.KeepAlive() issue. And changed it to run with C# 4.

And to clarify my support terms: support is only available on the 28/Feb/2010. Please use a time machine to make support requests on that day only.

EDIT 2
As always, the GC does a better job than I do at managing memory; any previous attempts at doing it myself were doomed to failure.

Is a GUID unique 100% of the time?


While each generated GUID is not
guaranteed to be unique, the total
number of unique keys (2128 or
3.4×1038) is so large that the probability of the same number being
generated twice is very small. For
example, consider the observable
universe, which contains about 5×1022
stars; every star could then have
6.8×1015 universally unique GUIDs.

From Wikipedia.


These are some good articles on how a GUID is made (for .NET) and how you could get the same guid in the right situation.

https://ericlippert.com/2012/04/24/guid-guide-part-one/

https://ericlippert.com/2012/04/30/guid-guide-part-two/

https://ericlippert.com/2012/05/07/guid-guide-part-three/

​​

how a GUID is generated to be unique and is it always unique?

GUIDS aren't guaranteed to be unique, however they are said to be "statistically unique" since the probability of creating the same guid twice is statistically insignificant. For all intensive purposes, you'll be safe to assume that any GUID you create (properly) is unique.

As to how GUIDs are generated, they use different algorithms depending on the type of GUID. For instance, SQL Server's "uniqueidentifier" datatype uses a V4 GUID which is based on a pseudorandom data generator whereas other versions of GUIDs can be based on hash functions or a combination of MAC adress and time.

Is it possible to generate a duplicate GUID? Well of course, but the chances of it happening are just statistically insignificant.

Trust the GUID. It exists to make your life easier :)

Always unique Guid not unique enough


I have a client which REQUIRES me to generate completely unique ids which NEVER, EVER, duplicate.

That's of course not possible. See the Pigeonhole principle.

A UUID is a 128-bit integer, so there can be 2^128 unique IDs. So if you generate 2^128 + 1 IDs, you have at least one duplicate.


For real world scenarios, Guid.NewGuid() is good enough.

Is it safe to assume a GUID will always be unique?

Yes, you can. Since GUIDs are 128 bits long, there is admittedly a minute possibility of a clash—but the word "minute" is nowhere near strong enough. There are so many GUIDs that if you generate several trillion of them randomly, you're still more likely to get hit by a meteorite than to have even one collision (from Wikipedia). And if you aren't generating them randomly, but are e.g. using the MAC-address-and-time-stamp algorithm, then they're also going to be unique, as MAC addresses are unique among computers and time stamps are unique on your computer.

Edit 1: To answer your bonus question, the optimal way to test a set of GUIDs for uniqueness is to just assume that they are all are unique. Why? Because, given the number of GUIDs you're generating, the odds of a GUID collision are smaller than the odds of a cosmic ray flipping a bit in your computer's memory and screwing up the answer given by any "accurate" algorithm you'd care to run. (See this StackOverflow answer for the math.)

There are an enormous number of GUIDs out there. To quote Douglas Adams's Hitchhiker's Guide to the Galaxy:

"Space," it says, "is big. Really big. You just won't believe how vastly hugely mindbogglingly big it is. I mean you may think it's a long way down the road to the chemist, but that's just peanuts to space, listen…"

And since there are about 7×1022 stars in the universe, and just under 2128 GUIDs, then there are approximately 4.86×1015—almost five quadrillion—GUIDs for every single star. If every one of those stars had a world with a thriving population like ours, then around each and every star, every human or alien who had ever lived would be entitled to over forty-five thousand GUIDs. For every person in history at every star in the universe. The GUID space is at the same level of hugeness as the size of the entire universe. You do not need to worry.

(Edit 2: Reflecting on this: wow. I hadn't realized myself what this meant. The GUID space is incomprehensibly massive. I'm sort of in awe of it.)

How to make sure a generated guid is unique globally?

It isnt, but the way it is generated and the way it is represented makes probability of generating two same GUIDs in this milenium almost zero.

See: Simple proof that GUID is not unique

Do I need to verify the uniqueness of a GUID?

This should be completely unneccessary - the whole point of GUIDs is to eliminate the need for these sorts of checks :-)

You may be interesting in reading this interesting post on GUID generation algorithms:

  • GUIDs are globally unique, but substrings of GUIDs aren't (The Old New Thing)

The goal of this algorithm is to use the combination of time and location ("space-time coordinates" for the relativity geeks out there) as the uniqueness key. However, timekeeping is not perfect, so there's a possibility that, for example, two GUIDs are generated in rapid succession from the same machine, so close to each other in time that the timestamp would be the same. That's where the uniquifier comes in. When time appears to have stood still (if two requests for a GUID are made in rapid succession) or gone backward (if the system clock is set to a new time earlier than what it was), the uniquifier is incremented so that GUIDs generated from the "second time it was five o'clock" don't collide with those generated "the first time it was five o'clock".

The only real way that you might ever have a collision is if someone was generating thousands of GUIDs on the same machine while also repeatedly setting the timestamp back to the same exact point in time.

How to make sure a generated guid is unique globally?

It isnt, but the way it is generated and the way it is represented makes probability of generating two same GUIDs in this milenium almost zero.

See: Simple proof that GUID is not unique

Are Guids guaranteed to be unique on creation?

Guids can be considered unique, as per the other answers.

Given that, the problem will lie in the ClientId property which must be either static or something similar (even ThreadStatic), which will cause the logon sessions to appear to "blend":

  • Session 1 Login ClientId == GuidA
  • Session 2 Login ClientId == GuidB
  • Session 1 Service Call ClientId == GuidB (as it was overridden by Session 2)

Edit

So just to clarify, static variables are available to all threads (therefore all requests) in an AppDomain. Therefore there is only one such variable, and each request is updating that same variable. How to avoid this in your scenario? ... that would be a different question.



Related Topics



Leave a reply



Submit