Thread Safe Lazy Construction of a Singleton in C++

Thread safe lazy construction of a singleton in C++

Basically, you're asking for synchronized creation of a singleton, without using any synchronization (previously-constructed variables). In general, no, this is not possible. You need something available for synchronization.

As for your other question, yes, static variables which can be statically initialized (i.e. no runtime code necessary) are guaranteed to be initialized before other code is executed. This makes it possible to use a statically-initialized mutex to synchronize creation of the singleton.

From the 2003 revision of the C++ standard:

Objects with static storage duration (3.7.1) shall be zero-initialized (8.5) before any other initialization takes place. Zero-initialization and initialization with a constant expression are collectively called static initialization; all other initialization is dynamic initialization. Objects of POD types (3.9) with static storage duration initialized with constant expressions (5.19) shall be initialized before any dynamic initialization takes place. Objects with static storage duration defined in namespace scope in the same translation unit and dynamically initialized shall be initialized in the order in which their definition appears in the translation unit.

If you know that you will be using this singleton during the initialization of other static objects, I think you'll find that synchronization is a non-issue. To the best of my knowledge, all major compilers initialize static objects in a single thread, so thread-safety during static initialization. You can declare your singleton pointer to be NULL, and then check to see if it's been initialized before you use it.

However, this assumes that you know that you'll use this singleton during static initialization. This is also not guaranteed by the standard, so if you want to be completely safe, use a statically-initialized mutex.

Edit: Chris's suggestion to use an atomic compare-and-swap would certainly work. If portability is not an issue (and creating additional temporary singletons is not a problem), then it is a slightly lower overhead solution.

Make Meyers' Singleton thread safe and fast with lazy evaluation

So seems the answer to my question best expressed Fred Larson in his comment:

"Fast, thread-safe, lazy - pick any two."

Thread Safe singleton in C++ using static member instance (no lazy instantiation)

If you change the singleton instance to be a static class member, the creation of that instance will be thread safe. Whatever you do with it, will of cause still need to be protected in many cases.

Also see When are static C++ class members initialized?

Portable thread-safe lazy singleton

No, this will not work. It is broken.

The problem has little/nothing to do with the compiler. It has to do with the order in which a second CPU will 'see' what the first CPU has done to memory. The memory (and caches) will be consistent, but the timing of WHEN each CPU decides to write or read each part of memory/cache is indeterminate.

So for CPU1:

instance = new Singleton( instanceCreated );
instance->setInstanceCreated();

Let's consider the compiler first. There is NO reason why the compiler doesn't reorder or otherwise alter these functions. Maybe like:

temp_register = new Singleton(instanceCreated);
temp_register->setInstanceCreated();
instance = temp_register;

or many other possibilities - like you said as long as single-threaded observed behaviour is consistent. This DOES include things like " break up the ctor into two functions and call one part of it before the following call and the other part after."

Now, it probably wouldn't break it up into 2 calls, but it would INLINE the ctor, particularly since it is so small. Then, once inlined, everything may be reordered, as if the ctor was broken in 2, for example.

In general, I would say not only is it possible that the compiler reordered things, it is probable - ie for the code you have, there is probably a reordering (once inlined, and inlining is likely) that is 'better' than the order given by the C++ code.

But let's leave that aside, and try to understand the real issues of double-checked locking.
So, let's just assume the compiler didn't reorder anything. What about the CPU? Or more importantly CPUs - plural.

The first CPU, 'CPU1' needs to follow the instructions given by the compiler, in particular, it needs to write to memory the things it has been told to write:

instance,
instanceCreated
other member variable of the Singleton (ie your Singleton does DO something, and has some state, doesn't it?)

Actually, that 'other member variable' stuff is really important. Important for your singleton - that's its real purpose right?, and important for our discussion. So let's give it a name: important_data. ie instance->important_data. And maybe instance->important_function(), which uses important_data. Etc.

As mentioned, let's assume the compiler has written the code such that these items are written in the order you are expecting, namely:

important_data - written inside the ctor, called from
instance = new Singleton(instanceCreated);
instance - assigned right after new/ctor returns
instanceCreated - inside setInstanceCreated()

Now, the CPU hands these writes off to the memory bus. Know what the memory bus does? IT REORDERS THEM. The CPU and architecture has the same constraints as the compiler - ie make sure this one CPU sees things consistently - ie single threaded consistent. So if, for example, instance and instanceCreated are on the same cache-line (highly likely, actually), they might be written together, and since they were just read, that cache-line is 'hot', so maybe they get written FIRST before important_data, so that that cache-line can be retired to make room for the cache-line where important_data lives.

Did you see that? instanceCreated and instance were just committed to memory BEFORE important_data. Note that CPU1 doesn't care, because it is living in a single-threaded world...

So now introduce CPU2:

CPU2 comes in, sees instanceCreated == true and instance != NULL and thus goes off and decides to call Singleton::Instance()->important_function(), which uses important_data, which is uninitialized. CRASH BANG BOOM.

By the way, it gets worse. So far, we've seen that the compiler could reorder, but we're pretending it didn't. Let's go one step further and pretend that CPU1 did NOT reorder any of the memory writing. Are we OK now?

No. Of course not.

Just as CPU1 decided to optimize/reorder its memory writes, CPU2 can REORDER ITS READS!

CPU2 comes in and sees

if (!instanceCreated) ...

so it needs to read instanceCreated. Ever heard of 'speculative execution'? (Great name for a FPS game, by the way). If the memory bus isn't busy doing anything, CPU2 might pre-read some other values 'hoping' that instanceCreated is true. ie it may pre-read important_data for example. Maybe important_data (or the uninitialized, possibly re-claimed-by-the-allocator memory that will become important_data) is already in CPU2's cache. Or maybe (more likely?) CPU2 just free'd that memory, and the allocator wrote NULL in its first 4 bytes (allocators often use that memory for their free-lists), so actually, the memory soon-to-become important_data may actually still be in the write queue of CPU2. In that case, why would CPU2 bother re-reading that memory, when it hasn't even finished writing it yet!? (it wouldn't - it would just get the values from its write-queue.)

Did that make sense? If not, imagine that the value of instance (which is a pointer) is 0x17e823d0. What was that memory doing before it became (becomes) the Singleton? Is that memory still in the write-queue of CPU2?...

Or basically, don't even think about why it might want to do so, but realize that CPU2 might read important_data first, then instanceCreated second. So even though CPU1 may have wrote them in order CPU2 sees 'crap' in important_data, then sees true in instanceCreated (and who knows what in instance!). Again, CRASH BANG BOOM. Or BOOM CRASH BANG, since by now you realize that the order isn't guaranteed...

efficient thread-safe singleton in C++

Your solution is called 'double checked locking' and the way you've written it is not threadsafe.

This Meyers/Alexandrescu paper explains why - but that paper is also widely misunderstood. It started the 'double checked locking is unsafe in C++' meme - but its actual conclusion is that double checked locking in C++ can be implemented safely, it just requires the use of memory barriers in a non-obvious place.

The paper contains pseudocode demonstrating how to use memory barriers to safely implement the DLCP, so it shouldn't be difficult for you to correct your implementation.

How to implement multithread safe singleton in C++11 without using mutex

C++11 removes the need for manual locking. Concurrent execution shall wait if a static local variable is already being initialized.

§6.7 [stmt.dcl] p4

If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization.

As such, simple have a static function like this:

static Singleton& get() {
  static Singleton instance;
  return instance;
}

This will work all-right in C++11 (as long as the compiler properly implements that part of the standard, of course).

Of course, the real correct answer is to not use a singleton, period.

Is this a valid, lazy, thread-safe Singleton implementation for C#?

Jon Skeet has written an article about implementing the Singleton pattern in C#.

The lazy implementation is version 5:

public sealed class Singleton
{
    Singleton()
    {
    }

    public static Singleton Instance
    {
        get
        {
            return Nested.instance;
        }
    }

    class Nested
    {
        // Explicit static constructor to tell C# compiler
        // not to mark type as beforefieldinit
        static Nested()
        {
        }

        internal static readonly Singleton instance = new Singleton();
    }
}

Notice in particular that you have to explicitly declare a constructor even if it is empty in order to make it private.

How can I create a thread-safe singleton pattern in Windows?

If you are are using Visual C++ 2005/2008 you can use the double checked locking pattern, since "volatile variables behave as fences". This is the most efficient way to implement a lazy-initialized singleton.

From MSDN Magazine:

Singleton* GetSingleton()
{
    volatile static Singleton* pSingleton = 0;

    if (pSingleton == NULL)
    {
        EnterCriticalSection(&cs);

        if (pSingleton == NULL)
        {
            try
            {
                pSingleton = new Singleton();
            }
            catch (...)
            {
                // Something went wrong.
            }
        }

        LeaveCriticalSection(&cs);
    }

    return const_cast<Singleton*>(pSingleton);
}

Whenever you need access to the singleton, just call GetSingleton(). The first time it is called, the static pointer will be initialized. After it's initialized, the NULL check will prevent locking for just reading the pointer.

DO NOT use this on just any compiler, as it's not portable. The standard makes no guarantees on how this will work. Visual C++ 2005 explicitly adds to the semantics of volatile to make this possible.

You'll have to declare and initialize the CRITICAL SECTION elsewhere in code. But that initialization is cheap, so lazy initialization is usually not important.

Why is this singleton construction code NOT thread safe?

If two threads run the if (_instance == null) check at the same time while there's no singleton instance created, they will both try to invoke new to create the singleton instance and store the references to them into the same variable.

Since the reference they will try to store to will be shared between threads this action will not be thread-safe. Also it might happen that creating two instances of the singletone class will break the program.

Thread Safe Lazy Construction of a Singleton in C++