Double-Checked Lock Singleton in C++11

Double-checked locking in C++11?

I'll ignore the singleton bits for a while and assume you need this for lazy initialization and not for silly things like singletons.

I suggest forgetting about double-checked locking. C++ provides a very useful tool for this kind of situation in the form of std::call_once, so use that.

template <typename T>
struct lazy {
public:
    // needs constraining to prevent from doing copies
    // see: http://flamingdangerzone.com/cxx11/2012/06/05/is_related.html
    template <typename Fun>
    explicit lazy(Fun&& fun) : fun(std::forward<Fun>(fun)) {}

    T& get() const {
         std::call_once(flag, [this] { ptr.reset(fun()); });
         return *ptr;
    }
    // more stuff like op* and op->, implemented in terms of get()

private:
    std::once_flag flag;
    std::unique_ptr<T> ptr;
    std::function<T*()> fun;
};

// --- usage ---

lazy<foo> x([] { return new foo; });

Double-Checked Lock Singleton in C++11

That implementation is not race-free. The atomic store of the singleton, while it uses release semantics, will only synchronize with the matching acquire operation—that is, the load operation that is already guarded by the mutex.

It's possible that the outer relaxed load would read a non-null pointer before the locking thread finished initializing the singleton.

The acquire that is guarded by the lock, on the other hand, is redundant. It will synchronize with any store with release semantics on another thread, but at that point (thanks to the mutex) the only thread that can possibly store is the current thread. That load doesn't even need to be atomic—no stores can happen from another thread.

See Anthony Williams' series on C++0x multithreading.

Is implementation of double checked singleton thread-safe?

It's not safe by C++ concurrency rules, since the first read of pInstance is not protected by a lock or something similar and thus does not synchronise correctly with the write (which is protected). There is therefore a data race and thus Undefined Behaviour. One of the possible results of this UB is precisely what you've identified: the first check reading a garbage value of pInstance which is just being written by a different thread.

The common explanation is that it saves on acquiring the lock (a potentially time-expensive operation) in the more common case (pInstance already valid). However, it's not safe.

Since C++11 and beyond guarantees initialisation of function-scope static variables happens only once and is thread-safe, the best way to create a singleton in C++ is to have a static local in the function:

Singleton& Singleton::instance() {
   static Singleton s;
   return s;
}

Note that there's no need for either dynamic allocation or a pointer return type.

As Voo mentioned in comments, the above assumes pInstance is a raw pointer. If it was std::atomic<Singleton*> instead, the code would work just fine as intended. Of course, it's then a question whether an atomic read is that much slower than obtaining the lock, which should be answered via profiling. Still, it would be a rather pointless exercise, since the static local variables is better in all but very obscure cases.

Double-checked locking and the Singleton pattern

You can access disassembly in Debug->Windows->Disassembly.

For class:

class S
{
public:
    static S& getInstance()
    {
        static S instance;
        return instance;
    }
};

You get disassembly:

47: 
48: class S
49: {
50: public:
51:     static S& getInstance()
52:     {
push        ebp  
mov         ebp,esp  
sub         esp,0C0h  
push        ebx  
push        esi  
push        edi  
lea         edi,[ebp-0C0h]  
mov         ecx,30h  
mov         eax,0CCCCCCCCh  
rep stos    dword ptr es:[edi]  
53:         static S instance;
54:         return instance;
mov         eax,offset instance (0D9471h)  
55:     }
pop         edi  
pop         esi  
pop         ebx  
mov         esp,ebp  
pop         ebp  
ret

Double-check locking issues, c++

The problem you describe can only occur if for reasons I cannot imagine the conceptors of the singleton uses an explicit (and broken) 2 steps construction:

     ...
     Guard myGuard(lock_); 
     if (!pInstance_) 
     {
        auto alloc = std::allocator<Singleton>();
        pInstance_ = alloc.allocate(); // SHAME here: race condition
        // eventually other stuff
        alloc.construct(_pInstance);   // anything could have happened since allocation
     }
     ....

Even if for any reason such a 2 step construction was required, the _pInstance member shall never contain anything else that nullptr or a fully constructed instance:

        auto alloc = std::allocator<Singleton>();
        Singleton *tmp = alloc.allocate(); // no problem here
        // eventually other stuff
        alloc.construct(tmp);              // nor here
        _pInstance = tmp;                  // a fully constructed instance

But beware: the fix is only guaranteed on a mono CPU. Things could be much worse on multi core systems where the C++11 atomics semantics are indeed required.

What the correct way when use Double-Checked Locking with memory barrier in c++?

No, it isn't safe. Reading the three paragraphs before the example, and the two after it, the potential problem is a system where the write to pInstance is done (flushed to memory) on thread B before the construction of Singleton has been flushed. Then thread A could read pInstance, see the pointer as non-null, and return it potentially allowing thread A to access the Singleton before thread B has finished storing it into memory.

The first flush is necessary to ensure that the flushing of the writes during construction of Singleton have been completed before you try to use it in a different thread.

Depending on the hardware you're running on this might not be a problem.

Double-Checked Locking Pattern in C++11?

If pInstance is a regular pointer, the code has a potential data race -- operations on pointers (or any builtin type, for that matter) are not guaranteed to be atomic (EDIT: or well-ordered)

If pInstance is an std::atomic<Singleton*> and Lock internally uses an std::mutex to achieve synchronization (for example, if Lock is actually std::lock_guard<std::mutex>), the code should be data race free.

Note that you need both explicit locking and an atomic pInstance to achieve proper synchronization.

Double-Checked Lock Singleton in C++11

Double-checked locking in C++11?

Double-Checked Lock Singleton in C++11

Is implementation of double checked singleton thread-safe?

Double-checked locking and the Singleton pattern

Double-check locking issues, c++

What the correct way when use Double-Checked Locking with memory barrier in c++?

Double-Checked Locking Pattern in C++11?

Related Topics

Leave a reply