What's wrong with this fix for double checked locking?

戏子无情 提交于 2019-12-30 06:47:53

问题


So I've seen a lot of articles now claiming that on C++ double checked locking, commonly used to prevent multiple threads from trying to initialize a lazily created singleton, is broken. Normal double checked locking code reads like this:

class singleton {
private:
    singleton(); // private constructor so users must call instance()
    static boost::mutex _init_mutex;

public:
    static singleton & instance()
    {
        static singleton* instance;

        if(!instance)
        {
            boost::mutex::scoped_lock lock(_init_mutex);

            if(!instance)           
                instance = new singleton;
        }

        return *instance;
    }
};

The problem apparently is the line assigning instance -- the compiler is free to allocate the object and then assign the pointer to it, OR to set the pointer to where it will be allocated, then allocate it. The latter case breaks the idiom -- one thread may allocate the memory and assign the pointer but not run the singleton's constructor before it gets put to sleep -- then the second thread will see that the instance isn't null and try to return it, even though it hasn't been constructed yet.

I saw a suggestion to use a thread local boolean and check that instead of instance. Something like this:

class singleton {
private:
    singleton(); // private constructor so users must call instance()
    static boost::mutex _init_mutex;
    static boost::thread_specific_ptr<int> _sync_check;

public:
    static singleton & instance()
    {
        static singleton* instance;

        if(!_sync_check.get())
        {
            boost::mutex::scoped_lock lock(_init_mutex);

            if(!instance)           
                instance = new singleton;

            // Any non-null value would work, we're really just using it as a
            // thread specific bool.
            _sync_check = reinterpret_cast<int*>(1);
        }

        return *instance;
    }
};

This way each thread ends up checking if the instance has been created once, but stops after that, which entails some performance hit but still not nearly so bad as locking every call. But what if we just used a local static bool?:

class singleton {
private:
    singleton(); // private constructor so users must call instance()
    static boost::mutex _init_mutex;

public:
    static singleton & instance()
    {
        static bool sync_check = false;
        static singleton* instance;

        if(!sync_check)
        {
            boost::mutex::scoped_lock lock(_init_mutex);

            if(!instance)           
                instance = new singleton;

            sync_check = true;
        }

        return *instance;
    }
};

Why wouldn't this work? Even if sync_check were to be read by one thread when it's being assigned in another the garbage value will still be nonzero and thus true. This Dr. Dobb's article claims that you have to lock because you'll never win a battle with the compiler over reordering instructions. Which makes me think this must not work for some reason, but I can't figure out why. If the requirements on sequence points are as lose as the Dr. Dobb's article makes me believe, I don't understand why any code after the lock couldn't be reordered to be before the lock. Which would make C++ multithreading broken period.

I guess I could see the compiler being allowed to specifically reorder sync_check to be before the lock because it's a local variable (and even though it's static we're not returning a reference or pointer to it) -- but then this could still be solved by making it a static member (effectively global) instead.

So will this work or won't it? Why?


回答1:


Your fix doesn't fix anything since the writes to sync_check and instance can be done out of order on the CPU. As an example imagine the first two calls to instance happen at approximately the same time on two different CPUs. The first thread will acquire the lock, initialize the pointer and set sync_check to true, in that order, but the processor may change the order of the writes to memory. On the other CPU then it is possible for the second thread to check sync_check, see that it is true, but instance may not yet be written to memory. See Lockless Programming Considerations for Xbox 360 and Microsoft Windows for details.

The thread specific sync_check solution you mention should work then (assuming you initialize your pointer to 0).




回答2:


There's some great reading about this (although it's .net/c# oriented) here: http://msdn.microsoft.com/en-us/magazine/cc163715.aspx

What it boils down to is that you need to be able to tell the CPU that it cannot reorder your reads/writes for this variable access (ever since the original Pentium, the CPU can reorder certain instructions if it thinks that the logic would be unaffected), and that it needs to ensure that the cache is consistent (don't forget about that -- we devs get to pretend that all memory is just one flat resource, but in reality, each CPU core has cache, some unshared (L1), some might be shared sometimes (L2)) -- your initizlization might write to main RAM, but another core might have the uninitialized value in cache. If you don't have any concurrency semantics, the CPU may not know that it's cache is dirty.

I don't know the C++ side, but in .net, you would designate the variable as volatile in order to protect access to it (or you would use the Memory read/write barrier methods in System.Threading).

As an aside, I've read that in .net 2.0, double checked locking is guaranteed to work without "volatile" variables (for any .net readers out there) -- that doesn't help you with your c++ code.

If you want to be safe, you will need to do the c++ equivalent of marking a variable as volatile in c#.




回答3:


"The latter case breaks the idiom -- two threads might end up creating the singleton."

But if I understand the code correctly, the first example, you check if instance already exists (might be executed by multiple threads at the same time), if it doesn't one thread get's to lock it and it creates the instance - only one thread can execute the creation at that time. All other threads get locked out and will wait.

Once the instance is created and the mutex is unlocked the next waiting thread will lock mutex but it will not try to create new instance because the check will fail.

Next time the instance variable is checked it will be set so no threads will try to create new instance.

I'm not sure about the case where one thread is assigning new instance pointer to instance while another thread checks the same variable - but I believe it will be handled correctly in this case.

Am I missing something here?

Ok not sure about the reordering of operations but in this case it would be altering logic so I would not expect it to happen - but I'm no expert on this topic.



来源:https://stackoverflow.com/questions/945232/whats-wrong-with-this-fix-for-double-checked-locking

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!