Is it dangerous to read global variables from separate threads at potentially the same time?

Question

I'm writing a neat little program to teach myself threading, using boost::thread and C++.

I need the main thread to communicate with the worker thread, and to do so I have been using global variables. It is working as expected, but I can't help but feel a bit uneasy.

What if the worker thread tries to write to a global variable at the same time as the main thread is reading the value? Is this bad, dangerous, or hopefully taken into account behind the scenes?

Answer 1:

§1.10 [intro.multithread] (quoting N4140):

6 Two expression evaluations conflict if one of them modifies a memory location (1.7) and the other one accesses or modifies the same memory location.

23 Two actions are potentially concurrent if

they are performed by different threads, or

they are unsequenced, and at least one is performed by a signal handler.

The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other, except for the special case for signal handlers described below. Any such data race results in undefined behavior.

Purely concurrent reads do not conflict, and so are safe.

If at least one of the threads writes to a memory location and another reads from that location, then the accesses conflict and are potentially concurrent. The result is a data race, and hence undefined behavior, unless appropriate synchronization is used: either atomic operations for all reads and writes, or synchronization primitives that establish a happens-before relationship between the read and the write.
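
As a minimal sketch of the "atomic operations for all reads and writes" option, the following uses std::atomic and std::thread (the question uses boost::thread; the standard library is assumed here only to keep the example self-contained). The names g_progress, worker and run_worker_and_observe are hypothetical.

```cpp
#include <atomic>
#include <thread>

// Hypothetical shared global: the worker publishes its progress here.
std::atomic<int> g_progress{0};

// Worker thread: every write to the shared variable is an atomic store.
void worker(int steps) {
    for (int i = 1; i <= steps; ++i) {
        g_progress.store(i);
    }
}

// Main-thread side: starts the worker, polls the value while the worker
// is still writing, and returns the final value after joining.
int run_worker_and_observe(int steps) {
    g_progress.store(0);
    std::thread t(worker, steps);
    int last_seen = 0;
    while (last_seen < steps) {
        // Concurrent atomic load: no data race, even though the worker
        // may be storing at the same moment.
        last_seen = g_progress.load();
        std::this_thread::yield();
    }
    t.join();
    return g_progress.load();
}
```

Because every access goes through the atomic object, the concurrent read and write no longer form a data race. With a plain int in place of std::atomic<int>, the same program would have undefined behavior.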

Answer 2:

If your different threads only read values of global variables, there will be no problem.

If more than one thread tries to update the same variable (for example: read, add 1, write), then you must use a synchronization mechanism to ensure that the value cannot be modified between the read and the write.

If only one thread writes while others read, it depends. If the different variables are unrelated, say the number of apples and oranges in a basket, you do not need any synchronization, provided you accept values that are not exactly accurate. But if the values are related, say the amounts of money in two bank accounts with a transfer between them, you need synchronization to ensure that what you read is coherent. The values could be stale by the time you use them, because they have already been updated, but they will be coherent with each other.
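
The bank account case could look roughly like this. It is only a sketch: the Accounts struct and the functions transfer, snapshot and total_always_consistent are invented for illustration, and std::mutex is assumed as the synchronization mechanism.

```cpp
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical pair of related values: money only moves between a and b,
// so a + b must always equal the starting total of 1000.
struct Accounts {
    std::mutex m;
    long a = 1000;
    long b = 0;
};

// Writer: both balances change under one lock, so no reader can
// observe the state between the two updates.
void transfer(Accounts& acc, long amount) {
    std::lock_guard<std::mutex> lock(acc.m);
    acc.a -= amount;
    acc.b += amount;
}

// Reader: copies both balances under the same lock.
std::pair<long, long> snapshot(Accounts& acc) {
    std::lock_guard<std::mutex> lock(acc.m);
    return {acc.a, acc.b};
}

// Runs transfers in a worker thread while the caller keeps checking
// that every snapshot is coherent.
bool total_always_consistent() {
    Accounts acc;
    std::thread writer([&acc] {
        for (int i = 0; i < 1000; ++i) transfer(acc, 1);
    });
    bool ok = true;
    for (int i = 0; i < 1000; ++i) {
        std::pair<long, long> s = snapshot(acc);
        if (s.first + s.second != 1000) ok = false;
    }
    writer.join();
    return ok;
}
```

A snapshot may already be out of date when it is used, but the two numbers in it always belong to the same moment.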

Answer 3:

The simple answer is yes. Once variables are shared among multiple threads for both reading and writing, you need some kind of protection. There are different flavours to achieve this: semaphores, locks, mutexes, events, critical sections, message queues.

Things can become especially ugly when your globals are references. Suppose you have a global list of objects in a producer/consumer scenario with multiple consumers: the producer instantiates objects, and a consumer takes them, does something with them and finally disposes of them. Without protection of some sort this leads to terrible problems.

There is a lot of specialised literature and there are dedicated college courses on this topic, as well as well-known problems that are given to students: for instance the dining philosophers problem, or how to build a readers-writers semaphore without starvation. An interesting book: The Little Book of Semaphores.

Answer 4:

Yes. No. Maybe.

The formally correct answer is: This is not safe.

The practical answer is not that easy. It's something like "This is safe, kind of, under some conditions".

Reads (any number of them) in absence of concurrent writes are always safe. Reads (even a single one) in presence of concurrent writes (even a single one) are formally never safe, but they are atomic on most processors in most situations, and this can be just good enough. Changing values (like incrementing a counter) is nearly always troublesome, even in practice, without explicitly using atomic operations.

Atomicity

The C++ standard mandates that you use std::atomic or one of its specializations (or higher level synchronization primitives), or you are doomed. Demons will fly out of your nose (no, they won't... but as far as the standard goes, they might as well).

All real, non-theoretical CPUs access memory exclusively via cache lines except in very special conditions which you must explicitly provoke (such as using write-combining instructions). An entire cache line can be read or written to, atomically, at a time -- never anything different. Reading any memory location that is being written to might not give the value that you expect (if it has been updated in the meantime), but it will never return a "garbage" value.
Now of course a variable might cross a cacheline, in which case access isn't atomic, but unless you deliberately provoke it, this will not happen (since integral variables are power-of-two sized such as 2, or 4, or 8, and cache lines are also power-of-two sized and larger such as 64 or 128 -- if your variables are properly aligned to the former as by default, they are automatically also completely contained within the latter. Always.).

Ordering

Although your reads (and writes) may be atomic, and you might say that you only care whether some flag is zero or not so who cares even if a value is garbled, you don't have a guarantee that things happen in the order that you expect!
The "normal" expectation that if you say that A happens before B, then A indeed happens before B and A can be seen by someone else before B is generally not true. In other words, it is perfectly possible that your worker thread prepares some data and then sets the ready flag. Your main thread sees that the ready flag is set, and begins reading some random garbage while the real data is still on its way somewhere in the cache hierarchy. Or maybe half of it is visible to the main thread already, but the other half isn't.

For this, C++11 introduced the concept of memory order. This means no more and no less than that, besides the guarantee of atomicity, you also have a way of requesting a happens-before guarantee.
Most of the time, this only prevents the compiler from moving around loads and stores, but on some architectures, it may cause special instructions to be emitted (that's not your problem, though).
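
The ready-flag scenario above can be sketched with a release store and an acquire load. The names g_payload, g_ready, producer, consume and publish_and_consume are hypothetical, and std::thread is assumed instead of boost::thread.

```cpp
#include <atomic>
#include <thread>

int g_payload = 0;                  // plain, non-atomic data
std::atomic<bool> g_ready{false};   // flag that publishes the data

// Worker: prepares the data, then sets the flag with release ordering.
void producer() {
    g_payload = 42;
    g_ready.store(true, std::memory_order_release);
}

// Reader: once the acquire load sees true, everything the producer wrote
// before its release store is guaranteed to be visible here.
int consume() {
    while (!g_ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return g_payload;
}

// Resets the state, starts the producer and returns what the reader saw.
int publish_and_consume() {
    g_payload = 0;
    g_ready.store(false);
    std::thread t(producer);
    int value = consume();
    t.join();
    return value;
}
```

The release/acquire pair is what creates the happens-before relationship between the write to g_payload and the later read. With a plain bool as the flag there would be no such guarantee, and the program would also contain a data race.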

Read-Modify-Write

This is a particularly nefarious one. A simple thing like ++flag; can be disastrous. This is not at all the same as flag = 1;

Without using proper atomic instructions, this is never safe, as it involves (atomically) reading, then modifying, and then (atomically) writing a cache line.
The problem is, while reading and writing are both atomic, the whole thing isn't. Nor is there any guarantee about ordering.
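
A small sketch of an atomic read-modify-write, assuming std::atomic<int> as the counter type. The function name count_with_atomic and its parameters are made up for the example.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Starts num_threads threads that each increment a shared counter
// increments_per_thread times, then returns the final count.
int count_with_atomic(int num_threads, int increments_per_thread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                // One indivisible read-modify-write; no increment is lost.
                counter.fetch_add(1);
            }
        });
    }
    for (std::thread& th : threads) th.join();
    return counter.load();
}
```

For example, count_with_atomic(4, 10000) returns 40000. With a plain int counter the same loop would be a data race, and in practice it typically loses increments.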

Solution?

Either use std::atomic or block on a condition variable. The former will involve spinning, which may or may not be detrimental (depending on the frequency and latency requirements) while the latter will be CPU conservative.
You could use a mutex to synchronize access to the global variable, too, but if you involve a heavyweight primitive, you might as well go for the condition variable instead of spinning (which will be the "correct" approach).
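
The condition variable option might be sketched as follows. The Mailbox struct and the functions post, wait_for_value and run_mailbox_demo are hypothetical names, and the standard library types are assumed rather than their Boost counterparts.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical one-slot mailbox shared by the worker and the main thread.
struct Mailbox {
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;
    int value = 0;
};

// Worker side: stores the result under the lock, then wakes the waiter.
void post(Mailbox& box, int v) {
    {
        std::lock_guard<std::mutex> lock(box.m);
        box.value = v;
        box.ready = true;
    }
    box.cv.notify_one();
}

// Main side: sleeps until ready is set instead of spinning. The predicate
// guards against spurious wakeups and against a notify that came early.
int wait_for_value(Mailbox& box) {
    std::unique_lock<std::mutex> lock(box.m);
    box.cv.wait(lock, [&box] { return box.ready; });
    return box.value;
}

// Starts a worker that posts 7 and returns what the main thread received.
int run_mailbox_demo() {
    Mailbox box;
    std::thread worker([&box] { post(box, 7); });
    int v = wait_for_value(box);
    worker.join();
    return v;
}
```

The waiting thread uses no CPU time while blocked, at the cost of a mutex and a wakeup through the scheduler.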

Answer 5:

This really depends on a number of factors but is generally a bad idea and can lead to race conditions. You can avoid this by locking the value so that reads and writes are all atomic and thus can't collide.

Answer 6:

You must create a mutex (mutual exclusion object) and use it to control access to the variables. Only one thread at a time can own the mutex. https://msdn.microsoft.com/en-us/library/z3x8b09y.aspx

Answer 7:

This actually points to a race condition between the writer thread and the reader thread. The places where we read or write the global variable are the critical sections of the code. We must synchronize the reader and writer threads whenever they operate in the critical sections, or else we may see undefined behavior in the code.

Your problem is similar to the readers-writers problem, and we must synchronize using semaphores, mutexes or other locking mechanisms to avoid the race condition. Assuming one writer and multiple readers, we may use the following code to avoid undefined behavior:

// Using read and write semaphores
// (pseudocode: wait/signal are the classic semaphore P/V operations)
semaphore rd = 1, wrt = 1;   // both semaphores start unlocked
int readCount = 0;           // number of readers currently reading

// Writer thread

do
{
...
// Critical section starts

wait(wrt);
    globalVariable = someValue;   // Write to the global variable.
signal(wrt);

// Critical section ends
...
} while(1);


// Reader thread

do
{
...
// Critical section 1 starts

wait(rd);
    readCount++;
    if(readCount == 1)   // the first reader locks out the writer
        wait(wrt);
signal(rd);

// Critical section 1 ends

// Do reading

// Critical section 2 starts
wait(rd);
    readCount--;
    if(readCount == 0)   // the last reader lets the writer back in
        signal(wrt);
signal(rd);
// Critical section 2 ends
...
} while(1);
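
In C++17 and later, the same one-writer, many-readers idea can be expressed with std::shared_mutex instead of hand-written semaphores. This is a different primitive from the one used above, shown only as a sketch; g_rw, g_value, write_value, read_value and run_readers_writer_demo are hypothetical names.

```cpp
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

std::shared_mutex g_rw;   // guards g_value
int g_value = 0;          // the shared global

// Writer: exclusive lock, so no reader or other writer is inside.
void write_value(int v) {
    std::unique_lock<std::shared_mutex> lock(g_rw);
    g_value = v;
}

// Reader: shared lock, so any number of readers may hold it at once.
int read_value() {
    std::shared_lock<std::shared_mutex> lock(g_rw);
    return g_value;
}

// One writer counts up to 100 while three readers read concurrently;
// returns the value seen after all threads have finished.
int run_readers_writer_demo() {
    write_value(0);
    std::thread writer([] {
        for (int i = 1; i <= 100; ++i) write_value(i);
    });
    std::vector<std::thread> readers;
    for (int r = 0; r < 3; ++r) {
        readers.emplace_back([] {
            for (int i = 0; i < 100; ++i) (void)read_value();
        });
    }
    writer.join();
    for (std::thread& t : readers) t.join();
    return read_value();
}
```

Whether readers or writers are favoured under contention is left to the implementation, so this sketch makes no promise about starvation.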

Answer 8:

Concurrent writes are not safe. Concurrent read and write are always safe (assuming atomic writes), but you never know whether you've read the value before or after write.

The main thread behaves just the same as spawned threads; there's no difference at all.

So, for concurrent writes you'll need mutexes.

Source: https://stackoverflow.com/questions/28591370/is-it-dangerous-to-read-global-variables-from-separate-threads-at-potentially-th

Tags

c++

boost-thread