I recently stumbled upon this Wikipedia article. From my experience with multi-threading, I am aware of the multitude of issues caused by the program being able to switch threads.
However, I never knew that compiler and hardware optimisations could reorder operations in a way that is guaranteed to work for a single thread, but not necessarily for multi-threading.
Historically, neither C nor C++ had a formally defined memory model (one was only added in C11 and C++11), so compilers were free to reorder operations in ways that could cause issues for multi-threaded code. Compilers designed for use in multi-threaded environments, however, do not reorder in ways that break correctly synchronised code.
Multi-threaded code either writes to memory and uses a fence to make those writes visible to other threads, or it uses atomic operations.
Since the values used in the atomic operation case are observable in a single thread, the reordering does not affect them: they must have been calculated correctly before the atomic operation.
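To make the atomic case concrete, here is a minimal sketch, assuming a C++11 or later compiler. The names `payload`, `ready`, `producer`, `consumer` and `publish_and_read` are made up for illustration. The release store and acquire load are what tell the compiler (and the hardware) not to move the write to `payload` past the point where the flag is published.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only.
int payload = 0;                    // ordinary, non-atomic data
std::atomic<bool> ready{false};     // flag used to publish the payload

void producer() {
    payload = 42;                                   // plain write
    // Release: writes made before this store may not be moved after it.
    ready.store(true, std::memory_order_release);
}

int consumer() {
    // Acquire: reads made after this load may not be moved before it.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    assert(payload == 42);  // cannot fire once the flag has been observed
    return payload;
}

// Runs one producer and one consumer and returns what the consumer saw.
int publish_and_read() {
    payload = 0;
    ready.store(false);
    int seen = 0;
    std::thread t1(producer);
    std::thread t2([&seen] { seen = consumer(); });
    t1.join();
    t2.join();
    return seen;
}
```

Calling `publish_and_read()` from `main` returns 42. Without the atomic flag (for example with a plain `bool`), the program would contain a data race and the compiler would be entitled to reorder the two writes.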
Compilers intended for multi-threaded applications do not reorder across memory fences.
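The fence variant can be sketched in the same way, again assuming C++11 or later and using made-up names (`shared_value`, `published`, `writer`, `reader`, `run_fence_demo`). Here the flag itself is accessed with relaxed ordering, and the explicit `std::atomic_thread_fence` calls are what the compiler must not reorder across.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only.
int shared_value = 0;                 // ordinary, non-atomic data
std::atomic<bool> published{false};   // relaxed flag, ordered by the fences

void writer() {
    shared_value = 7;                                      // plain write
    std::atomic_thread_fence(std::memory_order_release);   // earlier writes stay above the fence
    published.store(true, std::memory_order_relaxed);
}

int reader() {
    while (!published.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    std::atomic_thread_fence(std::memory_order_acquire);   // later reads stay below the fence
    assert(shared_value == 7);  // cannot fire after the acquire fence
    return shared_value;
}

// Runs one writer and one reader and returns what the reader saw.
int run_fence_demo() {
    shared_value = 0;
    published.store(false);
    int seen = 0;
    std::thread t1(writer);
    std::thread t2([&seen] { seen = reader(); });
    t1.join();
    t2.join();
    return seen;
}
```

Calling `run_fence_demo()` returns 7. The flag still has to be atomic; the fences only order the surrounding accesses, they do not make a plain variable safe to share.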
So the reordering either does not affect the behaviour, or is suppressed as a special case.
If you are already writing correct multi-threaded code, the compiler reordering doesn't matter. It's only an issue if the compiler isn't aware of memory fences, in which case you probably shouldn't be using it to write multi-threaded code in the first place.