How to use std::atomic efficiently

别跟我提以往 2020-12-07 00:26

std::atomic is a new feature introduced by C++11, but I can't find many tutorials on how to use it correctly. So are the following practices common and efficient?

One p

3 answers
  •  死守一世寂寞
    2020-12-07 00:54

    Your reinterpret_cast<std::atomic<T>*>(...) is most definitely not the correct way to retrieve an atomic and is not even guaranteed to work. This is because std::atomic<T> is not guaranteed to have the same size as T.
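To see what a particular implementation does, one can compare the size and alignment directly. The following is only a sketch; `atomic_layout_matches` and `print_layout_report` are hypothetical helper names, not something from the question. A matching layout still does not make the cast well-defined.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical helper: true when std::atomic<T> happens to have the same
// size and alignment as T on this implementation. The standard does not
// promise this for any T.
template <typename T>
constexpr bool atomic_layout_matches() {
    return sizeof(std::atomic<T>) == sizeof(T) &&
           alignof(std::atomic<T>) == alignof(T);
}

// Prints 1 (layout matches) or 0 (it does not) for a few common types.
void print_layout_report() {
    std::printf("uint8_t:  %d\n", atomic_layout_matches<std::uint8_t>());
    std::printf("uint32_t: %d\n", atomic_layout_matches<std::uint32_t>());
    std::printf("uint64_t: %d\n", atomic_layout_matches<std::uint64_t>());
}
```

The result is implementation-specific, so treat it as a diagnostic rather than a license to cast.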

    To your second question about CAS being slower for bytes than for machine words: that's really machine-dependent. It might be faster, it might be slower, or byte-sized CAS might not exist at all on your target architecture. In the latter case the implementation will most likely either need a locking implementation for the atomic or use a different (bigger) type internally (which is one example of atomics not having the same size as the underlying type).
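Whether the implementation falls back to a lock can be queried at run time with `is_lock_free()`. This is a minimal sketch; `cas_is_lock_free` is a hypothetical helper name, and it reports only lock-freedom, not speed.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when operations on std::atomic<T>, including
// compare-exchange, are implemented without a lock on this target.
template <typename T>
bool cas_is_lock_free() {
    std::atomic<T> probe(T{});
    return probe.is_lock_free();
}

// Example queries:
//   cas_is_lock_free<std::uint8_t>()    byte-sized atomic
//   cas_is_lock_free<std::uintptr_t>()  word-sized atomic
```

The actual cost of a byte-sized versus word-sized CAS still has to be measured on the target hardware.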

    As far as I can see, there is really no way to get an std::atomic for an existing value, particularly since the two aren't guaranteed to be the same size. Therefore you really should make buf an std::atomic<T>* directly. Furthermore, I'm fairly sure that even if such a cast worked, non-atomic access to the same address wouldn't be guaranteed to work as expected (since that access isn't guaranteed to be atomic, even for bytes). So having a non-atomic way to access a memory location you want to perform atomic operations on doesn't really make sense.
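A rough sketch of that suggestion, assuming the buffer holds bytes and can be declared as atomics from the start. The element type `std::uint8_t`, the use of `std::vector`, and the helper name `atomic_update` are assumptions for illustration, not taken from the question.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: atomically replaces buf[index] with f(old value)
// using a CAS loop, and returns the value that was stored.
template <typename F>
std::uint8_t atomic_update(std::vector<std::atomic<std::uint8_t>>& buf,
                           std::size_t index, F f) {
    std::uint8_t old_value = buf[index].load();
    std::uint8_t new_value;
    do {
        new_value = f(old_value);
        // On failure, compare_exchange_weak writes the current value of
        // buf[index] into old_value, so the next iteration recomputes.
    } while (!buf[index].compare_exchange_weak(old_value, new_value));
    return new_value;
}
```

Used like this, every access to the buffer goes through the atomic interface, so there is no non-atomic alias to the same bytes:

```cpp
std::vector<std::atomic<std::uint8_t>> buf(16);
for (auto& b : buf) b.store(0);
atomic_update(buf, 3, [](std::uint8_t v) {
    return static_cast<std::uint8_t>(v + 1);
});
```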

    Note that on common architectures, byte stores and loads are atomic anyway, so using atomics there costs little to nothing, as long as you use relaxed memory order for those operations. So if you don't really care about ordering at some point (e.g. because the program isn't multithreaded yet), simply use a.store(0, std::memory_order_relaxed) instead of a.store(0).
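The two forms side by side, as a small sketch. The variable `a` follows the text above; its type `std::uint8_t` and the function names are assumptions for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

std::atomic<std::uint8_t> a{0};

// Default ordering: std::memory_order_seq_cst, the strongest and
// potentially the most expensive.
void reset_seq_cst() { a.store(0); }

// Relaxed ordering: the store itself is still atomic, but it imposes no
// ordering on surrounding memory operations.
void reset_relaxed() { a.store(0, std::memory_order_relaxed); }

// Relaxed load, the counterpart of the relaxed store.
std::uint8_t read_relaxed() { return a.load(std::memory_order_relaxed); }
```

Relaxed ordering is only appropriate when no other thread relies on this store to publish other data.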

    Of course, if you are only talking about x86, your reinterpret_cast is likely to work, but your performance question is probably still processor-dependent (I think; I haven't looked up the actual instruction timings for cmpxchg).
