Why are CAS (Atomic) operations faster than synchronized or volatile operations

Posted by 半城伤御伤魂 on 2019-12-22 08:34:56

Question


From what I understand, the synchronized keyword syncs a thread's local cache with main memory, and the volatile keyword basically forces the variable to be read from main memory on every access. Accessing main memory is of course much more expensive than accessing the local thread cache, so these operations are expensive. However, a CAS operation uses low-level hardware instructions but still has to access main memory. So how is a CAS operation any faster?


Answer 1:


I believe the critical factor is as you state: the CAS mechanisms use low-level hardware instructions that allow for minimal cache flushing and contention resolution.

The other two mechanisms (synchronization and volatile) use different architectural tricks that are more generally available across all different architectures.

CAS instructions are available in one form or another in most modern architectures but there will be a different implementation in each architecture.

Interesting quote from Brian Goetz (supposedly):

The relative speed of the operations is largely a non-issue. What is relevant is the difference in scalability between lock-based and non-blocking algorithms. And if you're running on a 1 or 2 core system, stop thinking about such things.

Non-blocking algorithms generally scale better because they have shorter "critical sections" than lock-based algorithms.
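To make the "shorter critical section" point concrete, here is a minimal sketch that puts a lock-based increment next to a CAS retry loop. The class name `CounterSketch` and its methods are hypothetical names chosen for illustration; only `AtomicInteger.get` and `AtomicInteger.compareAndSet` come from the JDK. In the lock-based version, a thread holding the monitor can be descheduled and stall everyone else; in the CAS version the only "critical" step is the single compare-and-set.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration class, not taken from the original answers.
class CounterSketch {
    private int lockedValue = 0;
    private final AtomicInteger casValue = new AtomicInteger(0);

    // Lock-based: the whole method body runs while holding the monitor.
    synchronized int incrementLocked() {
        return ++lockedValue;
    }

    synchronized int getLocked() {
        return lockedValue;
    }

    // Non-blocking: read, compute, then try to publish with one CAS.
    // If another thread changed the value in between, retry.
    int incrementCas() {
        while (true) {
            int current = casValue.get();
            int next = current + 1;
            if (casValue.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    int getCas() {
        return casValue.get();
    }

    // Runs both kinds of increment from several threads and
    // returns {lockedTotal, casTotal}.
    static int[] runDemo(int threads, int perThread) {
        CounterSketch counter = new CounterSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.incrementLocked();
                    counter.incrementCas();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new int[] { counter.getLocked(), counter.getCas() };
    }
}
```

Both counters end up with the same total; the sketch says nothing about which one is faster on a given machine, which is exactly the point of the quote above.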




Answer 2:


Note that a CAS does not necessarily have to access memory.

Most modern architectures implement a cache coherency protocol like MESI that allows the CPU to do shortcuts if there is only one thread accessing the data at the same time. The overhead compared to traditional, unsynchronized memory access is very low in this case.

However, when many threads concurrently change the same value, the caches are indeed quite worthless and all operations need to access main memory directly. In this case the overhead of synchronizing the different CPU caches and serializing memory access can lead to a significant performance drop (this is also known as cache ping-pong), which can be just as bad as, or even worse than, what you experience with lock-based approaches.
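One rough way to see this contention from Java is to count how often a CAS has to be retried. The following is only a sketch under assumptions: `ContentionSketch` and `run` are hypothetical names, and the number of failed CAS attempts is just a proxy for contention on the shared value, not a measurement of cache traffic.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration class: counts failed CAS attempts on one shared value.
class ContentionSketch {
    // Returns {finalValue, failedCasAttempts}.
    static long[] run(int threads, int perThread) {
        AtomicLong shared = new AtomicLong();
        AtomicLong failures = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                long localFailures = 0;
                for (int j = 0; j < perThread; j++) {
                    while (true) {
                        long seen = shared.get();
                        if (shared.compareAndSet(seen, seen + 1)) {
                            break;
                        }
                        // Another thread updated the value first; try again.
                        localFailures++;
                    }
                }
                failures.addAndGet(localFailures);
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new long[] { shared.get(), failures.get() };
    }
}
```

With a single thread the failure count is zero, because `compareAndSet` only fails when the value actually changed. With several threads hammering the same variable the count typically grows, although the exact number depends on the machine and the scheduler.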

So never simply assume that all your problems go away if you switch to atomics. The big advantage of atomics is the progress guarantees they enable for lock-free (someone always makes progress) or wait-free (everyone finishes after a certain number of steps) implementations. However, this is often orthogonal to raw performance: a wait-free solution is likely to be significantly slower than a lock-based solution, but in some situations you are willing to accept that in order to get the progress guarantees.
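As an illustration of the lock-free guarantee, here is a minimal sketch of a CAS-based stack (the classic Treiber-style design). The class name `LockFreeStackSketch` is hypothetical; `AtomicReference.compareAndSet` is the only JDK call it relies on. A thread's CAS fails only when some other thread's CAS on `head` succeeded, so at least one thread always makes progress. It is not wait-free: an unlucky thread could in principle keep retrying.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration class: a lock-free (not wait-free) stack.
class LockFreeStackSketch<T> {
    private static final class Node<T> {
        final T value;
        final Node<T> next;

        Node(T value, Node<T> next) {
            this.value = value;
            this.next = next;
        }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    void push(T value) {
        while (true) {
            Node<T> oldHead = head.get();
            Node<T> newHead = new Node<>(value, oldHead);
            // Fails only if another thread changed head in the meantime.
            if (head.compareAndSet(oldHead, newHead)) {
                return;
            }
        }
    }

    // Returns the most recently pushed value, or null if the stack is empty.
    T pop() {
        while (true) {
            Node<T> oldHead = head.get();
            if (oldHead == null) {
                return null;
            }
            if (head.compareAndSet(oldHead, oldHead.next)) {
                return oldHead.value;
            }
        }
    }
}
```

No thread ever holds a lock here, so a thread that is paused in the middle of `push` or `pop` cannot block the others; it simply loses its CAS and retries when it resumes.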



Source: https://stackoverflow.com/questions/19623026/why-are-cas-atomic-operations-faster-than-synchronized-or-volatile-operations
