Does a memory barrier ensure that the cache coherence has been completed?

眼角桃花 2020-11-28 04:01

Say I have two threads that manipulate the global variable x. Each thread (or each core I suppose) will have a cached copy of x.

Now say th

4 answers

情话喂你 (original poster)

2020-11-28 04:52

No, a memory barrier does not ensure that cache coherence has been "completed". It often involves no coherence operation at all and can be performed speculatively or as a no-op.

A barrier only enforces the ordering semantics it specifies. For example, an implementation might just put a marker in the store queue such that store-to-load forwarding doesn't occur for stores older than the marker.

Intel, in particular, already has a strong memory model for normal loads and stores (the kind that compilers generate and that you'd use in assembly) where the only possible re-ordering is later loads passing earlier stores. In the terminology of SPARC memory barriers, every barrier other than StoreLoad is already a no-op.
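To make the StoreLoad case concrete, here is a minimal sketch of the classic "store buffer" litmus test in C++. The function name `count_both_zero` and its parameters are made up for illustration. With `std::memory_order_seq_cst` the language guarantees that the two threads can never both read 0; with `std::memory_order_relaxed` that outcome is allowed, and on x86 it can actually show up because a later load may pass an earlier store.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: runs the store-buffer litmus test `iterations` times
// and returns how many runs ended with both threads reading 0.
// `order` is applied to every store and load (use relaxed or seq_cst).
int count_both_zero(int iterations, std::memory_order order) {
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;

        std::thread t1([&] {
            x.store(1, order);    // earlier store
            r1 = y.load(order);   // later load: may pass the store if relaxed
        });
        std::thread t2([&] {
            y.store(1, order);
            r2 = x.load(order);
        });
        t1.join();
        t2.join();

        // Under seq_cst some store is first in the single total order,
        // so the other thread's load must observe it.
        if (r1 == 0 && r2 == 0) ++both_zero;
    }
    return both_zero;
}
```

Calling `count_both_zero(2000, std::memory_order_seq_cst)` must return 0. On x86 a compiler typically gets that guarantee by emitting an `xchg` (implicitly locked) or a `mov` followed by `mfence` for the store, which is exactly the StoreLoad barrier discussed above.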

In practice, the interesting barriers on x86 are attached to LOCKed instructions, and the execution of such an instruction doesn't necessarily involve any cache coherence traffic at all. If the line is already in an exclusive state, the CPU may simply execute the instruction, making sure not to release the exclusive state of the line while the operation is in progress (i.e., between the read of the argument and the writeback of the result). After that, it only has to prevent store-to-load forwarding from breaking the total ordering that LOCKed instructions come with. Current processors do that by draining the store queue, but in future processors even that could be speculative.
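As a rough illustration of a barrier attached to an operation, the sketch below increments a shared counter with an atomic read-modify-write. The name `count_with_rmw` and its parameters are assumptions for this example. On x86, compilers typically turn `fetch_add` into a LOCK-prefixed instruction such as `lock xadd` or `lock add`, so each increment is the "barrier+op" described above.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: `num_threads` workers each add 1 to a shared counter
// `increments_per_thread` times, then the final value is returned.
long count_with_rmw(int num_threads, int increments_per_thread) {
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;

    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                // Atomic read-modify-write; defaults to seq_cst ordering.
                counter.fetch_add(1);
            }
        });
    }
    for (auto& w : workers) w.join();

    return counter.load();
}
```

No increment is lost because the read and the write of each `fetch_add` are indivisible; the code never asks for the result to be "pushed" to other cores, it only relies on atomicity and ordering.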

What a memory barrier or barrier+op does is ensure that the operation is seen by other agents in a relative order that obeys all the restrictions of the barrier. That usually doesn't involve pushing the result to other CPUs as a coherence operation, as your question implies.
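The "relative order" point can be sketched with a small message-passing example. The function name `receive_after_publish` and the variables are hypothetical. The release store and the acquire load do not force any data out to the other core; they only guarantee that if the consumer sees `ready == true`, it also sees the earlier write to `payload`.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: a producer publishes a value, a consumer waits for
// the flag and returns what it observed in `payload`.
int receive_after_publish() {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};   // flag that carries the ordering
    int seen = -1;

    std::thread consumer([&] {
        // Acquire: reads after this load cannot be ordered before it.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;
    });
    std::thread producer([&] {
        payload = 42;
        // Release: writes before this store cannot be ordered after it.
        ready.store(true, std::memory_order_release);
    });

    producer.join();
    consumer.join();
    return seen;
}
```

On x86 both the release store and the acquire load compile to ordinary `mov` instructions, since the hardware already provides that ordering; the annotations mainly constrain the compiler.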
