Invalidating the CPU's cache

血红的双手。 提交于 2019-12-04 22:47:27

问题


When my program performs a load operation with acquire semantics/store operation with release semantics or perhaps a full-fence, it invalidates the CPU's cache.
My question is this: which part of the cache is actually invalidated? only the cache-line that held the variable that I've used acquire/release? or perhaps the entire cache is invalidated? (L1 + L2 + L3 .. and so on?). Is there a difference in this subject when I use acquire/release semantics, or when i use a full-fence?


回答1:


I'm not an expert on this, but I stumbled on this document, maybe it's helpful http://www.rdrop.com/users/paulmck/scalability/paper/whymb.2009.04.05a.pdf




回答2:


When you perform a load without fences or mutexes, then the loaded value could potentially come from anywhere, i.e, caches, registers (by way of compiler optimizations), or RAM... but from your question, you already knew this.

In most mutex implementations, when you acquire a mutex, a fence is always applied, either explicitly (e.g., mfence, barrier, etc.) or implicitly (e.g., lock prefix to lock the bus on x86). This causes the cache-lines of all caches on the path to be invalidated.

Note that the entire cache isn't invalidated, just the respective cache-lines for the memory location. This also includes the lines for the mutex (which is usually implemented as a value in memory).

Of course, there are architecture-specific details, but this is how it works in general.

Also note that this isn't the only reason for invalidating caches, as there may be operations on one CPU that would need caches on another one to be invalidated. Doing a google search for "cache coherence protocols" will provide you with a lot of information on this subject.



来源:https://stackoverflow.com/questions/2213428/invalidating-the-cpus-cache

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!