With Hyper-Threading, through which cache level (L1/L2/L3) do the threads of one physical core exchange data?

Submitted by 天大地大妈咪最大 on 2019-12-01 20:51:04

The Intel Architecture Software Optimization manual has a brief description of how processor resources are shared between HT threads on a core, in chapter 2.3.9. It is documented for the Nehalem architecture, so it is getting stale, but it is fairly likely to still be relevant for current parts since the partitioning is logically consistent:

  • Duplicated for each HT thread: the registers, the return stack buffer, the large-page ITLB

  • Statically allocated for each HT thread: the load, store and re-order buffers, the small-page ITLB

  • Competitively shared between HT threads: the reservation station, the caches, the fill buffers, DTLB0 and STLB.

Your question matches the third bullet. In the specific case where both HT threads execute code from the same process (somewhat by accident), you can generally expect L1 and L2 to contain data fetched by one HT thread that is useful to the other. Keep in mind that the unit of storage in these caches is the cache line, 64 bytes.

Just in case: this is not otherwise a good reason to pursue a thread-scheduling approach that favors placing two HT threads on the same core, assuming your OS supports that at all. An HT thread generally runs quite a bit slower than a thread that gets the core to itself; 30% is the number usually bandied about, YMMV.
