Flush iCache in x86

Posted by 让人想犯罪 __ on 2021-01-29 04:59:15

Question


Is there any way to flush the instruction cache (iCache) on the x86 architecture? Something like WBINVD, which invalidates and flushes all the cache lines in the data cache.


Answer 1:


According to the docs, wbinvd flushes and invalidates all caches, not just data and unified caches. (I'm not sure if that includes TLBs if you ran it with paging enabled.)


What are you trying to test? L1i miss / L2 hit for code-fetch? I don't think it's possible to purposely flush just the I-cache without also flushing all levels of cache.

You could create conflict misses for a specific line by executing code at 8 addresses that alias it, assuming an 8-way 32 KiB L1i cache. But cache replacement is usually pseudo-LRU, not true LRU, so you might want to jump through a set of more than 8 aliasing lines a couple of times to make sure.
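
As a rough illustration of which addresses alias, here is a minimal sketch that assumes the geometry mentioned above (32 KiB, 8-way) plus 64-byte lines, which gives 64 sets and an aliasing stride of 4096 bytes. The names `l1i_set_index` and `l1i_aliases` are made up for this example, and the real geometry should be checked for the CPU in question. With these numbers the index bits sit inside the 4 KiB page offset, so virtual addresses can be used directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed L1i geometry: 32 KiB, 8-way, 64-byte lines. Check your CPU. */
enum {
    L1I_SIZE     = 32 * 1024,
    L1I_WAYS     = 8,
    LINE_SIZE    = 64,
    L1I_SETS     = L1I_SIZE / (L1I_WAYS * LINE_SIZE), /* 64 sets */
    ALIAS_STRIDE = L1I_SETS * LINE_SIZE               /* 4096 bytes */
};

/* Set selected by an address: address bits [11:6] with this geometry. */
static unsigned l1i_set_index(uintptr_t addr)
{
    return (unsigned)((addr / LINE_SIZE) % L1I_SETS);
}

/* Fill out[0..n-1] with addresses that map to the same set as target.
 * Ask for more than L1I_WAYS of them, since replacement is usually
 * pseudo-LRU rather than true LRU. */
static void l1i_aliases(uintptr_t target, uintptr_t *out, size_t n)
{
    assert(out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = target + (uintptr_t)(i + 1) * ALIAS_STRIDE;
}
```

This only computes the addresses. To actually evict the target line, executable code (even a single `ret`) has to be placed at each of those addresses and called, ideally looping over the whole set a couple of times.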


clflush / clflushopt should do the trick for a specific cache line. They're required to flush the line from all levels of cache in all cores.

I assume they would also evict decoded uops from the (virtually addressed) uop cache.

Intel's manual entry for CLFLUSH says: "The CLFLUSH instruction can be used at all privilege levels and is subject to all permission checking and faults associated with a byte load (and in addition, a CLFLUSH instruction is allowed to flush a linear address in an execute-only segment). Like a load, the CLFLUSH instruction sets the A bit but not the D bit in the page tables."
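
For reference, flushing a range of bytes line by line could look like the sketch below. It uses the `_mm_clflush` and `_mm_mfence` intrinsics from `<emmintrin.h>`; the helper name `flush_range` and the 64-byte line size are assumptions for this example (CPUID reports the real line size), and it only builds for x86 targets.

```c
#include <assert.h>
#include <emmintrin.h>  /* _mm_clflush, _mm_mfence (SSE2) */
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumed line size, not queried from CPUID */

/* Flush every cache line overlapping [start, start + len).
 * Returns the number of lines flushed. */
static size_t flush_range(const void *start, size_t len)
{
    assert(start != NULL);  /* clflush faults like a byte load would */
    if (len == 0)
        return 0;

    /* Round down to the start of the first line. */
    uintptr_t p   = (uintptr_t)start & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)start + len;
    size_t lines  = 0;

    for (; p < end; p += CACHE_LINE) {
        _mm_clflush((const void *)p);
        lines++;
    }
    _mm_mfence();           /* order the flushes before later accesses */
    return lines;
}
```

To evict code rather than data, pass the address of the instructions in question (in GNU C a function pointer can be cast to `void *` for this), keeping in mind that the line is then gone from every cache level, not just L1i.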


But if you want this for correctness after JIT-compiling something, merely jumping or calling to the newly written instructions is sufficient to avoid stale instruction fetch.

(In fact, current x86 implementations snoop stores to any code address in the pipeline, so you'll never see stale instruction fetch, even when you have the same physical page mapped at two different virtual addresses and write through one while executing from the other. See the related question "Observing stale instruction fetching on x86 with self-modifying code".)


You only need to worry about your compiler optimizing away "dead stores" to a buffer that you cast to a function pointer. In GNU C / C++, use __builtin___clear_cache on the range of bytes you wrote. It compiles to zero instructions on x86 (unlike on ARM or other ISAs with non-coherent instruction caches), but it is still needed to stop the compiler from optimizing away the stores of instruction bytes. See the related question "How does __builtin___clear_cache work?".
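
A minimal sketch of that JIT pattern, assuming Linux on x86 with GCC or Clang: write a few instruction bytes into a buffer, call `__builtin___clear_cache` on them, then call the buffer as a function. The function name `run_jit_demo`, the one-page buffer, and the choice to write first and `mprotect` to executable afterwards are all choices made for this example, not something the answer prescribes.

```c
#define _DEFAULT_SOURCE     /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Returns the value produced by the generated code, or -1 on failure
 * (including when built for a non-x86 target). */
static int run_jit_demo(void)
{
#if defined(__x86_64__) || defined(__i386__)
    /* Machine code for: mov eax, 42 ; ret */
    static const unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };
    const size_t len = 4096;            /* one page, assumed size */
    assert(sizeof code <= len);

    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    memcpy(buf, code, sizeof code);     /* "JIT-compile" */

    /* Zero instructions on x86, but tells the compiler these stores
     * are not dead even though buf is never read as data. */
    __builtin___clear_cache((char *)buf, (char *)buf + sizeof code);

    if (mprotect(buf, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, len);
        return -1;
    }

    int (*fn)(void) = (int (*)(void))buf;  /* GNU C allows this cast */
    int result = fn();                     /* the call is all x86 needs */

    munmap(buf, len);
    return result;
#else
    return -1;
#endif
}
```

On a system that permits executable anonymous mappings, `run_jit_demo()` should return 42. Hardened configurations may refuse the `mprotect` call, in which case the sketch returns -1.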



Source: https://stackoverflow.com/questions/7854283/flush-icache-in-x86
