clflush not flushing the instruction cache

一个人想着一个人 submitted on 2019-12-03 03:56:06

Your code does almost nothing in func, and the little you do gets inlined into test, and probably optimized out since you never use the return value.

gcc -O3 gives me:

0000000000400620 <test>:
  400620:       53                      push   %rbx
  400621:       0f a2                   cpuid
  400623:       0f 31                   rdtsc
  400625:       48 89 d7                mov    %rdx,%rdi
  400628:       48 89 c6                mov    %rax,%rsi
  40062b:       0f a2                   cpuid
  40062d:       0f 31                   rdtsc
  40062f:       5b                      pop    %rbx
  ...

So you're measuring the time for two moves that are very cheap HW-wise. Your measurement is probably showing the latency of cpuid, which is relatively expensive.
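One way to check how much of the reading is just timer overhead is to time an empty region and keep the smallest value seen. The following is only a sketch under assumptions: it uses GCC/Clang intrinsics from `<x86intrin.h>`, it swaps the `cpuid` serialization from the question for `lfence` (a different, cheaper fencing technique), and the names `read_tsc` and `timing_overhead` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc, _mm_lfence (GCC/Clang, x86 only) */

/* Hypothetical helper: read the TSC with an lfence on each side so that
   surrounding instructions are not executed across the read.
   Note: this uses lfence, not cpuid as in the original code. */
static inline uint64_t read_tsc(void)
{
    _mm_lfence();
    uint64_t t = __rdtsc();
    _mm_lfence();
    return t;
}

/* Hypothetical helper: smallest tick count observed for an EMPTY timed
   region over `runs` attempts. Whatever this returns is cost paid by the
   measurement itself, not by the code under test. */
uint64_t timing_overhead(int runs)
{
    assert(runs > 0);
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < runs; i++) {
        uint64_t t0 = read_tsc();
        uint64_t t1 = read_tsc();
        if (t1 - t0 < best)
            best = t1 - t0;
    }
    return best;
}
```

If the tick count reported for the real measurement is close to what `timing_overhead` returns, the measurement is dominated by the timing instructions rather than by the code between them.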

Worse, your clflush would actually flush test as well. This means you pay the re-fetch penalty when you next access it, but that happens outside the rdtsc pair, so it's not measured. The measured code, on the other hand, follows sequentially, so fetching test would probably also fetch the flushed code you measure, and it could already be cached again by the time you measure it.
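A possible way around both problems is sketched below: keep the function out of line, make sure its result is used, and flush only the line that holds it. This is an illustration under assumptions, not the asker's code: the body of `func`, the 64-byte line size, and the helper names `flush_func` and `time_func_call` are all hypothetical, and `lfence` again stands in for `cpuid`. The alignment only reduces the chance that `func` shares a cache line with its caller; it does not guarantee that a prefetcher leaves it alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc, _mm_clflush, _mm_lfence, _mm_mfence */

/* Hypothetical stand-in for func. noinline keeps it from being merged
   into the caller; aligned(64) starts it on its own (assumed 64-byte)
   cache line. */
__attribute__((noinline, aligned(64)))
int func(int x)
{
    return x * 3 + 1;
}

/* Flush the cache line at the start of func. Assumes func is small
   enough to fit in that one line. */
void flush_func(void)
{
    _mm_clflush((const void *)(uintptr_t)&func);
    _mm_mfence();   /* wait for the flush before continuing */
}

/* Time one call to func and hand back its result so the call cannot be
   discarded as dead code. */
uint64_t time_func_call(int arg, int *result)
{
    assert(result != NULL);
    _mm_lfence();
    uint64_t t0 = __rdtsc();
    _mm_lfence();
    /* Compiler barrier: makes arg look freshly written here, so the
       compiler cannot hoist the call above the first rdtsc. */
    __asm__ volatile("" : "+r"(arg) : : "memory");
    volatile int r = func(arg);   /* volatile store forces the result to be used */
    _mm_lfence();
    uint64_t t1 = __rdtsc();
    *result = r;
    return t1 - t0;
}
```

Calling `flush_func()` and then `time_func_call()` gives a cold reading; calling `time_func_call()` again right after gives a warm one to compare against.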

It works well on my computer:

264 ticks
Function must be cached by now!
258 ticks
Function flushed from cache.
519 ticks
Function must be cached again by now!
240 ticks