CPU measurements (cache misses/hits) that do not make sense
Question: I use Intel PCM for fine-grained CPU measurements. In my code, I am trying to measure cache efficiency. Basically, I first load a small array into the L1 cache (by traversing it many times), then I start the timer, traverse the array one more time (which should hit the cache), and then stop the timer. PCM shows a rather high L2 and L3 miss ratio. I also checked with rdtscp, and the cost per array operation is 15 cycles (which is much higher than 4-5 cycles for