Running perf stat ls shows this:
Performance counter stats for 'ls':
1.388670 task-clock # 0.067 CPUs utilized
I just found this in the mailing-list thread "Re: perf, x86: Add parts of the remaining haswell PMU functionality":
> AFAICS backend stall cycles are documented to work on Ivy Bridge.
I'm not aware of any documentation that presents these events
as accurate frontend/backend stalls without using the full
TopDown methodology (Optimization manual B.3.2)
So, if I understand correctly, the stalled-cycles-backend counter is too unreliable on Ivy Bridge, which is why the kernel developers decided not to support it.
And sure enough, Linux's perf_event_intel.c supports PERF_COUNT_HW_STALLED_CYCLES_BACKEND for Nehalem, Xeon E7 and Sandy Bridge, but not for Ivy Bridge. PERF_COUNT_HW_STALLED_CYCLES_FRONTEND is supported for Ivy Bridge, though.
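To check which events a given machine refuses, one option is to request them explicitly (for example with perf stat -e stalled-cycles-frontend,stalled-cycles-backend ls) and look for the <not supported> marker that perf stat prints in the count column. The following is a minimal sketch of a parser for that human-readable output. The function name unsupported_events is made up for illustration, and the code assumes the default (non-CSV) output format, which perf writes to stderr.

```python
import re

# Matches lines such as:
#    <not supported>      stalled-cycles-backend
# (default human-readable perf stat output; CSV mode via -x is not handled)
_NOT_SUPPORTED = re.compile(r"^\s*<not supported>\s+(\S+)")


def unsupported_events(perf_stat_output):
    """Return the names of events that perf stat marked as <not supported>."""
    events = []
    for line in perf_stat_output.splitlines():
        match = _NOT_SUPPORTED.match(line)
        if match:
            events.append(match.group(1))
    return events
```

The captured stderr of a perf stat run can be passed to this function directly; an empty list means perf did not flag any of the requested events.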
So I guess there is no way to get this counter on my current CPU: either switch CPUs or use the full top-down methodology mentioned in the mail (and described here and here).
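For reference, here is a rough sketch of the level-1 top-down breakdown as I understand it from the optimization manual: the backend-bound share is not measured directly but derived as whatever remains after the frontend-bound, bad-speculation and retiring shares. The function and parameter names are my own; the inputs are assumed to be raw counts of CPU_CLK_UNHALTED.THREAD, IDQ_UOPS_NOT_DELIVERED.CORE, UOPS_ISSUED.ANY, UOPS_RETIRED.RETIRE_SLOTS and INT_MISC.RECOVERY_CYCLES collected in the same run, and the pipeline width of 4 is an assumption for this CPU generation.

```python
def topdown_level1(cycles, uops_not_delivered, uops_issued,
                   uops_retired_slots, recovery_cycles, width=4):
    """Split pipeline slots into the four level-1 top-down categories.

    All arguments are raw event counts from the same measurement;
    `width` is the assumed number of issue slots per cycle.
    """
    slots = width * cycles
    if slots <= 0:
        raise ValueError("cycles and width must be positive")

    frontend_bound = uops_not_delivered / slots
    bad_speculation = (uops_issued - uops_retired_slots
                       + width * recovery_cycles) / slots
    retiring = uops_retired_slots / slots
    # Backend bound is the remainder, not a directly counted event.
    backend_bound = 1.0 - (frontend_bound + bad_speculation + retiring)

    return {
        "frontend_bound": frontend_bound,
        "bad_speculation": bad_speculation,
        "retiring": retiring,
        "backend_bound": backend_bound,
    }
```

The four fractions sum to 1 by construction, so any measurement noise in the three counted categories ends up in the backend-bound figure, which is worth keeping in mind for short runs like ls.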