Modern CPUs have quite a lot of performance counters - http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-system-pr
What about perf? Running perf list hw cache shows 33 different events, and the perf-list man page explains how to specify raw performance counter descriptors for events that perf does not name.
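
As a rough illustration of the raw descriptors mentioned above: on x86, the perf-list man page describes the rNNN form, where NNN is a hex value with the event select in the low byte and the unit mask in the next byte. A minimal sketch of that packing follows. The helper name raw_event is hypothetical, and the sketch ignores the other register fields (cmask, inv, edge), so treat it as an assumption-laden example and not a complete encoder.

```python
def raw_event(event_select: int, umask: int) -> str:
    """Build an rNNN descriptor for perf (hypothetical helper).

    Assumes the simple x86 layout: umask in bits 8-15,
    event select in bits 0-7. Other fields are not handled.
    """
    if not (0 <= event_select <= 0xFF and 0 <= umask <= 0xFF):
        raise ValueError("event_select and umask must each fit in one byte")
    return f"r{(umask << 8) | event_select:x}"


# Example values taken from the perf-list man page:
# LSD.UOPS has event number 0xA8 and umask 0x01.
print(raw_event(0xA8, 0x01))  # r1a8
```

The resulting string can be passed to perf as an event, for example perf stat -e r1a8 -a sleep 1, which is the example the man page gives. The event number and umask for a given CPU have to come from the vendor's documentation.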
perf list hw cache