I want to write software that can profile the CPU caches (L2, L3, possibly L1) and main memory in order to analyze performance.
Am I right in thinking this is un
You might want to look at Intel's PMU (Performance Monitoring Unit). Some processors have one. It is a set of special-purpose registers (Intel calls them Model Specific Registers, or MSRs) that you can program to count events such as cache misses, using the RDMSR and WRMSR instructions. Both instructions are privileged, so they have to be executed in kernel mode.
Here is a document about Performance Analysis on i7 and Xeon 5500.
You might also want to check out Intel's Performance Counter Monitor, a set of routines that abstract the PMU. You can call them from a C++ application to measure several performance metrics live, including cache misses. It also comes with some GUI and command-line tools for standalone use.
The Linux kernel also has a facility for manipulating MSRs from user space: the msr driver exposes each CPU's registers as a device file, /dev/cpu/<N>/msr.
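To make the Linux route concrete, here is a minimal sketch of reading one MSR through the msr driver's device file. It assumes an x86 Linux machine with the msr module loaded (modprobe msr) and root privileges. The helper name read_msr and its path parameter are my own, not part of any library; the path override only exists so the function can be tried against an ordinary file.

```python
import os
import struct


def read_msr(address, cpu=0, path=None):
    """Return the 64-bit value of the MSR at `address` on logical CPU `cpu`.

    Hypothetical helper: by default it reads /dev/cpu/<cpu>/msr, which
    needs the Linux msr module and root. `path` overrides the device file.
    """
    if path is None:
        path = f"/dev/cpu/{cpu}/msr"
    fd = os.open(path, os.O_RDONLY)
    try:
        # The msr device uses the file offset as the register address,
        # and each read transfers one 8-byte register.
        data = os.pread(fd, 8, address)
    finally:
        os.close(fd)
    if len(data) != 8:
        raise OSError(f"short read from {path} at offset {address:#x}")
    # x86 is little-endian, so decode as an unsigned 64-bit LE integer.
    return struct.unpack("<Q", data)[0]


# Example (as root): 0x10 is IA32_TIME_STAMP_COUNTER, a harmless MSR to read.
#   print(hex(read_msr(0x10, cpu=0)))
```

Counting cache misses this way means writing an event selector to one of the IA32_PERFEVTSELx registers and then reading the matching counter, and the event encodings differ between processor models, which is why the higher-level tools are usually the easier starting point.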
There are other utilities/APIs that also use the PMU: perf, PAPI.
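If you go the perf route, a rough sketch of driving it from a script looks like this. It assumes perf is installed and that you are allowed to use hardware counters (see kernel.perf_event_paranoid). It relies on perf stat's -x option, which prints machine-readable lines to stderr with the counter value in the first field and the event name in the third. The function names are my own, and the parser deliberately ignores anything that does not look like an integer count.

```python
import subprocess


def parse_perf_stat_csv(text, sep=","):
    """Turn `perf stat -x SEP` output into a dict {event_name: count}."""
    counts = {}
    for line in text.splitlines():
        fields = line.strip().split(sep)
        if len(fields) < 3:
            continue  # blank lines, comments, or the program's own stderr
        value, event = fields[0], fields[2]
        if not value.isdigit():
            continue  # e.g. "<not counted>" or "<not supported>"
        counts[event] = int(value)
    return counts


def count_events(cmd, events=("cache-references", "cache-misses")):
    """Run `cmd` (a list of arguments) under perf stat and return the counts.

    Event names may come back with a modifier suffix (for example ":u")
    when perf is restricted to user-space counting.
    """
    result = subprocess.run(
        ["perf", "stat", "-x", ",", "-e", ",".join(events), "--", *cmd],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,  # perf stat reports on stderr
        text=True,
        check=True,
    )
    return parse_perf_stat_csv(result.stderr)


# Example: count_events(["ls", "-R", "/usr"])
```

Shelling out to perf keeps the sketch short; for counting around a specific region of your own code, the perf_event_open system call or PAPI is the more direct fit.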