Modern CPUs expose quite a lot of hardware performance counters; Intel documents them here: http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-system-pr
I think there is a library available for reading them, called perfmon2: http://perfmon2.sourceforge.net/. Documentation is available at http://www.hpl.hp.com/research/linux/perfmon/perfmon.php4 and http://www.hpl.hp.com/techreports/2004/HPL-2004-200R1.html. I have only recently started digging into this library, so I will post example code as soon as I figure it out.
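Until I have a perfmon2 example, here is a minimal sketch of reading one counter. Note that it does not use perfmon2's own API: it assumes Linux and goes through the kernel's `perf_event_open` system call directly. The helper names `count_instructions` and `busy_sum` are made up for illustration, and the call may well fail inside a VM or container, or when `/proc/sys/kernel/perf_event_paranoid` is set restrictively.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* glibc has no wrapper for perf_event_open, so invoke it via syscall(2). */
static long perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
                            int group_fd, unsigned long flags)
{
    return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Hypothetical workload: sums 0..n-1; volatile keeps the loop from being
 * optimized away. */
static uint64_t busy_sum(uint64_t n)
{
    volatile uint64_t sum = 0;
    for (uint64_t i = 0; i < n; i++)
        sum += i;
    return sum;
}

/* Hypothetical helper: counts user-space instructions retired by the calling
 * thread while busy_sum(n) runs. Returns 0 and stores the count in *count on
 * success, or -1 if the counter could not be opened or read. */
static int count_instructions(uint64_t n, uint64_t *count)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_INSTRUCTIONS;
    attr.disabled = 1;        /* start stopped; enabled explicitly below */
    attr.exclude_kernel = 1;  /* count user-space only */
    attr.exclude_hv = 1;      /* ignore hypervisor activity */

    /* pid = 0, cpu = -1: measure the calling thread on whichever CPU it runs. */
    int fd = (int)perf_event_open(&attr, 0, -1, -1, 0);
    if (fd == -1)
        return -1;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    busy_sum(n);
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    /* With the default read_format, read() yields a single 64-bit value. */
    ssize_t got = read(fd, count, sizeof(*count));
    close(fd);
    return got == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

To try it, call `count_instructions(1000000, &count)` from `main` with a `uint64_t count` and print the value when the return code is 0. Swapping `PERF_COUNT_HW_INSTRUCTIONS` for another `PERF_COUNT_HW_*` constant such as `PERF_COUNT_HW_CPU_CYCLES` or `PERF_COUNT_HW_CACHE_MISSES` selects a different counter.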