#include static inline unsigned long long tick() { unsigned long long d; __asm__ __volatile__ ("rdtsc" : "=A" (d) ); ret
Just an idea: maybe the two rdtsc instructions are being executed on different cores? If the thread migrates between the two reads, the values come from two different counters, and TSC values can vary slightly across cores.
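One way to test this idea is to pin the thread to a single core before taking the two readings. Below is a minimal sketch, assuming Linux with glibc and GCC or Clang on x86. The helper names `read_tsc` and `pin_to_current_cpu` are made up for illustration and are not from the original code.

```c
#define _GNU_SOURCE            /* needed for sched_getcpu() and the CPU_* macros */
#include <sched.h>
#include <stdint.h>

/* Read the time-stamp counter. RDTSC returns the low 32 bits in EAX and
   the high 32 bits in EDX, so the two halves are combined by hand. This
   form works in both 32-bit and 64-bit builds. */
static inline uint64_t read_tsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: restrict the calling thread to the CPU it is
   currently running on, so later RDTSC reads all hit the same counter.
   Returns 0 on success, -1 on failure. */
static int pin_to_current_cpu(void)
{
    int cpu = sched_getcpu();
    if (cpu < 0)
        return -1;

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* pid 0 means "the calling thread" */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Call `pin_to_current_cpu()` once at startup, then take both `read_tsc()` readings as before. If the odd values disappear, core migration was the likely cause; if they persist, the problem is somewhere else.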