I'm doing some Linux Kernel timings, specifically in the Interrupt Handling path. I've been using RDTSC for timings, however I recently learned it's not necessarily accur
The following code ensures that rdtscp executes at exactly the right time.
RDTSCP cannot execute too early, but it can execute too late, because the CPU can move instructions that come after rdtscp so that they execute before it.
To prevent this, we create a false dependency chain based on the fact that rdtscp puts its output in edx:eax:
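rdtscp ;rdtscp waits for all earlier instructions, so it will not execute too early.
;now also ensure it does not execute too late
mov r8,rdx ;rdtscp writes rdx and rax, start a dependency chain on rdx
xor r8,rbx ;r8 = rdx xor rbx: saves rbx in place of push rbx, so the save cannot execute OoO before rdtscp
xor rbx,rdx ;rbx = r8
xor rbx,r8 ;rbx = 0
push rdx ;save the timestamp, cpuid overwrites rax, rbx, rcx and rdx
push rax
mov rax,rbx ;rax = 0, but in a way that excludes OoO execution.
cpuid ;serializing instruction, nothing after it can start early
pop rax ;reload the timestamp
pop rdx
mov rbx,r8
xor rbx,rdx ;restore rbx (r8 xor rdx = original rbx)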
Note that even though this timing is accurate to a single cycle, you still need to run your sample many times and take the lowest time of those runs in order to get the actual running time.
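The "run it many times and keep the lowest" step could look roughly like the sketch below. This is a user-space C illustration, not kernel code, and it does not reproduce the cpuid fencing from the assembly above. The names read_ticks, sample_work and min_ticks are made up for this example. It assumes GCC or Clang, where x86intrin.h provides __rdtscp; on other architectures it falls back to clock_gettime so that it still compiles.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime in the non-x86 fallback */

#include <assert.h>
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>           /* GCC/Clang: provides __rdtscp */
#endif

/* Hypothetical helper: read a tick counter (TSC on x86, nanoseconds elsewhere). */
static uint64_t read_ticks(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int aux;            /* receives IA32_TSC_AUX, unused here */
    return __rdtscp(&aux);
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}

/* Hypothetical workload standing in for the code being timed. */
static void sample_work(void)
{
    volatile unsigned int sink = 0;  /* volatile keeps the loop from being optimized away */
    for (unsigned int i = 0; i < 1000; i++)
        sink += i;
}

/* Time fn() `runs` times and return the smallest elapsed tick count.
 * Interrupts, cache misses and migrations only ever add time,
 * so the minimum is the least disturbed run. */
static uint64_t min_ticks(void (*fn)(void), unsigned int runs)
{
    assert(runs > 0);
    uint64_t best = UINT64_MAX;
    for (unsigned int i = 0; i < runs; i++) {
        uint64_t start = read_ticks();
        fn();
        uint64_t elapsed = read_ticks() - start;
        if (elapsed < best)
            best = elapsed;
    }
    return best;
}
```

Calling min_ticks(sample_work, 100000) returns the best-case tick count for sample_work. The result still includes the cost of the two counter reads, so you may want to time an empty function the same way and subtract that.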