“cpuid” before “rdtsc”

故里飘歌 2020-12-16 12:39

Sometimes I come across code that reads the TSC with the rdtsc instruction but calls cpuid right before it.

Why is calling cpuid necessary?

3 answers
  •  谎友^ (original poster)
    2020-12-16 13:11

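    RDTSC is not a serializing instruction, so the CPU may execute it out of order relative to the surrounding code. CPUID is serializing: placed right before RDTSC, it makes sure all earlier instructions have completed before the timestamp is read.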

    These days you can safely use LFENCE instead. On Intel CPUs it is documented as serializing the instruction stream (though it does not wait for earlier stores to reach memory), and the same now holds on AMD after their microcode update for Spectre.

    https://hadibrais.wordpress.com/2018/05/14/the-significance-of-the-x86-lfence-instruction/ explains more about LFENCE.

    See also https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ia-32-ia-64-benchmark-code-execution-paper.pdf for a way to use RDTSCP that keeps CPUID (or LFENCE) out of the timed region:

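    LFENCE          ; (or CPUID) don't start the timed region until everything above has executed
    RDTSC           ; EDX:EAX = timestamp
    mov  ebx, eax   ; save the low 32 bits of the start time
    
    ; ... code under test (must preserve EBX) ...
    
    RDTSCP          ; EDX:EAX = timestamp, ECX = IA32_TSC_AUX; waits for earlier instructions, so it can't run early
    LFENCE          ; (or CPUID) keep later instructions from starting before RDTSCP has read the counter
                    ; note: CPUID overwrites EAX, EBX, ECX and EDX, so save EAX first if you use it here
    sub  eax, ebx   ; low 32 bits of (end - start)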
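
The same sequence can be sketched in C with compiler intrinsics rather than hand-written assembly. This is only an illustration, assuming GCC or Clang on x86-64 with `<x86intrin.h>`; the names `tsc_ticks_for` and `busy_loop` are made up for the example and do not come from the answer or the Intel paper.

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtsc, __rdtscp, _mm_lfence (GCC/Clang, x86 only) */

/* Hypothetical helper: TSC ticks spent in one call of fn(arg). */
static uint64_t tsc_ticks_for(void (*fn)(void *), void *arg)
{
    unsigned int aux;  /* receives IA32_TSC_AUX from RDTSCP; ignored here */

    _mm_lfence();                   /* earlier instructions finish first */
    uint64_t start = __rdtsc();     /* start timestamp (full 64 bits) */

    fn(arg);                        /* code under test */

    uint64_t end = __rdtscp(&aux);  /* waits for the code under test to execute */
    _mm_lfence();                   /* later instructions can't start before RDTSCP */

    return end - start;
}

/* Hypothetical workload: count up to *arg using a volatile counter. */
static void busy_loop(void *arg)
{
    volatile uint64_t i = 0;
    uint64_t limit = *(const uint64_t *)arg;

    while (i < limit)
        i = i + 1;
}
```

With `uint64_t n = 100000;`, calling `tsc_ticks_for(busy_loop, &n)` returns the elapsed tick count for one run. A single sample is noisy, so in practice you would probably repeat the measurement and look at the minimum or median.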
    

    See also "Get CPU cycle count?" for more about RDTSC caveats, like constant_tsc and nonstop_tsc.
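
On Linux, constant_tsc and nonstop_tsc show up as flags in /proc/cpuinfo. As a rough alternative, you can ask the CPU directly for its invariant-TSC bit (CPUID leaf 0x80000007, EDX bit 8), which indicates a TSC that runs at a constant rate across power states. The sketch below assumes GCC or Clang with `<cpuid.h>`; `has_invariant_tsc` is a hypothetical name, and a virtual machine may hide the bit even when the host has it.

```c
#include <cpuid.h>  /* __get_cpuid (GCC/Clang, x86 only) */

/* Returns 1 if CPUID reports an invariant TSC,
 * 0 if it does not or if the leaf is unavailable. */
static int has_invariant_tsc(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* Leaf 0x80000007: advanced power management information. */
    if (!__get_cpuid(0x80000007u, &eax, &ebx, &ecx, &edx))
        return 0;

    return (int)((edx >> 8) & 1u);  /* EDX bit 8 = invariant TSC */
}
```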

    As a bonus, RDTSCP gives you a core ID (the value of IA32_TSC_AUX, returned in ECX). You could use RDTSCP for the start time as well if you want to check for core migration. But if your CPU has the constant_tsc feature, all cores in the package should have their TSCs synced, so you typically don't need this on modern x86.
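
A migration check along those lines might look like the following sketch. It assumes GCC or Clang on x86-64 and that the OS writes a distinct value into IA32_TSC_AUX on every logical CPU (Linux does, as far as I know); if the OS leaves the register alone, the comparison tells you nothing. `tsc_sample`, `timed_with_aux` and `short_work` are names invented for the example.

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtscp, _mm_lfence (GCC/Clang, x86 only) */

typedef struct {
    uint64_t ticks;     /* end - start */
    int      migrated;  /* nonzero if IA32_TSC_AUX changed between the reads */
} tsc_sample;

/* Hypothetical helper: time fn(arg) and compare the RDTSCP aux values. */
static tsc_sample timed_with_aux(void (*fn)(void *), void *arg)
{
    unsigned int aux_start, aux_end;
    tsc_sample s;

    uint64_t start = __rdtscp(&aux_start);  /* waits for earlier instructions */
    _mm_lfence();                           /* fn can't start before the read */

    fn(arg);                                /* code under test */

    uint64_t end = __rdtscp(&aux_end);      /* waits for fn to execute */
    _mm_lfence();                           /* later code can't start early */

    s.ticks = end - start;
    s.migrated = (aux_start != aux_end);
    return s;
}

/* Hypothetical workload: increment a caller-owned counter 10000 times. */
static void short_work(void *arg)
{
    volatile unsigned int *counter = arg;

    for (unsigned int i = 0; i < 10000; i++)
        *counter += 1;
}
```

If `migrated` comes back nonzero, the simplest response is usually to discard that sample and measure again. Note that a thread could in principle move away and back between the two reads, which this check cannot detect.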

    You could get the core ID from CPUID instead, as @Tony's answer points out.
