rdtsc, too many cycles

梦毁少年i 2021-01-05 03:53
#include 
static inline unsigned long long tick() 
{
        unsigned long long d;
        __asm__ __volatile__ ("rdtsc" : "=A" (d) );
        ret         


        
5 answers
  •  半阙折子戏
    2021-01-05 04:33

    There are any number of reasons you might see a large cycle count:

    • The OS did a context switch, and your process got put to sleep.
    • A disk seek occurred, and your process got put to sleep.
    • …any of a slew of reasons as to why your process might get ignored.

    Note that rdtsc is not particularly reliable for timing without extra work, because:

    • Processor speeds can change, and thus, the length of a cycle (when measured in seconds) changes.
    • Different processors may have different values for the TSC for a given instant in time.

    Most operating systems provide a high-precision clock or timing method: clock_gettime on Linux, for example, particularly with the monotonic clocks. (Understand, too, the difference between a wall clock and a monotonic clock: a wall clock can move backwards, even in UTC.) On Windows, I think the recommendation is QueryPerformanceCounter. Typically these clocks provide more than enough accuracy for most needs.
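To make the clock_gettime suggestion concrete, here is a minimal sketch for Linux or another POSIX system. The helper names monotonic_ns and elapsed_ns are made up for illustration; only clock_gettime and CLOCK_MONOTONIC come from the system headers.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* ask for the clock_gettime declaration */
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read the monotonic clock as nanoseconds.
   Returns 0 if clock_gettime fails. */
static uint64_t monotonic_ns(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return 0;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical helper: elapsed nanoseconds between two readings.
   The monotonic clock never goes backwards, so end >= start. */
static uint64_t elapsed_ns(uint64_t start, uint64_t end)
{
    return end - start;
}
```

A typical use would be to call monotonic_ns() before and after the code under test and pass both readings to elapsed_ns(). On older glibc versions you may need to link with -lrt.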


    Also, looking at the assembly, it looks like you're only getting 32 bits of the answer: I don't see %edx getting saved after rdtsc.
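A likely cause, assuming this was built as 64-bit code with GCC, is the "=A" constraint: it names the EDX:EAX pair only in 32-bit builds. One common way to sidestep that is to capture the two halves explicitly and combine them. The following is a sketch of that approach, not necessarily what the asker intended:

```c
#include <stdint.h>

#if !defined(__x86_64__) && !defined(__i386__)
#error "rdtsc is an x86 instruction"
#endif

/* Read the time-stamp counter. rdtsc puts the low 32 bits in EAX and
   the high 32 bits in EDX on both 32-bit and 64-bit x86, so bind each
   half to its own output operand and combine them afterwards. */
static inline uint64_t tick(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    return ((uint64_t)hi << 32) | lo;
}
```

This variant behaves the same in 32-bit and 64-bit builds, at the cost of one shift and one OR.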


    Running your code, I get timings of 120-150 ns for clock_gettime using CLOCK_MONOTONIC, and 70-90 cycles for rdtsc (~20 ns at full speed, but I suspect the processor is clocked down, and that's really about 50 ns). (This is on a desktop (darn SSH, forgot which machine I was on!) that sits at a roughly constant 20% CPU use.) Are you sure your machine isn't bogged down?
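The cycle-to-time conversion above is simply cycles divided by the clock rate. The answer does not state the clock rates, so the 4.0 GHz and 1.6 GHz figures in the sketch below are assumptions, picked only because they reproduce the ~20 ns and ~50 ns estimates for an 80-cycle reading. The function name is also made up for illustration.

```c
/* Hypothetical helper: convert a TSC cycle count to nanoseconds for an
   assumed clock rate in GHz. One GHz is one cycle per nanosecond, so
   ns = cycles / GHz. */
static double cycles_to_ns(double cycles, double ghz)
{
    return cycles / ghz;
}

/* Assumed figures, not taken from the answer:
   cycles_to_ns(80.0, 4.0) gives 20 ns (full speed)
   cycles_to_ns(80.0, 1.6) gives about 50 ns (clocked down) */
```

The same cycle count maps to very different wall-clock times as the frequency changes, which is why raw rdtsc deltas are hard to interpret on a machine with frequency scaling.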
