Optimization barrier for microbenchmarks in MSVC: tell the optimizer you clobber memory?

断了今生、忘了曾经 提交于 2019-11-30 17:13:23
Leon

Given your approximation of escape(), you should also be fine with the following approximation of clobber() (note that this is a draft idea, deferring some of the solution to the implementation of the function nextLocationToClobber()):

// always returns false, but in an undeducible way
bool isClobberingEnabled();

// The challenge is to implement this function in a way,
// that will make even the smartest optimizer believe that
// it can deliver a valid pointer pointing anywhere in the heap,
// stack or the static memory.
volatile char* nextLocationToClobber();

const bool clobberingIsEnabled = isClobberingEnabled();
volatile char* clobberingPtr;

inline void clobber() {
    if ( clobberingIsEnabled ) {
        // This will never be executed, but the compiler
        // cannot know about it.
        clobberingPtr = nextLocationToClobber();
        *clobberingPtr = *clobberingPtr;
    }
}

UPDATE

Question: How would you ensure that isClobberingEnabled returns false "in an undeducible way"? Certainly it would be trivial to place the definition in another translation unit, but the minute you enable LTCG, that strategy is defeated. What did you have in mind?

Answer: We can take advantage of a hard-to-prove property from the number theory, for example, Fermat's Last Theorem:

bool undeducible_false() {
    // It took mathematicians more than 3 centuries to prove Fermat's
    // last theorem in its most general form. Hardly that knowledge
    // has been put into compilers (or the compiler will try hard
    // enough to check all one million possible combinations below).

    // Caveat: avoid integer overflow (Fermat's theorem
    //         doesn't hold for modulo arithmetic)
    std::uint32_t a = std::clock() % 100 + 1;
    std::uint32_t b = std::rand() % 100 + 1;
    std::uint32_t c = reinterpret_cast<std::uintptr_t>(&a) % 100 + 1;

    return a*a*a + b*b*b == c*c*c;
}

I have used the following in place of escape.

#ifdef _MSC_VER
#pragma optimize("", off)
template <typename T>
inline void escape(T* p) {
    *reinterpret_cast<char volatile*>(p) =
        *reinterpret_cast<char const volatile*>(p); // thanks, @milleniumbug
}
#pragma optimize("", on)
#endif

It's not perfect but it's close enough, I think.

Sadly, I don't have a way to emulate clobber.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!