How to use VC++ intrinsic functions w/o run-time library

夕颜 2020-12-13 02:44

I'm involved in one of those challenges where you try to produce the smallest possible binary, so I'm building my program without the C or C++ run-time libraries

6 answers
  • 2020-12-13 03:03

    Just name the function something slightly different.
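
    A minimal sketch of that idea, assuming a hypothetical name FillBytes (any identifier other than memset would do). The volatile pointer is an assumption-driven design choice: without it, an optimizing compiler may recognize the loop and replace it with a call to memset, which would bring the original link problem back.

```cpp
#include <cstddef>  // std::size_t only; no run-time library code is needed

// Hypothetical name: anything other than memset avoids the clash with
// the compiler's intrinsic of the same name.
void *FillBytes(void *target, int value, std::size_t count) {
    // volatile discourages the optimizer from turning this loop back
    // into a memset call.
    volatile unsigned char *p = static_cast<unsigned char *>(target);
    while (count-- > 0) {
        *p++ = static_cast<unsigned char>(value);
    }
    return target;
}
```

    Call sites then use FillBytes(buffer, 0, size) wherever they would have used memset.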

  • 2020-12-13 03:10

    I think I finally found a solution:

    First, in a header file, declare memset() with a pragma, like so:

    extern "C" void * __cdecl memset(void *, int, size_t);
    #pragma intrinsic(memset)
    

    That allows your code to call memset(). In most cases, the compiler will inline the intrinsic version.

    Second, in a separate implementation file, provide an implementation. The trick to preventing the compiler from complaining about re-defining an intrinsic function is to use another pragma first. Like this:

    #pragma function(memset)
    void * __cdecl memset(void *pTarget, int value, size_t cbTarget) {
        unsigned char *p = static_cast<unsigned char *>(pTarget);
        while (cbTarget-- > 0) {
            *p++ = static_cast<unsigned char>(value);
        }
        return pTarget;
    }
    

    This provides an implementation for those cases where the optimizer decides not to use the intrinsic version.

    The outstanding drawback is that you have to disable whole-program optimization (/GL and /LTCG). I'm not sure why. If someone finds a way to do this without disabling global optimization, please chime in.

  • 2020-12-13 03:10

    I think you have to set Optimization to "Minimize Size (/O1)" or "Disabled (/Od)" to get the Release configuration to compile; at least this is what did the trick for me with VS 2005. Intrinsics are designed for speed so it makes sense that they would be enabled for the other Optimization levels (Speed and Full).

  • 2020-12-13 03:18
    1. I'm pretty sure there's a compiler flag that tells VC++ not to use intrinsics

    2. The source to the runtime library is installed with the compiler. You do have the choice of excerpting functions you want/need, though often you'll have to modify them extensively (because they include features and/or dependencies you don't want/need).

    3. There are other open source runtime libraries available as well, which might need less customization.

    4. If you're really serious about this, you'll need to know (and maybe use) assembly language.

    Edited to add:

    I got your new test code to compile and link. These are the relevant settings:

    Enable Intrinsic Functions: No
    Whole Program Optimization: No
    

    It's that last one that suppresses "compiler helpers" like the built-in memset.

    Edited to add:

    Now that it's decoupled, you can copy the asm code from memset.asm into your program--it has one global reference, but you can remove that. It's big enough so that it's not inlined, though if you remove all the tricks it uses to gain speed you might be able to make it small enough for that.

    I took your above example and replaced the memset() with this:

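    void * __cdecl memset(void *pTarget, int value, size_t cbTarget) {
        // 32-bit x86 only: MSVC does not support inline assembly on x64.
        __asm {
        push ecx
        push edi
    
        mov eax, value      ; rep stosb stores only the low byte (al)
        mov ecx, cbTarget   ; number of bytes to store
        mov edi, pTarget    ; destination address
        rep stosb
    
        pop edi
        pop ecx
        }
        return pTarget;
    }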
    

    It works, but the library's version is much faster.
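
    Inline assembly like this is limited to 32-bit x86 builds. A hedged alternative, assuming the __stosb intrinsic from <intrin.h> is acceptable in your build, emits the same rep stosb instruction and also works on x64. The function name FillWithStosb is hypothetical, and the fallback loop exists only so the sketch builds with other compilers.

```cpp
#include <cstddef>
#if defined(_MSC_VER)
#include <intrin.h>  // declares __stosb; compiler intrinsics only
#endif

// Hypothetical helper: fills count bytes at target with the low byte of value.
void *FillWithStosb(void *target, int value, std::size_t count) {
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
    // Compiles to rep stosb on x86 and x64.
    __stosb(static_cast<unsigned char *>(target),
            static_cast<unsigned char>(value), count);
#else
    // Portable fallback so the sketch also builds elsewhere.
    volatile unsigned char *p = static_cast<unsigned char *>(target);
    while (count-- > 0) {
        *p++ = static_cast<unsigned char>(value);
    }
#endif
    return target;
}
```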

  • 2020-12-13 03:24

    This definitely works with VS 2015: add the command-line option /Oi-. It works because choosing "No" for Enable Intrinsic Functions in the project settings does not pass a switch at all; it simply leaves the option unspecified. Pass /Oi- explicitly and all your problems go away (it should work with whole-program optimization, but I haven't properly tested this).

  • 2020-12-13 03:25

    The "regular" runtime library does this by compiling an assembly file that defines memset and linking it into the runtime library (you can find the assembly file in or around C:\Program Files\Microsoft Visual Studio 10.0\VC\crt\src\intel\memset.asm). That kind of thing works fine even with whole-program optimization.

    Also note that the compiler will only use the memset intrinsic in some special cases (when the size is constant and small?). It will usually use the memset function provided by you, so you should probably use the optimized function in memset.asm, unless you're going to write something just as optimized.
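
    To see which case applies to your own code, a small pair of call sites like the sketch below can be compiled with an assembly listing (for example /FA) and inspected. The names Header, ClearHeader and ClearBuffer are hypothetical, <cstring> is included only so the sketch builds standalone, and what the compiler actually emits depends on its version and flags.

```cpp
#include <cstddef>
#include <cstring>  // only so this sketch builds on its own

struct Header {
    unsigned char bytes[16];
};

// Size is a small compile-time constant: a candidate for inline expansion.
void ClearHeader(Header &h) {
    std::memset(&h, 0, sizeof h);
}

// Size is known only at run time: typically compiled as a real call to
// memset, which is where your own implementation would be used.
void ClearBuffer(void *buffer, std::size_t count) {
    std::memset(buffer, 0, count);
}
```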
