Faster way to zero memory than with memset?

刺人心 2020-12-07 12:02

I learned that memset(ptr, 0, nbytes) is really fast, but is there a faster way (at least on x86)?

I assume that memset uses mov, however w

9 answers
  •  野趣味 (OP)
    2020-12-07 12:43

    x86 covers a rather broad range of devices.

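    For a totally generic x86 target, an assembly block with "rep stosd" (with EAX set to zero) can blast zeros out to memory 32 bits at a time. Note that it is stosd that stores; "rep movsd" copies from one buffer to another. Try to make sure the bulk of this work is DWORD-aligned.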
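
A minimal sketch of that approach, assuming GCC or Clang inline assembly on an x86 or x86-64 target. The function name zero_dwords is hypothetical, and the portable loop is only a fallback so the file still builds elsewhere. The string-store instruction is written "stosl" because GCC's default AT&T syntax uses that spelling for Intel's "stosd"; it writes EAX to the destination, whereas the movs family copies from a source buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Zero `count` 32-bit words starting at `dst` (hypothetical helper). */
static void zero_dwords(uint32_t *dst, size_t count)
{
    /* DWORD alignment, as the answer recommends; checked in debug builds. */
    assert(((uintptr_t)dst & 3u) == 0);

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* rep stosl: store EAX at [EDI/RDI], advance the pointer by 4 and
     * decrement ECX/RCX until it reaches zero. The direction flag is
     * assumed clear, which the usual calling conventions guarantee. */
    __asm__ volatile ("rep stosl"
                      : "+D"(dst), "+c"(count)
                      : "a"(0)
                      : "memory");
#else
    /* Portable fallback for other compilers and architectures. */
    while (count--)
        *dst++ = 0;
#endif
}
```

Calling zero_dwords(buf, n) clears n * 4 bytes. Whether this beats the library memset depends on the CPU and the buffer size, so measure before relying on it.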

    For chips with MMX, an assembly loop with movq can write 64 bits at a time.

    You might be able to get a C/C++ compiler to use a 64-bit write through a pointer to long long or __m64. The target must be 8-byte aligned for the best performance.
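
As a rough illustration of the 64-bit-pointer idea in plain C (the name zero_qwords is hypothetical, and uint64_t stands in for long long): byte stores bring the pointer up to an 8-byte boundary, 64-bit stores cover the bulk, and byte stores finish the tail. Whether the compiler really emits one 64-bit store per iteration depends on the target and the optimization level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Zero `nbytes` bytes at `ptr` using 64-bit stores for the aligned middle.
 * Caveat: storing through uint64_t* is only well defined when the buffer
 * has no conflicting declared type (for example malloc'd memory). */
static void zero_qwords(void *ptr, size_t nbytes)
{
    unsigned char *p = ptr;
    assert(ptr != NULL || nbytes == 0);

    /* Head: single bytes until p sits on an 8-byte boundary. */
    while (nbytes > 0 && ((uintptr_t)p & 7u) != 0) {
        *p++ = 0;
        nbytes--;
    }

    /* Body: one aligned 64-bit store per iteration. */
    uint64_t *q = (uint64_t *)(void *)p;
    for (; nbytes >= 8; nbytes -= 8)
        *q++ = 0;

    /* Tail: the remaining 0 to 7 bytes. */
    p = (unsigned char *)q;
    while (nbytes > 0) {
        *p++ = 0;
        nbytes--;
    }
}
```

Modern compilers often recognize a loop like this and replace it with a call to memset, so inspect the generated assembly before assuming it behaves differently.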

    For chips with SSE, movaps is fast, but only if the address is 16-byte aligned (it faults otherwise), so use byte stores (stosb) until the pointer is aligned, and then complete the clear with a loop of movaps.
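
The align-then-movaps scheme can be sketched with SSE intrinsics instead of hand-written assembly. This assumes a compiler that provides <xmmintrin.h>; the name zero_sse and the HAVE_SSE_SKETCH macro are hypothetical. _mm_store_ps is the intrinsic that compilers typically lower to movaps, and it requires a 16-byte-aligned address. The head uses plain byte stores, since storing (not copying) is what is needed to reach alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define HAVE_SSE_SKETCH 1   /* hypothetical feature macro for this sketch */
#endif

/* Zero `nbytes` bytes at `ptr`, using aligned 16-byte stores for the bulk. */
static void zero_sse(void *ptr, size_t nbytes)
{
    unsigned char *p = ptr;

    /* Head: byte stores until p is 16-byte aligned. */
    while (nbytes > 0 && ((uintptr_t)p & 15u) != 0) {
        *p++ = 0;
        nbytes--;
    }

#ifdef HAVE_SSE_SKETCH
    /* Body: aligned 128-bit stores of an all-zero register. */
    const __m128 zero = _mm_setzero_ps();
    for (; nbytes >= 16; nbytes -= 16, p += 16) {
        assert(((uintptr_t)p & 15u) == 0);
        _mm_store_ps((float *)(void *)p, zero);
    }
#endif

    /* Tail: leftover bytes (or everything, when SSE is unavailable). */
    while (nbytes > 0) {
        *p++ = 0;
        nbytes--;
    }
}
```

The caller does not need to align the buffer; the head loop absorbs any misalignment, at the cost of up to 15 single-byte stores.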

    Win32 has "ZeroMemory()", but I forget whether that's a macro for memset or an actual 'good' implementation.
