faster alternative to memcpy?

一生所求 2020-11-29 21:27

I have a function that is doing memcpy, but it's taking up an enormous number of cycles. Is there a faster alternative or approach than using memcpy to move a piece of memory?

16 answers
  •  无人及你
    2020-11-29 21:31

    You should check the assembly code generated for your code. What you don't want is for the memcpy call to turn into an actual call to the memcpy function in the standard library. What you want is for it to be expanded inline into the widest copy instructions available, something like rep movsq.

    How can you achieve this? The compiler optimizes calls to memcpy by replacing them with plain mov instructions, as long as it knows at compile time how much data it should copy. You can see this if you call memcpy with a size that is a compile-time constant (a literal, a sizeof expression, or a constexpr value). If the compiler doesn't know the size, it has to fall back to the general implementation of memcpy, and the issue is that this implementation has to respect one-byte granularity. It will still move 128 bits at a time, but after each 128-bit chunk it has to check whether enough data remains for another one, or whether it must drop to 64-bit, then 32-bit and 8-bit copies (I think 16-bit copies might be suboptimal anyway, but I don't know for sure).
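
To make the difference concrete, here is a minimal sketch of the two cases described above. The names record_t, copy_record and copy_bytes are made up for illustration and are not from the question. What the compiler actually emits depends on the compiler, target and optimization level, so treat the comments as typical behavior rather than a guarantee.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte record, used only for illustration. */
typedef struct {
    uint64_t id;
    uint64_t value;
} record_t;

/* The size is a compile-time constant (sizeof), so an optimizing
   compiler can typically expand this memcpy into a few plain
   load/store instructions instead of a library call. */
void copy_record(record_t *dst, const record_t *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* The size is only known at run time, so the compiler generally
   has to emit a call to the library memcpy, which then decides
   how to copy n bytes. */
void copy_bytes(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}
```

Compiling this with something like gcc -O2 -S and comparing the assembly of the two functions should show whether your toolchain inlines the constant-size copy.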

    So what you want is to tell memcpy the size of your data with a constant expression that the compiler can optimize. That way no call to memcpy is made at all. What you don't want is to pass memcpy a size that is only known at run time: that translates into a function call plus a series of tests to pick the best copy instruction. For this reason a simple for loop is sometimes better than memcpy, since it eliminates one function call. And what you really, really don't want is to pass memcpy an odd number of bytes to copy.
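
One way to combine these suggestions is sketched below: copy the bulk of the buffer in fixed-size blocks, so each memcpy sees a constant size, and handle the leftover bytes with a simple for loop. The function name copy_blocks and the 64-byte BLOCK size are assumptions chosen for illustration, and the buffers are assumed not to overlap. Whether this beats the library memcpy depends on the compiler and CPU (a compiler may even turn such loops back into a memcpy call), so measure before adopting it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical block size; pick one that suits your data. */
enum { BLOCK = 64 };

/* Copies n bytes from src to dst. The buffers must not overlap. */
void copy_blocks(unsigned char *dst, const unsigned char *src, size_t n)
{
    size_t i = 0;

    /* Bulk part: the size passed to memcpy is a compile-time
       constant, so the compiler is free to expand it inline. */
    for (; i + BLOCK <= n; i += BLOCK)
        memcpy(dst + i, src + i, BLOCK);

    /* Tail: fewer than BLOCK bytes remain, copied one at a time. */
    for (; i < n; ++i)
        dst[i] = src[i];
}
```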
