Enhanced REP MOVSB for memcpy

别跟我提以往 2020-11-22 02:04

I would like to use enhanced REP MOVSB (ERMSB) to get high bandwidth for a custom memcpy.
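For concreteness, here is a minimal sketch of the kind of routine this refers to. It assumes GCC or Clang extended inline assembly on x86 and falls back to plain `memcpy` elsewhere; the name `copy_rep_movsb` is made up for illustration and nothing here has been tuned or benchmarked.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (name is illustrative): copy n bytes with REP MOVSB.
   Assumes the direction flag is clear, as the usual x86 ABIs guarantee
   on function entry. */
static void *copy_rep_movsb(void *dst, const void *src, size_t n)
{
    void *ret = dst;
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    /* RDI/EDI = destination, RSI/ESI = source, RCX/ECX = byte count.
       The instruction updates all three, hence the "+" constraints. */
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory");
#else
    /* Not x86 or not a GNU-style compiler: use the library routine. */
    memcpy(dst, src, n);
#endif
    return ret;
}
```

Like `memcpy`, this sketch returns the original destination pointer and does not support overlapping buffers.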

ERMSB was introduced with the Ivy Bridge microarchitecture.
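Since the feature depends on the microarchitecture, a program can test for it at run time. The sketch below assumes GCC or Clang with `<cpuid.h>` and reads the ERMS feature flag, which CPUID reports in leaf 7, subleaf 0, EBX bit 9. The helper name `cpu_has_erms` is hypothetical, and on other compilers or architectures it simply reports `false`.

```c
#include <stdbool.h>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_GNU_CPUID 1
#endif

/* Hypothetical helper: true if the CPU advertises enhanced REP MOVSB/STOSB. */
static bool cpu_has_erms(void)
{
#ifdef HAVE_GNU_CPUID
    unsigned int eax, ebx, ecx, edx;
    /* __get_cpuid_count returns 0 when leaf 7 is not supported. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return false;
    return ((ebx >> 9) & 1u) != 0;   /* CPUID.(EAX=7,ECX=0):EBX bit 9 */
#else
    return false;                    /* unknown platform: assume no ERMS */
#endif
}
```

A custom memcpy could call this once at startup and fall back to the library `memcpy` when the flag is absent.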

6 answers

一向 (original poster)

2020-11-22 02:42

There are far more efficient ways to move data. These days, compilers generate architecture-specific code for memcpy, optimized for the memory alignment of the data and other factors. This allows better use of non-temporal (cache-bypassing) store instructions and of XMM and other registers in the x86 world.

Hard-coding rep movsb prevents this use of intrinsics.

Therefore, for something like a memcpy, unless you are writing something that will be tied to a very specific piece of hardware, and unless you are going to take the time to write a highly optimized memcpy function in assembly (or using C-level intrinsics), you are far better off letting the compiler figure it out for you.
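To illustrate what the C-level intrinsics route can look like, here is a rough sketch of a copy that uses non-temporal stores through XMM registers. It assumes SSE2 and the `<emmintrin.h>` intrinsics; the name `copy_streaming` is made up for illustration, and the code is not tuned. Whether it beats the library `memcpy` depends on the hardware and the copy size.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: copy n bytes, using streaming (non-temporal) stores
   when the destination is 16-byte aligned; otherwise defer to memcpy. */
static void *copy_streaming(void *dst, const void *src, size_t n)
{
#if defined(__SSE2__)
    if (((uintptr_t)dst & 15u) == 0) {
        unsigned char *d = dst;
        const unsigned char *s = src;
        size_t i = 0;
        for (; i + 16 <= n; i += 16) {
            /* Unaligned 16-byte load, then a store that bypasses the cache. */
            __m128i v = _mm_loadu_si128((const __m128i *)(s + i));
            _mm_stream_si128((__m128i *)(d + i), v);
        }
        _mm_sfence();                 /* order the streaming stores */
        memcpy(d + i, s + i, n - i);  /* copy the remaining 0..15 bytes */
        return dst;
    }
#endif
    return memcpy(dst, src, n);
}
```

The alignment check matters because `_mm_stream_si128` requires a 16-byte-aligned destination; every other case goes to the library routine, which is the default the answer recommends.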
