How do I align cv::Mat for AVX-512/AVX2 architectures?

Posted by 房东的猫 on 2019-12-11 05:47:26

Question


Disclaimer: I'm a SIMD newbie, so please bear with me if I ask some bad questions.

From my understanding, AVX-512 architectures can process up to 16 float variables at once, while AVX2 "only" 8.

In order to take advantage of this, the data has to be aligned. As I found out here, this can be done with:

For AVX-512:

alignas(32) float a[8];

For AVX2:

alignas(16) float a[8];

OK, so my first question is: since 16 is a factor of 32, why don't we always use alignas(32) for AVX2 architectures as well? I'm probably missing something.
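For reference, the aligned load and store instructions expect an address that is a multiple of the register width: 16 bytes for 128-bit SSE, 32 bytes for 256-bit AVX/AVX2, and 64 bytes for 512-bit AVX-512. The sketch below checks the "factor" reasoning in plain C++: an address that is a multiple of the larger power-of-two boundary is automatically a multiple of the smaller one. The helper names `is_aligned` and `float_lanes` and the `Buffers` struct are made up for illustration, and the lane counts assume a 4-byte `float`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when p sits on a multiple of `alignment` bytes.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical helper: number of float lanes in a register of `bits` bits.
constexpr std::size_t float_lanes(std::size_t bits) {
    return bits / (8 * sizeof(float));
}

// These hold on platforms where float is 4 bytes (assumed here).
static_assert(float_lanes(128) == 4,  "SSE: 4 floats per register");
static_assert(float_lanes(256) == 8,  "AVX/AVX2: 8 floats per register");
static_assert(float_lanes(512) == 16, "AVX-512: 16 floats per register");

// One buffer per register width, each aligned to that width in bytes.
struct Buffers {
    alignas(16) float sse[4];
    alignas(32) float avx[8];
    alignas(64) float avx512[16];
};

// A stricter alignment also satisfies every weaker power-of-two alignment.
inline bool stricter_implies_weaker(const Buffers& b) {
    return is_aligned(b.avx512, 64) && is_aligned(b.avx512, 32) &&
           is_aligned(b.avx512, 16) && is_aligned(b.avx, 32) &&
           is_aligned(b.avx, 16);
}
```

Over-aligning is therefore harmless for correctness; the cost is only some padding, which is usually negligible for large buffers.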

Then, I have this function:

bool interpolate(const Mat &im, Mat &res, /*...*/){/*...*/}

The matrices im and res are allocated with:

cv::Mat im(r, c, CV_32FC1); //similarly res

The Intel compiler report tells me that these two matrices are not aligned. So my second question is: how can I allocate them so they are 16/32 aligned? I could allocate an aligned pointer and then pass it to the cv::Mat constructor, something like:

 float *aligned_ptr = /*allocate r*c 16/32 aligned floating points*/
 cv::Mat m (r, c, CV_32FC1, /* use aligned_ptr somehow*/);
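One way the idea above might be fleshed out is sketched here, as an assumption rather than a confirmed answer. It uses C++17 `std::aligned_alloc` (not available on MSVC, where `_aligned_malloc` or `_mm_malloc` would be the alternatives) and pads each row to a multiple of the alignment so that every row start is aligned, not just the first. `AlignedImage`, `allocate_aligned`, `round_up` and `kAlign` are hypothetical names. The OpenCV call appears only in a comment so the block compiles without OpenCV; it relies on the `cv::Mat(rows, cols, type, data, step)` constructor, which wraps external memory without taking ownership.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>
#include <new>

// Assumed target: 64 bytes covers AVX-512 and therefore AVX2 and SSE too.
constexpr std::size_t kAlign = 64;

// Round n up to the next multiple of a.
constexpr std::size_t round_up(std::size_t n, std::size_t a) {
    return (n + a - 1) / a * a;
}

// Memory from std::aligned_alloc must be released with std::free.
struct FreeDeleter {
    void operator()(void* p) const noexcept { std::free(p); }
};

// Hypothetical owner of an aligned float image with padded rows.
struct AlignedImage {
    std::unique_ptr<float, FreeDeleter> data;
    std::size_t rows;
    std::size_t cols;
    std::size_t step;  // bytes per row, a multiple of kAlign
};

inline AlignedImage allocate_aligned(std::size_t rows, std::size_t cols) {
    const std::size_t step = round_up(cols * sizeof(float), kAlign);
    // std::aligned_alloc requires the size to be a multiple of the alignment;
    // step * rows satisfies that because step is a multiple of kAlign.
    void* p = std::aligned_alloc(kAlign, step * rows);
    if (p == nullptr) throw std::bad_alloc();
    return AlignedImage{
        std::unique_ptr<float, FreeDeleter>(static_cast<float*>(p)),
        rows, cols, step};
}

// With OpenCV available, the buffer could be wrapped without copying:
//   AlignedImage img = allocate_aligned(r, c);
//   cv::Mat m(static_cast<int>(img.rows), static_cast<int>(img.cols),
//             CV_32FC1, img.data.get(), img.step);
// cv::Mat does not own external data, so img must outlive m.
```

Passing the padded `step` matters: without it cv::Mat would assume tightly packed rows, and any row whose byte length is not a multiple of the alignment would push the following rows off the boundary.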

Source: https://stackoverflow.com/questions/43718018/how-do-i-align-cvmat-for-avx-512-avx2-architectures
