Question
When I run this code in Visual Studio 2015, it works correctly. But when I build it in Code::Blocks, it fails with the following error: Segmentation fault (core dumped). I also ran the code on Ubuntu and got the same error.
#include <iostream>
#include <immintrin.h>
struct INFO
{
    unsigned int id = 0;
    __m256i temp[8];
};
int main()
{
    std::cout<<"Start AVX..."<<std::endl;
    int _size = 100;
    INFO *info = new INFO[_size];
    for (int i = 0; i<_size; i++)
    {
        for (int k = 0; k < 8; k++)
        {
            info[i].temp[k] = _mm256_setr_epi8(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
                20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31);
        }
    }
    std::cout<<"End AVX."<<std::endl;
    return 0;
}
Answer 1:
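The problem is that prior to C++17, new and delete did not respect the alignment of the type being allocated. __m256i requires 32-byte alignment, so an array obtained from plain new may be under-aligned, and an aligned AVX store into it can fault. If you look at the assembly generated for this simple function: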
INFO* new_test() {
    int _size = 100;
    INFO *info = new INFO[_size];
    return info;
}
You'll see that when compiled for any standard prior to C++17, operator new[](unsigned long) is called, whereas for C++17 a call is made to operator new[](unsigned long, std::align_val_t) (and 32 is passed for the second parameter).
Play around with it at godbolt.
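To see the difference without reading assembly, the pointer returned by new[] can be checked directly. The following is a minimal sketch, not part of the original answer: AlignedInfo, is_aligned and new_respects_alignment are hypothetical names, and the alignas(32) byte array is only a stand-in for __m256i temp[8] so that the example builds without AVX flags.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for INFO: alignas(32) mimics the 32-byte
// alignment requirement that the __m256i member imposes.
struct AlignedInfo {
    unsigned int id = 0;
    alignas(32) unsigned char temp[8][32];
};

// True if p is a multiple of alignment (alignment must be non-zero).
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Allocates an array with new[] and reports whether the storage that came
// back is suitably aligned for AlignedInfo. With C++17 this is guaranteed
// to be true; with earlier standards it may be true or false.
inline bool new_respects_alignment(std::size_t count) {
    AlignedInfo* info = new AlignedInfo[count];
    const bool ok = is_aligned(info, alignof(AlignedInfo));
    delete[] info;
    return ok;
}
```

Calling new_respects_alignment(100) in a build with -std=c++17 should return true; in a pre-C++17 build a false result would explain the segmentation fault, while a true result only means the allocation happened to land on a 32-byte boundary.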
If you can't use C++17, you can overload operator new[] for the struct (and operator delete[]; you should overload operator new and operator delete as well):
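#include <cstdlib>   // aligned_alloc, free
#include <new>       // std::bad_alloc
struct INFO {
    unsigned int id = 0;
    __m256i temp[8];
    void* operator new[](std::size_t size) {
        // aligned_alloc is part of C11; it expects size to be a multiple of
        // the alignment, which should hold for an array of INFO
        void* p = aligned_alloc(alignof(INFO), size);
        if (!p) throw std::bad_alloc(); // operator new must not return null
        return p;
    }
    void operator delete[](void* addr) {
        free(addr); // aligned_alloc is compatible with free
    }
};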
This is part of the previous godbolt example, if you compile with -DOVERWRITE_OPERATOR_NEW.
Note that this does not solve the alignment issue when using std::vector (or any other std container); for that you need to pass an aligned allocator to the container (not part of the previous example).
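As a rough illustration of what such an aligned allocator could look like, here is a minimal sketch. It is an assumption-laden example, not the allocator from the godbolt link: AlignedAllocator, Block and vector_storage_is_aligned are hypothetical names, Block stands in for __m256i so no AVX flags are needed, and the allocator over-allocates with std::malloc and aligns by hand instead of calling aligned_alloc, so that it does not depend on C11 or C++17 library support.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical minimal allocator that returns storage aligned to Alignment.
template <typename T, std::size_t Alignment>
struct AlignedAllocator {
    static_assert((Alignment & (Alignment - 1)) == 0, "Alignment must be a power of two");
    static_assert(Alignment >= alignof(void*), "Alignment too small to store the raw pointer");

    using value_type = T;
    // rebind is spelled out because Alignment is a non-type template parameter.
    template <typename U> struct rebind { using other = AlignedAllocator<U, Alignment>; };

    AlignedAllocator() = default;
    template <typename U>
    AlignedAllocator(const AlignedAllocator<U, Alignment>&) noexcept {}

    T* allocate(std::size_t n) {
        const std::size_t extra = Alignment + sizeof(void*);
        if (n > (static_cast<std::size_t>(-1) - extra) / sizeof(T)) throw std::bad_alloc();
        // Over-allocate: room for padding plus one slot that remembers the raw block.
        void* raw = std::malloc(n * sizeof(T) + extra);
        if (!raw) throw std::bad_alloc();
        const std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
        const std::uintptr_t aligned =
            (start + Alignment - 1) & ~(static_cast<std::uintptr_t>(Alignment) - 1);
        reinterpret_cast<void**>(aligned)[-1] = raw; // stored just before the aligned block
        return reinterpret_cast<T*>(aligned);
    }

    void deallocate(T* p, std::size_t) noexcept {
        std::free(reinterpret_cast<void**>(p)[-1]); // free the original raw block
    }
};

template <typename T, typename U, std::size_t A>
bool operator==(const AlignedAllocator<T, A>&, const AlignedAllocator<U, A>&) noexcept { return true; }
template <typename T, typename U, std::size_t A>
bool operator!=(const AlignedAllocator<T, A>&, const AlignedAllocator<U, A>&) noexcept { return false; }

// Hypothetical 32-byte element standing in for __m256i.
struct alignas(32) Block { unsigned char bytes[32]; };

// Builds a vector with the aligned allocator and checks where its storage landed.
inline bool vector_storage_is_aligned(std::size_t count) {
    std::vector<Block, AlignedAllocator<Block, 32>> v(count);
    return reinterpret_cast<std::uintptr_t>(v.data()) % 32 == 0;
}
```

With real AVX code the container would be declared along the lines of std::vector<INFO, AlignedAllocator<INFO, 32>>. In C++17 and later, std::vector with the default allocator already honors the element type's alignment, so a custom allocator like this is mainly of interest for older standards.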
Answer 2:
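I found two ways to solve this problem.
The first solution (see "How to solve the 32-byte-alignment issue for AVX load/store operations?") is to allocate the array with _mm_malloc, requesting 32-byte alignment, and release it with _mm_free: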
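struct INFO
{
    __m256i temp[8];
    unsigned int id = 0;
};
// _mm_malloc returns raw 32-byte-aligned storage; no constructors run,
// so id is not initialized to 0 here
INFO *info = static_cast<INFO*>(_mm_malloc(sizeof(INFO)*_size, 32));
// ... use info ...
_mm_free(info); // memory from _mm_malloc must be released with _mm_free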
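The second solution is to fill a local INFO object, which lives on the stack with correct alignment, and then copy it into the array element. Note that the array is still allocated with plain new, so before C++17 its 32-byte alignment is not guaranteed; this only helps if the compiler copies the struct without stores that require alignment.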
INFO *info = new INFO[_size];
for (int i = 0; i < _size; i++)
{
    INFO new_info;
    for (int k = 0; k < 8; k++)
    {
        new_info.temp[k] = _mm256_setr_epi8(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
            20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31);
    }
    info[i] = new_info;
}
Source: https://stackoverflow.com/questions/55566275/segmentation-fault-core-dumped-when-using-avx-on-an-array-allocated-with-new