问题
Hello I have a strange problem with AVX2 intrinsics. I create a pointer to a _m256i vector with a int64_t* cast. I then assign a value by dereferencing the pointer. The strange thing is that the value isn't observed in the vector variable, unless i run a few cout statements after it. The pointer and the vector have the same memory address and dereferencing the pointer produces the correct value, but the vector does not. What am I missing?
// Vector Variable
__m256i R_A0to3 = _mm256_set1_epi32(0xFFFFFFFF);
int64_t *ptr = NULL;
for(int m=0; m<4; m++){
// Cast pointer to vector type
ptr = (int64_t*)&R_A0to3;
cout<<"ptr_ADDRESS: "<<ptr<<endl;
cout<<"&R_A0to3_ADDRESS: "<<&R_A0to3<<endl;
// access
ptr[m] = (int64_t) m_array[m];
// generic function that prints out register
print_mm256_reg<int64_t>(R_A0to3, "R_A0to3");
cout<<"m_array: "<< m_array[m]<<std::ends;
// Additional print statements
cout<<"ptr[m]: "<< ptr[m]<<std::endl;
cout<<"ptr[0]: "<< ptr[0]<<std::endl;
cout<<"ptr[1]: "<< ptr[1]<<std::endl;
cout<<"ptr[2]: "<< ptr[2]<<std::endl;
cout<<"ptr[3]: "<< ptr[3]<<std::endl;
print_mm256_reg<int64_t>(R_A0to3, "R_A0to3");
}
Output:
ptr_ADDRESS 0x7ffd9313e880
&R_A0to3_ADDRESS 0x7ffd9313e880
m_array: 8
printing reg - R_C0to3 -1| -1| -1| -1|
printing reg - R_D0to3 -1| -1| -1| -1|
Output with Additional print statements:
ptr_ADDRESS 0x7ffd36359e20
&R_A0to3_ADDRESS 0x7ffd36359e20
printing reg - R_A0to3 -1| -1| -1| -1|
m_array: 8
ptr[0]: 8
ptr[1]: -1
ptr[2]: -1
ptr[3]: -1
printing reg - R_A0to3 8| -1| -1| -1|
回答1:
I suggest using the _mm256_extract_epi64
and _mm256_insert_epi64
intrinsics when you need occasional access to individual elements. If you need to access all elements from the vector, consider using _mm256_store_si256
and _mm256_lddqu_si256
to store and load it. These intrinsics are less likely to rely on undefined behavior and they are transparent as to the machine instructions being generated (and thus as to the performance).
来源:https://stackoverflow.com/questions/38362528/int64-t-pointer-cast-to-avx2-intrinsic-m256i