int64_t pointer cast to AVX2 intrinsic __m256i

耗尽温柔 submitted on 2019-12-10 10:18:00

Question


Hello, I have a strange problem with AVX2 intrinsics. I create a pointer to a __m256i vector by casting its address to int64_t*, then assign a value by dereferencing the pointer. The strange thing is that the value isn't observed in the vector variable unless I run a few cout statements after the assignment. The pointer and the vector have the same memory address, and dereferencing the pointer produces the correct value, but printing the vector does not. What am I missing?

// Vector Variable 
__m256i R_A0to3 = _mm256_set1_epi32(0xFFFFFFFF);

int64_t *ptr = NULL;
for(int m=0; m<4; m++){
    // Cast pointer to vector type
    ptr = (int64_t*)&R_A0to3;

    cout<<"ptr_ADDRESS:      "<<ptr<<endl;
    cout<<"&R_A0to3_ADDRESS: "<<&R_A0to3<<endl;

    // access
    ptr[m] = (int64_t) m_array[m];

    // generic function that prints out register
    print_mm256_reg<int64_t>(R_A0to3, "R_A0to3");
    cout<<"m_array: "<< m_array[m]<<std::ends;

    // Additional print statements
    cout<<"ptr[m]: "<< ptr[m]<<std::endl;
    cout<<"ptr[0]: "<< ptr[0]<<std::endl;
    cout<<"ptr[1]: "<< ptr[1]<<std::endl;
    cout<<"ptr[2]: "<< ptr[2]<<std::endl;
    cout<<"ptr[3]: "<< ptr[3]<<std::endl;
    print_mm256_reg<int64_t>(R_A0to3, "R_A0to3");
}

Output:
 ptr_ADDRESS      0x7ffd9313e880
 &R_A0to3_ADDRESS 0x7ffd9313e880
 m_array: 8
 printing reg -    R_C0to3    -1|  -1|  -1|  -1|
 printing reg -    R_D0to3    -1|  -1|  -1|  -1|

Output with Additional print statements:
ptr_ADDRESS      0x7ffd36359e20
&R_A0to3_ADDRESS 0x7ffd36359e20
printing reg -    R_A0to3     -1|  -1|  -1|  -1|
m_array: 8

ptr[0]: 8
ptr[1]: -1
ptr[2]: -1
ptr[3]: -1
printing reg -    R_A0to3      8|  -1|  -1|  -1|

Answer 1:


I suggest using the _mm256_extract_epi64 and _mm256_insert_epi64 intrinsics when you need occasional access to individual elements. If you need to access all elements from the vector, consider using _mm256_store_si256 and _mm256_lddqu_si256 to store and load it. These intrinsics are less likely to rely on undefined behavior and they are transparent as to the machine instructions being generated (and thus as to the performance).
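As a rough illustration of the store/load route, here is a minimal sketch that writes one 64-bit lane by spilling the vector to an aligned array, changing the element there, and loading it back. The helper name set_lane_via_store and the out parameter are hypothetical, not from the original post. The target attribute is only there so the sketch builds on GCC/Clang without -mavx2; running it still assumes a CPU with AVX2.

```cpp
#include <immintrin.h>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the original post): start from an all-ones
// vector, overwrite one 64-bit lane through an aligned buffer, and copy the
// resulting four lanes into out.
#if defined(__GNUC__) || defined(__clang__)
__attribute__((target("avx2")))
#endif
void set_lane_via_store(int64_t out[4], int lane, int64_t value) {
    assert(lane >= 0 && lane < 4);

    // Same starting value as in the question: every bit set, so each lane is -1.
    __m256i v = _mm256_set1_epi32(-1);

    // Spill the vector to memory with an explicit store intrinsic.
    alignas(32) int64_t buf[4];
    _mm256_store_si256(reinterpret_cast<__m256i*>(buf), v);

    // Modify the element as an ordinary int64_t.
    buf[lane] = value;

    // Reload, so the vector variable itself holds the new value.
    v = _mm256_load_si256(reinterpret_cast<const __m256i*>(buf));

    // Unaligned store, because the caller's array may not be 32-byte aligned.
    _mm256_storeu_si256(reinterpret_cast<__m256i*>(out), v);
}
```

Calling set_lane_via_store(out, 0, 8) leaves out as {8, -1, -1, -1}, which is the result the question expected after the first loop iteration. The round trip is used here instead of _mm256_insert_epi64 because that intrinsic takes its index as a compile-time constant, so a loop variable such as m cannot be passed to it directly.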



Source: https://stackoverflow.com/questions/38362528/int64-t-pointer-cast-to-avx2-intrinsic-m256i
