Question
As the title says, I am trying to use an STL vector with a SIMD intrinsic data type. I know this is not good practice because of the potential load/store overhead, but I have run into a strange fault. Here is the code:
#include "immintrin.h"
#include <vector>
#include <stdio.h>
#define VL 8
int main () {
    std::vector<__m256> vec_1(10);
    std::vector<__m256> vec_2(10);
    float * tmp_1 = new float[VL];
    printf("vec_1[0]:\n");
    _mm256_storeu_ps(tmp_1, vec_1[0]); // seems to go as expected
    for (int i = 0; i < VL; ++i)
        printf("%f ", tmp_1[i]);
    printf("\n");
    delete[] tmp_1; // array form, to match new float[VL]
    float * tmp_2 = new float[VL];
    printf("vec_2[0]:\n");
    _mm256_storeu_ps(tmp_2, vec_2[0]); // segmentation fault
    for (int i = 0; i < VL; ++i)
        printf("%f ", tmp_2[i]);
    printf("\n");
    delete[] tmp_2; // array form, to match new float[VL]
    return 0;
}
I compiled it with g++ -O3 -g -std=c++11 -mavx2 test.cpp -o test. vec_1[0] is printed as expected (all zeros), but a segmentation fault occurs when the program reaches vec_2[0]. I suspected an alignment issue, but I am using _mm256_storeu_ps rather than _mm256_store_ps, and the former does not require an aligned address.
The machine is an Intel Haswell with the AVX2 extension. The GCC version is 4.8.5.
Any possible clue is welcome.
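One possibility worth checking: _mm256_storeu_ps tolerates an unaligned destination, but reading vec_2[0] is a separate load from the vector's own storage. Before C++17, std::allocator gets its memory from plain operator new, which is not guaranteed to honor the 32-byte alignment of __m256, and the compiler may still emit an aligned load for the element. The sketch below is only an illustration under that assumption, not a confirmed diagnosis. AlignedAllocator, Vec8, is_aligned and aligned_vector_demo are hypothetical names; Vec8 stands in for __m256 so the code builds without AVX flags, and posix_memalign assumes a POSIX system.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>   // posix_memalign, free (POSIX only; an assumption)
#include <new>       // std::bad_alloc
#include <vector>

// Hypothetical minimal C++11 allocator that returns 32-byte-aligned storage.
template <typename T>
struct AlignedAllocator {
    typedef T value_type;
    static const std::size_t kAlignment = 32;

    AlignedAllocator() {}
    template <typename U>
    AlignedAllocator(const AlignedAllocator<U>&) {}

    T* allocate(std::size_t n) {
        void* p = 0;
        // posix_memalign returns non-zero on failure.
        if (posix_memalign(&p, kAlignment, n * sizeof(T)) != 0)
            throw std::bad_alloc();
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) { free(p); }
};

// All instances are interchangeable because the allocator is stateless.
template <typename T, typename U>
bool operator==(const AlignedAllocator<T>&, const AlignedAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const AlignedAllocator<T>&, const AlignedAllocator<U>&) { return false; }

// Hypothetical stand-in for __m256: eight floats with 32-byte alignment.
struct alignas(32) Vec8 { float lanes[8]; };

// True if p is a multiple of `alignment` bytes.
bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Builds a 10-element vector and checks that every element is 32-byte aligned.
bool aligned_vector_demo() {
    std::vector<Vec8, AlignedAllocator<Vec8> > vec(10);
    for (std::size_t i = 0; i < vec.size(); ++i)
        if (!is_aligned(&vec[i], 32)) return false;
    return true;
}
```

With the real type, the declaration would read std::vector<__m256, AlignedAllocator<__m256> > vec_2(10);. Running is_aligned(&vec_2[0], 32) on the original std::vector<__m256> would show whether the default allocator happened to return misaligned storage on this setup.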
Source: https://stackoverflow.com/questions/39608172/using-stl-vector-with-simd-intrinsic-data-type