Automatic vectorization of matrix multiplication
Question

I'm fairly new to SIMD and wanted to see whether I could get GCC to vectorise a simple operation for me. I looked at this post and wanted to do more or less the same thing (but with GCC 5.4.0 on 64-bit Linux, for a Kaby Lake processor). I essentially have this function:

    /* m1 = N x M matrix, m2 = M x P matrix, m3 = N x P matrix & output */
    void mmul(double **m1, double **m2, double **m3, int N, int M, int P)
    {
        for (i = 0; i < N; i++)
            for (j = 0; j < P; j++)
            {
                double tmp = 0.0;
                for (k = 0; k