When compiling with gcc -O3, why is the following loop not auto-vectorized:
#define SIZE (65536)
int a[SIZE], b[SIZE], c[SIZE];
in
The GCC vectorizer is probably not smart enough to vectorize the first loop. The addition case is easier to vectorize because a + 0 == a, so elements that must stay unchanged can simply have zero added to them. Consider SIZE==4:
  0 1 2 3 i
0 X
1 X X
2 X X X
3 X X X X
j
X marks the combinations of i and j for which a[i] is assigned to or increased. For the addition case, we can compute b[i] > c[j] ? b[i] : c[j] for, say, j==1 and i==0..3, and put the results into a vector D. Then we only need to zero D[2..3] and add the resulting vector to a[0..3]. The assignment case is a little trickier: we must not only zero D[2..3] but also zero a[0..1], and only then combine the results. I guess this is where the vectorizer is failing.
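
To make the two cases concrete, here is a rough scalar sketch of what a masked vector step for one row j of the table might look like. It is only an illustration of the idea above, not what GCC actually generates; N, row_add_masked and row_assign_masked are hypothetical names, and the final combine step uses bitwise OR as one possible choice (addition would work as well, since one side of each lane is zero).

```c
#define N 4  /* assumed: matches the SIZE==4 illustration */

/* Addition case for one row j: lanes with i <= j are active (the X cells). */
static void row_add_masked(int *a, const int *b, const int *c, int j)
{
    int d[N];
    for (int i = 0; i < N; i++)
        d[i] = b[i] > c[j] ? b[i] : c[j];  /* compute every lane */
    for (int i = 0; i < N; i++)
        d[i] = (i <= j) ? d[i] : 0;        /* zero the inactive lanes of D */
    for (int i = 0; i < N; i++)
        a[i] += d[i];                      /* a + 0 == a leaves them intact */
}

/* Assignment case for one row j: a needs masking too before combining. */
static void row_assign_masked(int *a, const int *b, const int *c, int j)
{
    int d[N];
    for (int i = 0; i < N; i++)
        d[i] = b[i] > c[j] ? b[i] : c[j];  /* compute every lane */
    for (int i = 0; i < N; i++)
        d[i] = (i <= j) ? d[i] : 0;        /* zero the inactive lanes of D */
    for (int i = 0; i < N; i++)
        a[i] = (i <= j) ? 0 : a[i];        /* zero the active lanes of a */
    for (int i = 0; i < N; i++)
        a[i] |= d[i];                      /* combine: each lane has one zero side */
}
```

For j==1 the active lanes are i==0..1, so the addition version touches only D, while the assignment version needs the extra masking pass over a before the combine.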