So, I had this code:
constexpr unsigned N = 1000;
void f1(char* sum, char* a, char* b) {
    for(int i = 0; i < N; ++i) {
        sum[i] = a[i] + b[i];
How asm interacts with optimisation is explained about halfway down the "Assembler Instructions with C Expression Operands" page in the GCC documentation.
GCC doesn't try to understand any of the actual assembly inside the asm; the only thing it knows about the content is what you (optionally) tell it in the output and input operand specification and the register clobber list.
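To make that concrete, here is a minimal sketch (the helper names `opaque_copy` and `compiler_barrier` are made up for illustration, they are not from your code). Both asm templates are empty, so the only thing the compiler can go on is the operand and clobber lists:

```cpp
// Hypothetical helper: the template is empty, but the "+r"(value) operand
// tells the compiler that `value` is both read and written by the asm.
// The compiler cannot see that nothing actually happens, so it has to
// assume `value` may have changed.
inline int opaque_copy(int value) {
    asm("" : "+r"(value));
    return value;
}

// Hypothetical helper: no operands at all, only a "memory" clobber.
// This tells the compiler the asm may read or write arbitrary memory.
// Having no output operands, it is also treated as volatile.
inline void compiler_barrier() {
    asm("" ::: "memory");
}
```

At run time `opaque_copy(x)` just returns `x`; what changes is what the optimiser is allowed to assume about it.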
In particular, note:
"An asm instruction without any output operands will be treated identically to a volatile asm instruction."
and
"The volatile keyword indicates that the instruction has important side-effects [...]"
So the presence of the asm inside your loop has inhibited vectorisation: GCC treats it as volatile, assumes it has side effects, and so won't transform the loop around it.
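A rough way to check this yourself is to compare two otherwise identical loops. This is only a sketch: the function names are made up, and the empty `asm("")` is a stand-in assumption for whatever asm statement your loop actually contains.

```cpp
constexpr unsigned N = 1000;

// Plain loop: nothing here stops GCC from considering vectorisation.
void add_plain(char* sum, const char* a, const char* b) {
    for (unsigned i = 0; i < N; ++i) {
        sum[i] = static_cast<char>(a[i] + b[i]);
    }
}

// Same loop plus an asm statement with no output operands.
// GCC treats such an asm as volatile, i.e. as having side effects.
// (Placeholder asm, assumed for illustration only.)
void add_with_asm(char* sum, const char* a, const char* b) {
    for (unsigned i = 0; i < N; ++i) {
        sum[i] = static_cast<char>(a[i] + b[i]);
        asm("");
    }
}
```

Both functions compute the same result. Compiling with something like `g++ -O3 -fopt-info-vec` should show which loops GCC reports as vectorised; I'd expect only the plain one to show up, though the exact report depends on your GCC version and flags.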