Question
I have the following fragment in my inline assembly code:
"LOOP%=:\n\t"
"movapd (%%eax), %%xmm4\n\t"
"addl $32, %%eax\n\t"
"movsd (%%edx), %%xmm5\n\t"
"addl $16, %%edx\n\t"
"movapd %%xmm4, %%xmm6\n\t"
"subl $1, %%ecx\n\t"
"unpcklpd %%xmm5, %%xmm5\n\t"
"testl %%ecx, %%ecx\n\t"
"mulpd %%xmm5, %%xmm6\n\t"
"movsd -8(%%edx), %%xmm7\n\t"
"addpd %%xmm6, %%xmm0\n\t"
"movapd -16(%%eax), %%xmm6\n\t"
"unpcklpd %%xmm7, %%xmm7\n\t"
"mulpd %%xmm6, %%xmm5\n\t"
"addpd %%xmm5, %%xmm1\n\t"
"mulpd %%xmm7, %%xmm4\n\t"
"addpd %%xmm4, %%xmm2\n\t"
"mulpd %%xmm6, %%xmm7\n\t"
"addpd %%xmm7, %%xmm3\n\t"
"jne LOOP%=\n\t"
This code keeps a loop counter in %ecx while scanning two (double *) arrays A and B and performing some computation with SSE2. Both arrays are aligned to 64 bytes (a cache line, so the 16-byte alignment requirement of SSE is satisfied). %eax holds a pointer to array A and %edx holds a pointer to array B. It runs correctly and there is no memory read error. I am wondering why I have to write it this way:
"movapd (%%eax), %%xmm4\n\t"
"addl $32, %%eax\n\t"
"movsd (%%edx), %%xmm5\n\t"
"addl $16, %%edx\n\t"
......
"movsd -8(%%edx), %%xmm7\n\t"
......
"movapd -16(%%eax), %%xmm6\n\t"
......
So I changed the initial version to:
"LOOP%=:\n\t"
"movapd (%%eax), %%xmm4\n\t"
"movsd (%%edx), %%xmm5\n\t"
"movapd %%xmm4, %%xmm6\n\t"
"subl $1, %%ecx\n\t"
"unpcklpd %%xmm5, %%xmm5\n\t"
"testl %%ecx, %%ecx\n\t"
"mulpd %%xmm5, %%xmm6\n\t"
"movsd 8(%%edx), %%xmm7\n\t"
"addl $16, %%edx\n\t"
"addpd %%xmm6, %%xmm0\n\t"
"movapd 16(%%eax), %%xmm6\n\t"
"addl $32, %%eax\n\t"
"unpcklpd %%xmm7, %%xmm7\n\t"
"mulpd %%xmm6, %%xmm5\n\t"
"addpd %%xmm5, %%xmm1\n\t"
"mulpd %%xmm7, %%xmm4\n\t"
"addpd %%xmm4, %%xmm2\n\t"
"mulpd %%xmm6, %%xmm7\n\t"
"addpd %%xmm7, %%xmm3\n\t"
"jne LOOP%=\n\t"
But then I get a segmentation fault from an invalid read.
This seems odd to me. Why?
Answer 1:
This is the cause:
"testl %%ecx, %%ecx\n\t"
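The result of this test is what the jne at the very end of the loop branches on. The SSE instructions in between do not touch the flags, but addl does. By moving the addl instructions after the testl, you overwrite the flags it set, so jne now tests the result of addl $32, %eax instead of the loop counter. That result is practically never zero, so the branch is always taken and the loop keeps running until it reads past the end of mapped memory.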
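To make the flag dependency concrete, here is a minimal plain-C model of the two loops. It is only a sketch: it simulates just the instructions that write the zero flag (the SSE instructions are left out because they do not modify EFLAGS), and the names cpu_t, run_original, run_reordered and the max_iters safety cap are hypothetical, introduced here for illustration.

```c
#include <stdint.h>

/* Hypothetical model of the integer state the loop touches. */
typedef struct {
    uint32_t eax, edx, ecx; /* pointer to A, pointer to B, loop counter */
    int zf;                 /* zero flag, as read by jne */
} cpu_t;

/* addl, subl and testl each set ZF from their own result. */
static void addl(cpu_t *c, uint32_t imm, uint32_t *reg) { *reg += imm; c->zf = (*reg == 0); }
static void subl(cpu_t *c, uint32_t imm, uint32_t *reg) { *reg -= imm; c->zf = (*reg == 0); }
static void testl(cpu_t *c, uint32_t reg) { c->zf = (reg == 0); }

/* Original order: both addl come before subl/testl, so jne sees
   the flags produced from the loop counter. Returns iterations run. */
unsigned run_original(uint32_t n, unsigned max_iters) {
    cpu_t c = { 0x1000u, 0x2000u, n, 0 };
    unsigned iters = 0;
    do {
        addl(&c, 32, &c.eax);
        addl(&c, 16, &c.edx);
        subl(&c, 1, &c.ecx);
        testl(&c, c.ecx);
        iters++;
    } while (!c.zf && iters < max_iters); /* jne LOOP */
    return iters;
}

/* Reordered version: addl $32, %eax is the last flag writer before
   jne, so the branch depends on the pointer, not on the counter. */
unsigned run_reordered(uint32_t n, unsigned max_iters) {
    cpu_t c = { 0x1000u, 0x2000u, n, 0 };
    unsigned iters = 0;
    do {
        subl(&c, 1, &c.ecx);
        testl(&c, c.ecx);
        addl(&c, 16, &c.edx);
        addl(&c, 32, &c.eax);
        iters++;
    } while (!c.zf && iters < max_iters); /* jne LOOP */
    return iters;
}
```

With a counter of 4 and a cap of 1000, run_original stops after 4 iterations, while run_reordered only stops because of the cap. In the real loop there is no cap, so it walks off the end of the arrays. If you want to keep the positive offsets, one option is to advance the pointers with leal, which does not modify the flags; another is to move the subl so that it is the last flag-writing instruction before the jne. Note also that the testl is redundant, since subl already sets ZF from its result.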
Source: https://stackoverflow.com/questions/35207847/inline-assembly-in-c-funny-memory-segmentation-fault