Question
Consider the following inline-assembly loop:
#include <cstdint>
#include <iostream>
#define ADD_LOOP(i, n, v) \
asm volatile ( \
"movw %1, %%cx ;" \
"movq %2, %%rax ;" \
"movq $0, %%rbx ;" \
"for: ;" \
"addq %%rax, %%rbx ;" \
"decw %%cx ;" \
"jnz for ;" \
"movq %%rbx, %0 ;" \
: "=r"(v) \
: "r"(i), "r"(n) \
: "%cx", "%rax", "%rbx" \
);
int main() {
uint16_t iter(10000);
uint64_t num(5);
uint64_t val;
ADD_LOOP(iter, num, val)
std::cout << val << std::endl;
return 0;
}
Is it possible to call a C function (or its machine-code output) from within a loop like the one above?
For example:
#include <wmmintrin.h>
int main() {
__m128i x, y;
for(int i = 0; i < 10; i++) {
x = __builtin_ia32_aesenc128(x, y);
}
return 0;
}
Thanks
Answer 1:
No. Builtin functions aren't real functions that you can reach with a call instruction. They are always inlined when used in C / C++.
For example, suppose you wanted what int __builtin_popcount (unsigned int x) gives you in C: either a popcnt instruction for targets with -mpopcnt, or a byte-wise lookup table for targets that don't support the popcnt instruction. From asm you are out of luck: you have to #ifdef it yourself and use popcnt or an alternative sequence of instructions.
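As a rough sketch of that manual #ifdef, assuming GCC or Clang (which define __POPCNT__ when -mpopcnt is in effect): the function name popcount32 is hypothetical, and the fallback here is a simple bit-clearing loop rather than the byte-wise lookup table mentioned above.

```cpp
#include <cstdint>

// Hypothetical helper: count set bits, choosing the implementation at compile time.
inline unsigned popcount32(std::uint32_t x) {
#if defined(__POPCNT__) && (defined(__x86_64__) || defined(__i386__))
    // Target supports popcnt: emit the instruction directly.
    std::uint32_t result;
    asm("popcnt %1, %0" : "=r"(result) : "r"(x) : "cc");
    return result;
#else
    // Fallback (an assumption, not what the builtin expands to):
    // clear the lowest set bit until nothing is left.
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;
        ++count;
    }
    return count;
#endif
}
```

In ordinary C++ code __builtin_popcount makes this choice for you; the sketch only shows what you take on when you leave the compiler out of it.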
The function you're talking about, __builtin_ia32_aesenc128, is just a wrapper for the aesenc assembly instruction, which you can use directly if you're writing in asm.
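A minimal sketch of using aesenc directly inside an inline-asm loop, assuming GCC or Clang on x86-64 and a CPU with AES-NI. The function name aesenc_rounds and its parameters are hypothetical, and rounds must be nonzero because the counter is decremented before it is tested.

```cpp
#include <emmintrin.h>  // __m128i (SSE2 is baseline on x86-64)
#include <cstdint>

// Hypothetical helper: apply `rounds` AES encryption rounds with one round key.
// Precondition (assumption of this sketch): rounds != 0.
__m128i aesenc_rounds(__m128i state, __m128i round_key, std::uint32_t rounds) {
    asm("1:\n\t"
        "aesenc %[key], %[st]\n\t"  // st = one AES round of st, using key
        "dec %[n]\n\t"              // count down in a full 32-bit register
        "jnz 1b"                    // loop while the counter is nonzero
        // Early-clobber (&) keeps the compiler from sharing a register
        // between the read-write operands and the key input.
        : [st] "+&x"(state), [n] "+&r"(rounds)
        : [key] "x"(round_key)
        : "cc");
    return state;
}
```

The compiler picks the xmm and integer registers itself, so there are no mov instructions at the start or end of the asm statement.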
If you're writing asm instead of using C++ intrinsics (like #include <immintrin.h>) for performance, you need to have a look at http://agner.org/optimize/ to write more efficient asm (e.g. use %ecx as a loop counter, not %cx; you're gaining nothing from using a 16-bit partial register).
You could also write more efficient inline-asm constraints, e.g. the movq %%rbx, %0 is a waste of an instruction. You could have used %0 the whole time instead of an explicit %rbx. If your inline asm starts or ends with a mov instruction to copy to/from an output/input operand, usually you're doing it wrong. Let the compiler allocate registers for you. See the inline-assembly tag wiki.
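One way the loop from the question could look with the compiler allocating every register, as a sketch assuming GCC or Clang on x86-64. The function name add_loop is hypothetical, and iterations must be nonzero, as in the original loop.

```cpp
#include <cstdint>

// Hypothetical rewrite of ADD_LOOP: returns n added to itself `iterations` times.
// Precondition (assumption of this sketch): iterations != 0.
std::uint64_t add_loop(std::uint32_t iterations, std::uint64_t n) {
    std::uint64_t sum = 0;
    asm("1:\n\t"
        "add %[n], %[sum]\n\t"  // accumulate directly in the output operand
        "dec %[cnt]\n\t"        // 32-bit counter, no partial-register use
        "jnz 1b"
        // "+&r": read-write, and never shares a register with the n input.
        : [sum] "+&r"(sum), [cnt] "+&r"(iterations)
        : [n] "r"(n)
        : "cc");
    return sum;
}
```

With these constraints, add_loop(10000, 5) returns 50000, and no fixed registers appear in the clobber list because the asm never names one.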
Or better, see https://gcc.gnu.org/wiki/DontUseInlineAsm. Code with intrinsics typically compiles well for x86. See Intel's intrinsics guide: #include <immintrin.h> and use __m128i _mm_aesenc_si128 (__m128i a, __m128i RoundKey). (In gcc that's just a wrapper for __builtin_ia32_aesenc128, but it makes your code portable to other x86 compilers.)
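For comparison, a sketch of the question's second loop written with the portable intrinsic. It assumes GCC or Clang, where the target attribute enables AES-NI for a single function so that -maes is not needed on the command line; with other compilers you would drop the attribute and enable AES-NI however that compiler requires. The function name is hypothetical.

```cpp
#include <immintrin.h>

// Hypothetical helper: apply `rounds` AES encryption rounds with one round key.
__attribute__((target("aes")))
__m128i aesenc_rounds_intrin(__m128i state, __m128i round_key, int rounds) {
    for (int i = 0; i < rounds; ++i) {
        // Typically compiles to a single aesenc instruction per iteration.
        state = _mm_aesenc_si128(state, round_key);
    }
    return state;
}
```

Here the compiler handles register allocation and the loop itself, and it is free to unroll or schedule the rounds as it sees fit.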
Answer 2:
The answer to your question can be split into two parts.
It is definitely possible to call a C function from assembly. To do so, you need to follow a calling convention (described in the platform's ABI documents), which specifies how to pass arguments and receive return values. Remember that you have registers, the stack, and memory to move data around.
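A sketch of what following the calling convention can look like from inline asm, assuming the x86-64 System V ABI (Linux, macOS) and GCC or Clang. The callee add_five and the wrapper call_from_asm are hypothetical names. Under this ABI the first integer argument goes in rdi, the result comes back in rax, the stack pointer must be 16-byte aligned at the call, and the caller-saved registers may be overwritten.

```cpp
#include <cstdint>

// Hypothetical callee, for illustration only.
extern "C" std::uint64_t add_five(std::uint64_t x) { return x + 5; }

// Hypothetical wrapper: calls `add_five` from inline asm (System V x86-64 only).
std::uint64_t call_from_asm(std::uint64_t arg) {
    std::uint64_t result;
    std::uint64_t (*fn)(std::uint64_t) = add_five;
    asm volatile(
        "mov %%rsp, %%rbx\n\t"  // rbx is callee-saved, so it survives the call
        "sub $128, %%rsp\n\t"   // step over the red zone the compiler may be using
        "and $-16, %%rsp\n\t"   // ABI: 16-byte stack alignment at the call
        "call *%[fn]\n\t"       // argument in rdi, return value in rax
        "mov %%rbx, %%rsp\n\t"  // restore the original stack pointer
        : "=a"(result), "+D"(arg)
        : [fn] "r"(fn)
        // Everything the callee is allowed to overwrite under this ABI.
        : "rbx", "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11",
          "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7",
          "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15",
          "memory", "cc");
    return result;
}
```

The long clobber list is the price of hiding a call from the compiler, which is one more reason to prefer a plain C++ call or an intrinsic when either will do.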
Intrinsics, however, are not functions, even though they look like C functions. You can think of C as a somewhat high-level assembly that works on a wide variety of architectures. In some cases you want to take advantage of your specific architecture's instruction set, so the compiler gives you a way to do that through intrinsics. Each intrinsic maps to some architecture-specific assembly instructions. So at the end of the day you do not need to call them from assembly; you need to find the instruction itself. For instance, I expect __builtin_ia32_aesenc128 to be replaced with the AESENC instruction.
Source: https://stackoverflow.com/questions/47927158/is-it-possible-to-call-a-built-in-function-from-assembly-in-c