inline-assembly | 易学教程

How to set gcc to use intel syntax permanently?

Question: I have the following code, which compiles fine with the command gcc ./example.c . The program calls the function "add_two", which simply adds two integers. To use Intel syntax within the extended assembly instructions, I first need to switch to Intel syntax and then back to AT&T. According to the gcc documentation, it is possible to switch to Intel syntax entirely by using gcc -masm=intel ./example.c . Whenever I try to compile it with the switch -masm=intel it won't compile and I don't
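The excerpt is cut off before the asker's code, so the following is only a minimal sketch and not the original program. The name `add_two` comes from the question; its body here is an assumption. A common cause of this kind of failure is that an asm block ending in `.att_syntax` switches the assembler back to AT&T while, under `-masm=intel`, the compiler keeps emitting Intel syntax for the rest of the file. One way around this is GCC's dialect alternatives `{att|intel}`, which let a single template assemble with or without `-masm=intel`:

```c
#if defined(__x86_64__) || defined(__i386__)
/* Adds two ints with a single x86 `add`.
   The template uses GCC dialect alternatives: the part before `|` is
   emitted in AT&T mode (the default), the part after it under -masm=intel.
   No .intel_syntax/.att_syntax directives are needed. */
int add_two(int a, int b)
{
    __asm__ ("add{l %1, %0| %0, %1}"
             : "+r" (a)      /* a is read and written */
             : "r" (b)       /* b is read only */
             : "cc");        /* add modifies the flags */
    return a;
}
#else
/* Assumed fallback for non-x86 targets so the sketch still builds. */
int add_two(int a, int b)
{
    return a + b;
}
#endif
```

With this form, both `gcc ./example.c` and `gcc -masm=intel ./example.c` should accept the same source, since the compiler picks the matching half of the template.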

clang (LLVM) inline assembly - multiple constraints with useless spills / reloads

Question: clang / gcc: Some inline assembly operands can be satisfied with multiple constraints, e.g., "rm", when an operand can be satisfied with a register or memory location. As an example, the 64 x 64 = 128 bit multiply: __asm__ ("mulq %q3" : "=a" (rl), "=d" (rh) : "%0" (x), "rm" (y) : "cc") The generated code appears to choose a memory constraint for argument 3, which would be fine if we were register starved, to avoid a spill. Obviously there's less register pressure on x86-64 than on IA32.
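As a hedged illustration of the multiply in the question, the sketch below narrows operand 3 from "rm" to "r", which removes the memory alternative the asker saw chosen, at the cost of forcing a register even under pressure. The function names `mul_64x64` and `mul_64x64_portable` are hypothetical, and the portable version exists only as a cross-check and as a fallback for non-x86-64 targets.

```c
#include <stdint.h>

/* Portable 64 x 64 -> 128 multiply built from 32-bit halves. */
static inline void mul_64x64_portable(uint64_t x, uint64_t y,
                                      uint64_t *rh, uint64_t *rl)
{
    uint64_t x0 = (uint32_t)x, x1 = x >> 32;
    uint64_t y0 = (uint32_t)y, y1 = y >> 32;
    uint64_t p00 = x0 * y0, p01 = x0 * y1, p10 = x1 * y0, p11 = x1 * y1;
    /* Sum of the three terms that land in bits 32..63; cannot overflow. */
    uint64_t mid = (p00 >> 32) + (uint32_t)p01 + (uint32_t)p10;
    *rl = (mid << 32) | (uint32_t)p00;
    *rh = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
}

#if defined(__x86_64__)
/* Same multiply via `mulq`: RAX * operand -> RDX:RAX.
   Operand 3 uses "r" only, so the compiler must pass y in a register. */
static inline void mul_64x64(uint64_t x, uint64_t y,
                             uint64_t *rh, uint64_t *rl)
{
    uint64_t hi, lo;
    __asm__ ("mulq %3"
             : "=a" (lo), "=d" (hi)
             : "%0" (x), "r" (y)
             : "cc");
    *rh = hi;
    *rl = lo;
}
#else
/* Assumed fallback: reuse the portable version on other targets. */
static inline void mul_64x64(uint64_t x, uint64_t y,
                             uint64_t *rh, uint64_t *rl)
{
    mul_64x64_portable(x, y, rh, rl);
}
#endif
```

Whether "r" actually produces better code than "rm" depends on the compiler version and the surrounding code, so it is worth comparing the generated assembly for both.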

How do you detect the CPU architecture type during run-time with GCC and inline asm?

Question: I need to find the architecture type of a CPU. I do not have access to /proc/cpuinfo, as the machine is running syslinux. I know there is a way to do it with inline ASM, however I believe my syntax is incorrect, as my variable iedx is not being set properly. I'm drudging along with ASM, and am by no means an expert. If anyone has any tips or can point me in the right direction, I would be much obliged. static int is64Bit(void) { int iedx = 0; asm("mov %eax, 0x80000001"); asm("cpuid"); asm("mov %0
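The asker's function is truncated, so this is a sketch of one way to do the same check rather than a repair of the original. Two things in the excerpt stand out: in AT&T syntax an immediate needs a `$` and the destination comes last, and separate `asm` statements give the compiler no guarantee that registers survive from one to the next. Using a single statement with register constraints avoids both. The names `cpuid_leaf` and `cpu_has_long_mode` are hypothetical; the check reads bit 29 (long mode) of EDX from CPUID leaf 0x80000001, and it assumes the CPU supports the CPUID instruction at all.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
/* Runs CPUID for one leaf (subleaf 0) and returns all four registers.
   The constraints tie each C variable to the register CPUID uses. */
static void cpuid_leaf(uint32_t leaf, uint32_t *eax, uint32_t *ebx,
                       uint32_t *ecx, uint32_t *edx)
{
    __asm__ volatile ("cpuid"
                      : "=a" (*eax), "=b" (*ebx), "=c" (*ecx), "=d" (*edx)
                      : "a" (leaf), "c" (0));
}

/* Returns 1 if the CPU reports 64-bit long mode, 0 if it does not. */
int cpu_has_long_mode(void)
{
    uint32_t a, b, c, d;

    /* Leaf 0x80000000 reports the highest extended leaf in EAX. */
    cpuid_leaf(0x80000000u, &a, &b, &c, &d);
    if (a < 0x80000001u)
        return 0;

    cpuid_leaf(0x80000001u, &a, &b, &c, &d);
    return (int)((d >> 29) & 1u);   /* EDX bit 29 = long mode (LM) */
}
#else
/* Assumed fallback: -1 means "not an x86 CPU, CPUID unavailable". */
int cpu_has_long_mode(void)
{
    return -1;
}
#endif
```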

How can I indicate that the memory pointed to by an inline ASM argument may be used?

Question: Consider the following small function: void foo(int* iptr) { iptr[10] = 1; __asm__ volatile ("nop"::"r"(iptr):); iptr[10] = 2; } Using gcc, this compiles to: foo: nop mov DWORD PTR [rdi+40], 2 ret Note in particular that the first write to iptr, iptr[10] = 1, doesn't occur at all: the inline asm nop is the first thing in the function, and only the final write of 2 appears (after the ASM call). Apparently the compiler decides that it only needs to provide an up-to-date version of the value of
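Two commonly used ways to tell the compiler that the asm may access the pointed-to memory are sketched below, reusing `foo` from the question. The second function name, `foo_m`, is hypothetical, and it assumes the asm only needs to see `iptr[10]`. With either form the store of 1 should be kept, because the compiler can no longer prove it is dead before the asm runs.

```c
/* Option 1: a "memory" clobber. The asm is treated as possibly reading
   or writing any memory, so pending stores are emitted before it. */
void foo(int *iptr)
{
    iptr[10] = 1;
    __asm__ volatile ("nop" : : "r" (iptr) : "memory");
    iptr[10] = 2;
}

/* Option 2: a narrower "m" input naming the object the asm may read.
   Only iptr[10] is declared as used, so unrelated memory can still be
   optimised freely around the asm. */
void foo_m(int *iptr)
{
    iptr[10] = 1;
    __asm__ volatile ("nop" : : "r" (iptr), "m" (iptr[10]));
    iptr[10] = 2;
}
```

The "memory" clobber is the simpler choice; the "m" operand is worth considering when the clobber costs too much in surrounding code.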

Working inline assembly in C for bit parity?

Question: I'm trying to compute the bit parity of a large number of uint64's. By bit parity I mean a function that accepts a uint64 and outputs 0 if the number of set bits is even, and 1 otherwise. Currently I'm using the following function (by @Troyseph, found here): uint parity64(uint64 n){ n ^= n >> 1; n ^= n >> 2; n = (n & 0x1111111111111111) * 0x1111111111111111; return (n >> 60) & 1; } The same SO page has the following assembly routine (by @papadp): .code ; bool CheckParity(size_t Result)
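The quoted assembly routine is cut off, so the inline-assembly version below is a separate sketch and not a port of it. It first restates `parity64` from the question with standard `<stdint.h>` types, then adds a hypothetical `parity64_asm` that folds the value down to one byte and reads the x86 parity flag, which only reflects the low 8 bits of a result. The asm template assumes GCC or Clang in the default AT&T syntax.

```c
#include <stdint.h>

/* parity64 from the question: returns 1 if the number of set bits is odd.
   After the two xor-shifts, bit 0 of every nibble holds that nibble's
   parity; the multiply sums those 16 bits into the top nibble. */
unsigned parity64(uint64_t n)
{
    n ^= n >> 1;
    n ^= n >> 2;
    n = (n & 0x1111111111111111ull) * 0x1111111111111111ull;
    return (unsigned)((n >> 60) & 1);
}

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
/* Hypothetical variant using the x86 parity flag (PF).
   PF is set when the low byte of a result has an even number of set
   bits, so `setnp` yields 1 for odd parity. */
unsigned parity64_asm(uint64_t n)
{
    uint8_t low, odd;

    n ^= n >> 32;           /* fold 64 bits down to 8, preserving parity */
    n ^= n >> 16;
    n ^= n >> 8;
    low = (uint8_t)n;

    __asm__ ("testb %1, %1\n\t"
             "setnp %0"
             : "=q" (odd)
             : "q" (low)
             : "cc");
    return odd;
}
#else
/* Assumed fallback for other targets or compilers. */
unsigned parity64_asm(uint64_t n)
{
    return parity64(n);
}
#endif
```

On GCC and Clang, `__builtin_parityll` is another option to benchmark against before committing to hand-written assembly.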

What does the declaration "extern struct cpu *cpu asm("%gs:0");" mean?

Question: While reading the xv6 source code, I got confused by the syntax of the declaration below. Can anyone explain it to me? extern struct cpu *cpu asm("%gs:0"); Answer 1: I assume you understand what extern struct cpu *cpu means. The question you have is: What does the asm("%gs:0") part mean? This code is using a gcc extension called asm labels to say that the variable cpu is defined by the assembler string %gs:0 . This is NOT how this extension is intended to be used and is considered a hack.
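For contrast with the xv6 hack, here is a minimal sketch of the conventional use of an asm label: giving a C variable a different symbol name at the assembler and linker level. The names `hits`, `record_hit`, and the symbol `app_hit_counter` are all hypothetical. The xv6 declaration works on the same substitution, but supplies an operand expression (`%gs:0`) where a plain symbol name is expected.

```c
/* In C the variable is called `hits`; in the generated assembly and in
   the object file's symbol table it appears as `app_hit_counter`. */
int hits __asm__("app_hit_counter") = 0;

/* Ordinary C code keeps using the C name. */
int record_hit(void)
{
    return ++hits;
}
```

Inspecting the object file with a tool such as `nm` should show `app_hit_counter` rather than `hits`.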

x86 convert to lower case assembly

Question: This program converts the string behind a char pointer to lower case. I'm using Visual Studio 2010. This is from another question, but much simpler to read and more direct to the point. int b_search (char* token) { __asm { mov eax, 0 ; zero out the result mov edi, [token] ; move the token to search for into EDI MOV ecx, 0 LOWERCASE_TOKEN: ;lowercase the token OR [edi], 20h INC ecx CMP [edi+ecx],0 JNZ LOWERCASE_TOKEN MOV ecx, 0 At my OR instruction, where I'm trying to change the register that contains
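The question is truncated, so the following is a sketch in plain C of what the loop appears intended to do, for comparison with the assembly. In the excerpt, `OR [edi], 20h` and `CMP [edi+ecx],0` carry no operand size (MASM normally needs `byte ptr` there), and the `OR` always addresses `[edi]` while only the `CMP` uses `[edi+ecx]`, so only the first byte would be modified. The function name `lowercase_token` is hypothetical, and restricting the OR to 'A'..'Z' is an addition of this sketch: a bare OR with 20h would also alter non-letter characters such as '@'.

```c
#include <stddef.h>

/* Lowercases an ASCII string in place.
   Setting bit 5 (0x20) turns 'A'..'Z' into 'a'..'z'; the range check
   keeps digits, punctuation, and already-lowercase letters unchanged. */
void lowercase_token(char *token)
{
    for (size_t i = 0; token[i] != '\0'; i++) {
        if (token[i] >= 'A' && token[i] <= 'Z')
            token[i] |= 0x20;
    }
}
```

Each iteration here corresponds to one pass through `LOWERCASE_TOKEN` with the index `i` playing the role of ECX.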

Is Intel's timestamp reading asm code example using two more registers than are necessary?

Question: I'm looking into measuring benchmark performance using the time-stamp register (TSR) found in x86 CPUs. It's a useful register, since it measures in a monotonic unit of time which is immune to the clock speed changing. Very cool. Here is an Intel document showing asm snippets for reliably benchmarking using the TSR, including using cpuid for pipeline synchronisation. See page 16: http://www.intel.com/content/www/us/en/embedded/training/ia-32-ia-64-benchmark-code-execution-paper.html To read
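The Intel snippet the question refers to is not included in the excerpt, so the sketch below is not that code. It shows one way to read the counter without extra register copies: binding the outputs directly to EAX and EDX with the "a" and "d" constraints, so no `mov` into other registers is needed. The names `read_tsc` and `tsc_combine` are hypothetical, and the sketch deliberately omits the `cpuid` serialisation the Intel paper discusses.

```c
#include <stdint.h>

/* Joins the two 32-bit halves that RDTSC returns into one 64-bit value. */
static inline uint64_t tsc_combine(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

#if defined(__x86_64__) || defined(__i386__)
/* RDTSC writes the low half to EAX and the high half to EDX.
   Declaring those registers as outputs lets the compiler use the
   values in place instead of copying them elsewhere first. */
uint64_t read_tsc(void)
{
    uint32_t hi, lo;
    __asm__ volatile ("rdtsc" : "=a" (lo), "=d" (hi));
    return tsc_combine(hi, lo);
}
#else
/* Assumed fallback: no time-stamp counter on this target. */
uint64_t read_tsc(void)
{
    return 0;
}
#endif
```

A benchmark would call `read_tsc()` before and after the measured code and subtract, ideally with the serialising instructions from the Intel paper added around each read.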

Visual C++ x64 add with carry

Question: Since there doesn't seem to be an intrinsic for ADC and I can't use inline assembler for x64 architecture with Visual C++, what should I do if I want to write a function using add with carry but include it in a C++ namespace? (Emulating with comparison operators is not an option. This 256 megabit add is performance critical.) Answer 1: There is now an intrinsic for ADC in MSVC: _addcarry_u64 . The following code #include <inttypes.h> #include <intrin.h> #include <stdio.h> typedef struct { uint64
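The answer's code is truncated, so the following is an independent sketch of chaining `_addcarry_u64` across limbs, not a completion of it. The type `u256`, the function `add_u256`, and the macro `HAVE_ADDCARRY_U64` are hypothetical names. The sketch includes `<immintrin.h>`, which is where GCC and Clang expose the intrinsic (the answer uses `<intrin.h>` for MSVC), and it assumes `unsigned long long` is 64 bits wide. A comparison-based fallback is included only so the code builds on non-x86-64 targets.

```c
#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>
#define HAVE_ADDCARRY_U64 1
#endif

/* Hypothetical 256-bit integer: four 64-bit limbs, least significant first. */
typedef struct {
    unsigned long long limb[4];
} u256;

/* Computes *sum = *a + *b and returns the carry out of the top limb. */
unsigned char add_u256(u256 *sum, const u256 *a, const u256 *b)
{
    unsigned char carry = 0;

    for (int i = 0; i < 4; i++) {
#ifdef HAVE_ADDCARRY_U64
        /* Adds the two limbs plus the incoming carry, stores the result,
           and returns the outgoing carry. */
        carry = _addcarry_u64(carry, a->limb[i], b->limb[i], &sum->limb[i]);
#else
        /* Assumed fallback: detect wrap-around with comparisons. */
        unsigned long long t = a->limb[i] + carry;
        unsigned char c1 = (unsigned char)(t < carry);
        unsigned long long r = t + b->limb[i];
        carry = (unsigned char)(c1 | (r < t));
        sum->limb[i] = r;
#endif
    }
    return carry;
}
```

Compilers typically turn the intrinsic chain into an `add`/`adc` sequence, but checking the generated assembly is the only way to confirm that for a given build.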

Direct C function call using GCC's inline assembly

Question: If you want to call a C/C++ function from inline assembly, you can do something like this: void callee() {} void caller() { asm("call *%0" : : "r"(callee)); } GCC will then emit code which looks like this: movl $callee, %eax call *%eax This can be problematic since the indirect call will destroy the pipeline on older CPUs. Since the address of callee is eventually a constant, one can imagine that it would be possible to use the i constraint. Quoting from the GCC online docs: `i' An immediate