Why is this inline assembly not working with a separate asm volatile statement for each instruction?

没有蜡笔的小新 2020-12-19 02:12

For the following code:

long buf[64];

register long rrax asm ("rax");
register long rrbx asm ("rbx");
register long rrsi asm ("rsi");

rrax = 0x34         


        
3 answers
  •  攒了一身酷
    2020-12-19 02:24

    You clobber memory but don't tell GCC about it, so GCC can cache values in buf across assembly calls. If you want to use inputs and outputs, tell GCC about everything.

    __asm__ (
        "movq %1, 0(%0)\n\t"
        "movq %2, 8(%0)"
        :                                /* Outputs (none) */
        : "r"(buf), "r"(rrax), "r"(rrbx) /* Inputs */
        : "memory");                     /* Clobbered */
    

    You also generally want to let GCC handle most of the mov instructions, register selection, and so on. Even if you explicitly constrain the registers (rrax is still %rax), let the information flow through GCC or you will get unexpected results.
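    As a sketch of what that can look like, the version below drops the fixed registers entirely and describes each store as a memory output operand ("=m"), so no "memory" clobber is needed. It assumes x86-64 with GCC or Clang in AT&T syntax; store_pair and check_store_pair are hypothetical names, not taken from the question.

```c
#include <assert.h>

long buf[64];

/* Hypothetical helper: store two values into buf[0] and buf[1].
   "=m" tells the compiler exactly which memory is written; "r" lets it
   pick any general-purpose register for each input. */
void store_pair(long first, long second)
{
    __asm__ (
        "movq %2, %0\n\t"
        "movq %3, %1"
        : "=m"(buf[0]), "=m"(buf[1])   /* Outputs */
        : "r"(first), "r"(second));    /* Inputs */
}

/* Hypothetical self-check: ordinary C reads must see both stores. */
void check_store_pair(void)
{
    store_pair(0x34, 0x39);
    assert(buf[0] == 0x34);
    assert(buf[1] == 0x39);
}
```

    Because the outputs are declared, the compiler knows buf[0] and buf[1] change and will not reuse stale cached values for them, without having to assume that all of memory was touched.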

    __volatile__ is wrong.

    The reason __volatile__ exists is so you can guarantee that the compiler places your code exactly where it is... which is a completely unnecessary guarantee for this code. It's necessary for implementing advanced features such as memory barriers, but almost completely worthless if you are only modifying memory and registers.

    GCC already knows that it can't move this assembly after printf because the printf call accesses buf, and buf could be clobbered by the assembly. GCC already knows that it can't move the assembly before rrax = 0x34; because rrax is an input to the assembly code. So what does __volatile__ get you? Nothing.

    If your code does not work without __volatile__ then there is an error in the code which should be fixed instead of just adding __volatile__ and hoping that makes everything better. The __volatile__ keyword is not magic and should not be treated as such.

    Alternative fix:

    Is __volatile__ necessary for your original code? No. Just mark the inputs and clobber values correctly.

    /* The "S" constraint means %rsi, "b" means %rbx, and "a" means %rax
       The inputs and clobbered values are specified.  There is no output
       so that section is blank.  */
    rrsi = (long) buf;
    __asm__ ("movq %%rax, 0(%%rsi)" : : "a"(rrax), "S"(rrsi) : "memory");
    __asm__ ("movq %%rbx, 8(%%rsi)" : : "b"(rrbx), "S"(rrsi) : "memory");
    

    Why __volatile__ doesn't help you here:

    rrax = 0x34; /* Dead code */
    

    GCC is well within its rights to completely delete the above line, since the code in the question above claims that it never uses rrax.

    A clearer example

    long global;
    void store_5(void)
    {
        register long rax asm ("rax");
        rax = 5;
        /* Basic asm (no operand list): a single % names the register. */
        __asm__ __volatile__ ("movq %rax, (global)");
    }
    

    The disassembly is more or less as you expect it at -O0,

    movl $5, %eax
    movq %rax, (global)
    

    But with optimization off, you can be fairly sloppy about assembly. Let's try -O2:

    movq %rax, (global)
    

    Whoops! Where did rax = 5; go? It's dead code, since %rax is never used in the function — at least as far as GCC knows. GCC doesn't peek inside assembly. What happens when we remove __volatile__?

    ; empty
    

    Well, you might think __volatile__ is doing you a service by keeping GCC from discarding your precious assembly, but it's just masking the fact that GCC thinks your assembly isn't doing anything. GCC thinks your assembly takes no inputs, produces no outputs, and clobbers no memory. You had better straighten it out:

    long global;
    void store_5(void)
    {
        register long rax asm ("rax");
        rax = 5;
        __asm__ __volatile__ ("movq %%rax, (global)" : : : "memory");
    }
    

    Now we get the following output:

    movq %rax, (global)
    

    Better. But if you tell GCC about the inputs, it will make sure that %rax is properly initialized first:

    long global;
    void store_5(void)
    {
        register long rax asm ("rax");
        rax = 5;
        __asm__ ("movq %%rax, (global)" : : "a"(rax) : "memory");
    }
    

    The output, with optimizations:

    movl $5, %eax
    movq %rax, (global)
    

    Correct! And we don't even need to use __volatile__.

    Why does __volatile__ exist?

    The primary correct use for __volatile__ is if your assembly code does something else besides input, output, or clobbering memory. Perhaps it messes with special registers which GCC doesn't know about, or affects IO. You see it a lot in the Linux kernel, but it's misused very often in user space.
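    One case where __volatile__ is warranted, sketched under the assumption of an x86-64 target and GCC or Clang: reading the time stamp counter with rdtsc. The instruction has no inputs, so without __volatile__ the compiler could assume that two calls return the same value and merge them. read_tsc and check_read_tsc are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read the x86 time stamp counter.
   rdtsc leaves the low 32 bits in %eax and the high 32 bits in %edx,
   which the "=a" and "=d" output constraints capture. */
uint64_t read_tsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical self-check: two back-to-back reads are expected to
   differ, since the counter keeps ticking between them. */
void check_read_tsc(void)
{
    uint64_t first = read_tsc();
    uint64_t second = read_tsc();
    assert(second != first);
}
```

    Here the outputs are still declared normally; __volatile__ only adds the information that the result changes for reasons the operand list cannot express.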

    The __volatile__ keyword is very tempting because we C programmers often like to think we're almost programming in assembly language already. We're not. C compilers do a lot of data flow analysis — so you need to explain the data flow to the compiler for your assembly code. That way, the compiler can safely manipulate your chunk of assembly just like it manipulates the assembly that it generates.
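    To illustrate what describing the data flow buys you, here is a minimal sketch (x86-64 with GCC or Clang assumed; add_asm and check_add_asm are hypothetical names). Every effect of the instruction is declared: one read-write register, one input register, and the flags clobber. With that information and no __volatile__, the compiler is free to inline the statement, choose the registers, or drop it when the result is unused.

```c
#include <assert.h>

/* Hypothetical helper: add two longs with a single addq. */
long add_asm(long a, long b)
{
    /* "+r"(a): a is read and written, in a register the compiler picks.
       "r"(b):  b is a read-only register input.
       "cc":    addq modifies the flags register. */
    __asm__ ("addq %1, %0" : "+r"(a) : "r"(b) : "cc");
    return a;
}

/* Hypothetical self-check of the helper above. */
void check_add_asm(void)
{
    assert(add_asm(2, 3) == 5);
    assert(add_asm(-7, 7) == 0);
}
```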

    If you find yourself using __volatile__ a lot, as an alternative you could write an entire function or module in an assembly file.
