What do the gcc assembly output labels signify?

情到浓时终转凉″ submitted on 2020-07-06 12:22:34

Question


I've written a simple C program test.c:

#include <stdio.h>
#include <stdlib.h>
int add(int a, int b);
int main()
{
    int i=5,j=10;
    int result;
    result = add(i, j);
    printf("result is %d\n", result);
}
int add(int a, int b)
{
    return (a + b);
}

and I compiled it:

gcc -S -Os -o test.s test.c 

and I get the assembly file test.s:

        .file   "test3.c"
    .section    .rodata
.LC0:
    .string "result is %d\n"
    .text
.globl main
    .type   main, @function
main:
.LFB5:
    pushq   %rbp
.LCFI0:
    movq    %rsp, %rbp
.LCFI1:
    subq    $16, %rsp
.LCFI2:
    movl    $5, -12(%rbp)
    movl    $10, -8(%rbp)
    movl    -8(%rbp), %esi
    movl    -12(%rbp), %edi
    call    add
    movl    %eax, -4(%rbp)
    movl    -4(%rbp), %esi
    movl    $.LC0, %edi
    movl    $0, %eax
    call    printf
    leave
    ret
.LFE5:
    .size   main, .-main
.globl add
    .type   add, @function
add:
.LFB6:
    pushq   %rbp
.LCFI3:
    movq    %rsp, %rbp
.LCFI4:
    movl    %edi, -4(%rbp)
    movl    %esi, -8(%rbp)
    movl    -8(%rbp), %eax
    addl    -4(%rbp), %eax
    leave
    ret
.LFE6:
    .size   add, .-add
    .section    .eh_frame,"a",@progbits
.Lframe1:
    .long   .LECIE1-.LSCIE1
.LSCIE1:
    .long   0x0
    .byte   0x1
    .string "zR"
    .uleb128 0x1
    .sleb128 -8
    .byte   0x10
    .uleb128 0x1
    .byte   0x3
    .byte   0xc
    .uleb128 0x7
    .uleb128 0x8
    .byte   0x90
    .uleb128 0x1
    .align 8
.LECIE1:
.LSFDE1:
    .long   .LEFDE1-.LASFDE1
.LASFDE1:
    .long   .LASFDE1-.Lframe1
    .long   .LFB5
    .long   .LFE5-.LFB5
    .uleb128 0x0
    .byte   0x4
    .long   .LCFI0-.LFB5
    .byte   0xe
    .uleb128 0x10
    .byte   0x86
    .uleb128 0x2
    .byte   0x4
    .long   .LCFI1-.LCFI0
    .byte   0xd
    .uleb128 0x6
    .align 8
.LEFDE1:
.LSFDE3:
    .long   .LEFDE3-.LASFDE3
.LASFDE3:
    .long   .LASFDE3-.Lframe1
    .long   .LFB6
    .long   .LFE6-.LFB6
    .uleb128 0x0
    .byte   0x4
    .long   .LCFI3-.LFB6
    .byte   0xe
    .uleb128 0x10
    .byte   0x86
    .uleb128 0x2
    .byte   0x4
    .long   .LCFI4-.LCFI3
    .byte   0xd
    .uleb128 0x6
    .align 8
.LEFDE3:
    .ident  "GCC: (GNU) 4.1.2 20080704 (Red Hat 4.1.2-48)"
    .section    .note.GNU-stack,"",@progbits

I understand all of these instructions, but I don't understand what the labels mean: .LC0, .LFB5, .LCFI0, .LCFI1, .LCFI2, .LFE5, and so on. They are generated automatically by gcc. Why does it need them? Some of them seem redundant.

  • gcc version: 4.1.2
  • machine: x86_64

Answer 1:


The compiler will generate a label for any place it needs to refer to an address, whether it be for a jump or branch instruction, or for a data location.

The compiler has no need to create intuitively named labels, since they are referenced only by code it generates and have no end-user visibility. Instead it generates more-or-less sequentially named labels, with a scheme that prevents accidentally giving the same label to two different locations.

There is no disadvantage to labelling the same location with two (or more) labels, so the compiler makes no attempt to avoid it. That is why a few locations carry two consecutive labels with no intervening instructions.
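The naming scheme described above amounts to a prefix plus a counter. The helper below is a hypothetical illustration, not GCC's actual code: the name make_internal_label and its signature are assumptions made for this example. It only mimics the shape of the labels in the listing, where a leading ".L" marks an assembler-local symbol on ELF targets.

```c
#include <stdio.h>

/* Hypothetical helper (not a GCC function): build a compiler-style
   local label such as ".LFB5" from a prefix and a sequence number.
   Returns the number of characters written, or -1 if buf is too small. */
int make_internal_label(char *buf, size_t size, const char *prefix, unsigned num)
{
    int n = snprintf(buf, size, ".%s%u", prefix, num);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Calling make_internal_label(buf, sizeof buf, "LFB", 5) fills buf with ".LFB5". Because each prefix has its own counter, two different locations never receive the same name, while one location can still receive several labels with different prefixes.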

If you really want to know what, for example, the LCx and LFBx series of labels mean, read the compiler source code. It is a non-trivial code base, so expect to spend hours just looking for the relevant module.


I rose to the challenge. Having some compiler-writing experience, I found the module /trunk/gcc/dwarf2out.c, which seems to generate label names using the same strategy. Look around line 250 for terse clues about what the symbols mean. Much of this module determines the labels, but it is nearly 23,000 lines long, so it could well test your curiosity.
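As a rough guide to the prefixes in the listing, the table below pairs each one with the meaning suggested by the macro names in dwarf2out.c (for example, FUNC_BEGIN_LABEL is defined as "LFB") and by how the labels are used in the .eh_frame section of the question. This mapping is my own reading, so treat it as an assumption rather than documentation; describe_label is a hypothetical helper written for this example.

```c
#include <stddef.h>
#include <string.h>

struct label_kind {
    const char *prefix;
    const char *meaning;
};

/* Longer prefixes come first so ".LCFI0" is not mistaken for ".LC". */
static const struct label_kind label_kinds[] = {
    { ".LASFDE", "FDE, just after its length field" },
    { ".LSFDE",  "start of a frame description entry (FDE)" },
    { ".LEFDE",  "end of an FDE" },
    { ".LSCIE",  "CIE, just after its length field" },
    { ".LECIE",  "end of the CIE" },
    { ".Lframe", "start of the frame (unwind) information" },
    { ".LCFI",   "point where the call frame changes" },
    { ".LFB",    "function begin" },
    { ".LFE",    "function end" },
    { ".LC",     "constant, such as a string literal" },
};

/* Hypothetical helper: return a description for a label, or NULL
   if the label does not start with one of the known prefixes. */
const char *describe_label(const char *label)
{
    size_t i;
    for (i = 0; i < sizeof label_kinds / sizeof label_kinds[0]; i++) {
        size_t len = strlen(label_kinds[i].prefix);
        if (strncmp(label, label_kinds[i].prefix, len) == 0)
            return label_kinds[i].meaning;
    }
    return NULL;
}
```

Read this way, the line ".long .LFE5-.LFB5" in the listing records the length of main, and ".long .LCFI0-.LFB5" records how far into main the first frame change (the pushq) takes effect. That is why those apparently unused labels exist.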




Answer 2:


Try gcc -fverbose-asm -fdump-tree-all -S -Os -o test.s test.c to get much more information, notably many "dump" files test.c.* containing a partial view of GCC's internal representations.

Don't be bothered by apparently useless labels. I guess that GCC could generate one for each basic block.
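To see labels that mark branch targets rather than unwind information, one can compile a function containing a loop and a condition. The function below is a made-up example (count_even is not from the question). Compiling it with gcc -S will typically show numbered labels such as .L2 or .L3 at the jump targets, though the exact names and their number depend on the GCC version and optimization level.

```c
/* Made-up example: the loop and the if give the compiler branch
   targets, each of which usually gets its own local label. */
int count_even(const int *values, int n)
{
    int count = 0;
    int i;
    for (i = 0; i < n; i++) {
        if (values[i] % 2 == 0)
            count++;
    }
    return count;
}
```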

Recall that GCC does most of its work on internal representations, notably Gimple and Tree. The optimization passes (there are hundreds of them) modify these representations significantly. Most optimizations happen in the middle-end, working on Gimple and related forms.

My slides on http://gcc-melt.org/ have a bit more detailed explanations (and you can find many others on the web).

Consider using MELT (a domain specific language to extend GCC 4.6 or later) to explore (or even modify) the internal GCC representations. MELT is very well suited for that goal.


NB:

Your gcc-4.1 is several years old. GCC 4.7 has just been released (actually the second release candidate of 4.7.0), and GCC has made a lot of progress since 4.1, which appeared in 2006. You really should use a newer version (4.6 at least) if you care about optimizations. You can ask questions about GCC internals on gcc@gcc.gnu.org (the list for those developing or hacking the compiler), but most GCC contributors have forgotten the details of 4.1. Use gcc-help@gcc.gnu.org for general help about GCC (i.e. how to build or use it).



Source: https://stackoverflow.com/questions/9799676/what-do-the-gcc-assembly-output-labels-signify
