How to disassemble a binary executable in Linux to get the assembly code?

情歌与酒 2020-11-28 02:57

I was told to use a disassembler. Does gcc have anything built in? What is the easiest way to do this?

9 answers
  •  余生分开走
    2020-11-28 03:30

    This answer is specific to x86. Portable tools that can disassemble AArch64, MIPS, or whatever machine code include objdump and llvm-objdump.


    Agner Fog's disassembler, objconv, is quite nice. It adds comments to the disassembly output that flag performance problems, such as the dreaded LCP (length-changing prefix) stall from instructions with 16-bit immediate constants.

    objconv  -fyasm a.out /dev/stdout | less
    

    (It doesn't recognize - as shorthand for stdout, and defaults to outputting to a file of similar name to the input file, with .asm tacked on.)

    It also adds branch targets to the code. Other disassemblers usually disassemble jump instructions with just a numeric destination, and don't put any marker at a branch target to help you find the top of loops and so on.

    It also indicates NOPs more clearly than other disassemblers (making it clear when there's padding, rather than disassembling it as just another instruction.)

    It's open source, and easy to compile for Linux. It can disassemble into NASM, YASM, MASM, or GNU (AT&T) syntax.
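
    The four output syntaxes correspond to objconv's -f options. A minimal sketch, assuming the flag spellings -fnasm, -fyasm, -fmasm, and -fgasm from the objconv manual; the helper names objconv_flag and disasm_to are hypothetical and exist only for illustration:

```shell
#!/bin/sh
# Map a syntax name to the matching objconv output-format flag.
# (Flag spellings are assumed from the objconv manual.)
objconv_flag() {
    case "$1" in
        nasm) echo "-fnasm" ;;
        yasm) echo "-fyasm" ;;
        masm) echo "-fmasm" ;;
        gas)  echo "-fgasm" ;;   # GNU assembler (AT&T) syntax
        *)    echo "unknown syntax: $1" >&2; return 1 ;;
    esac
}

# disasm_to SYNTAX INPUT OUTPUT
# Disassemble INPUT into OUTPUT using the chosen syntax.
# objconv takes its arguments as: options, input file, output file.
disasm_to() {
    flag=$(objconv_flag "$1") || return 1
    objconv "$flag" "$2" "$3"
}
```

    For example, disasm_to gas a.out a.s should write GNU-syntax output to a.s, provided objconv is on your PATH.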

    Sample output:

    ; Filling space: 0FH
    ; Filler type: Multi-byte NOP
    ;       db 0FH, 1FH, 44H, 00H, 00H, 66H, 2EH, 0FH
    ;       db 1FH, 84H, 00H, 00H, 00H, 00H, 00H
    
    ALIGN   16
    
    foo:    ; Function begin
            cmp     rdi, 1                                  ; 00400620 _ 48: 83. FF, 01
            jbe     ?_026                                   ; 00400624 _ 0F 86, 00000084
            mov     r11d, 1                                 ; 0040062A _ 41: BB, 00000001
    ?_020:  mov     r8, r11                                 ; 00400630 _ 4D: 89. D8
            imul    r8, r11                                 ; 00400633 _ 4D: 0F AF. C3
            add     r8, rdi                                 ; 00400637 _ 49: 01. F8
            cmp     r8, 3                                   ; 0040063A _ 49: 83. F8, 03
            jbe     ?_029                                   ; 0040063E _ 0F 86, 00000097
            mov     esi, 1                                  ; 00400644 _ BE, 00000001
    ; Filling space: 7H
    ; Filler type: Multi-byte NOP
    ;       db 0FH, 1FH, 80H, 00H, 00H, 00H, 00H
    
    ALIGN   8
    ?_021:  add     rsi, rsi                                ; 00400650 _ 48: 01. F6
            mov     rax, rsi                                ; 00400653 _ 48: 89. F0
            imul    rax, rsi                                ; 00400656 _ 48: 0F AF. C6
            shl     rax, 2                                  ; 0040065A _ 48: C1. E0, 02
            cmp     r8, rax                                 ; 0040065E _ 49: 39. C0
            jnc     ?_021                                   ; 00400661 _ 73, ED
            lea     rcx, [rsi+rsi]                          ; 00400663 _ 48: 8D. 0C 36
    ...
    

    Note that this output is ready to be assembled back into an object file, so you can tweak the code at the asm source level, rather than with a hex-editor on the machine code. (So you aren't limited to keeping things the same size.) With no changes, the result should be near-identical. It might not be, though, since disassembly of stuff like

      (from /lib/x86_64-linux-gnu/libc.so.6)
    
    SECTION .plt    align=16 execute                        ; section number 11, code
    
    ?_00001:; Local function
            push    qword [rel ?_37996]                     ; 0001F420 _ FF. 35, 003A4BE2(rel)
            jmp     near [rel ?_37997]                      ; 0001F426 _ FF. 25, 003A4BE4(rel)
    
    ...    
    ALIGN   8
    ?_00002:jmp     near [rel ?_37998]                      ; 0001F430 _ FF. 25, 003A4BE2(rel)
    
    ; Note: Immediate operand could be made smaller by sign extension
            push    11                                      ; 0001F436 _ 68, 0000000B
    ; Note: Immediate operand could be made smaller by sign extension
            jmp     ?_00001                                 ; 0001F43B _ E9, FFFFFFE0
    

    doesn't have anything in the source to make sure it assembles to the longer encoding that leaves room for relocations to rewrite it with a 32-bit offset.


    If you don't want to install objconv, GNU binutils objdump -Mintel -d is very usable, and will already be installed if you have a normal Linux gcc setup.
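
    A rough sketch of the objdump route. Only objdump -Mintel -d comes from the text above; the helper names disasm and only_func are hypothetical, and the awk filter assumes GNU objdump's layout, where each function starts with a line ending in <name>: and is followed by a blank line:

```shell
#!/bin/sh
# disasm FILE
# Intel-syntax disassembly of the executable sections of FILE.
disasm() {
    objdump -Mintel -d "$1"
}

# only_func NAME
# Filter `objdump -d` output on stdin down to a single function.
# NAME is used as a regular expression, so unusual symbol names
# (for example ones containing dots) may match more loosely than intended.
only_func() {
    awk -v name="$1" '
        $0 ~ ("<" name ">:$") { printing = 1 }  # header, e.g. "0000000000401136 <main>:"
        printing && /^$/      { exit }          # a blank line ends the function
        printing              { print }
    '
}
```

    Typical use would be something like disasm ./a.out | only_func main | less.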
