What would happen if a system executes a part of the file that is zero-padded?

情深已故 2020-12-20 07:31

I've seen in some posts/videos that files are zero-padded to look bigger than they are, or to match the "same file size" criteria that some file system utilities have for moving

1 answer
  • 2020-12-20 08:08

    How zero bytes decode depends entirely on the CPU architecture. On many architectures, instructions are fixed-length (for example 32 bits), so the relevant pattern is 00 00 00 00 (using hexdump notation).

    On most Linux distros, clang/llvm comes with support for multiple target architectures built-in (clang -target and llvm-objdump), unlike gcc / gas / binutils, so I was able to use that to check for some architectures I didn't have cross-gcc / binutils installed for. Use llvm-objdump --version to see the supported list. (But I didn't figure out how to get it to disassemble a raw binary like binutils objdump -b binary, and my clang won't create SPARC binaries on its own.)


    On x86, 00 00 (2 bytes) decodes (http://ref.x86asm.net/coder32.html) as an 8-bit add with a memory destination. The first byte is the opcode, the 2nd byte is the ModR/M that specifies the operands.
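
To make the opcode/ModR/M split concrete, here is a minimal Python sketch that breaks a ModR/M byte into its mod, reg and r/m fields and renders opcode 00 (ADD r/m8, r8). The helper names (`decode_modrm`, `describe_add_rm8_r8`) are made up for this illustration, and the sketch only handles mod=00 without SIB, displacement or prefix bytes; it is not a real disassembler.

```python
# Illustrative sketch only: decode the ModR/M byte that follows x86 opcode 0x00
# (ADD r/m8, r8). Handles mod=00 with a plain base register and nothing else.

REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]  # 8-bit registers, no REX prefix

# r/m meanings for mod=00, per address size. None marks encodings this sketch
# does not handle (SIB byte, displacement-only, RIP-relative).
RM_MOD00 = {
    16: ["bx+si", "bx+di", "bp+si", "bp+di", "si", "di", None, "bx"],
    32: ["eax", "ecx", "edx", "ebx", None, None, "esi", "edi"],
    64: ["rax", "rcx", "rdx", "rbx", None, None, "rsi", "rdi"],
}


def decode_modrm(byte):
    """Split a ModR/M byte into (mod, reg, rm)."""
    mod = (byte >> 6) & 0b11
    reg = (byte >> 3) & 0b111
    rm = byte & 0b111
    return mod, reg, rm


def describe_add_rm8_r8(modrm, addr_size=32):
    """Render 'add [mem],reg8' for opcode 00 with the given ModR/M byte."""
    mod, reg, rm = decode_modrm(modrm)
    base = RM_MOD00[addr_size][rm]
    if mod != 0 or base is None:
        raise NotImplementedError("only simple mod=00 forms are handled here")
    return f"add [{base}],{REG8[reg]}"


print(describe_add_rm8_r8(0x00, 64))  # add [rax],al
print(describe_add_rm8_r8(0x00, 32))  # add [eax],al
print(describe_add_rm8_r8(0x00, 16))  # add [bx+si],al
```

All three fields are zero for a 00 ModR/M byte, which is why the same two bytes name a different base register depending on the address size of the current mode.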

    This usually segfaults right away (if eax/rax isn't a valid pointer), or segfaults once execution falls off the end of the zero-padded part into an unmapped page. (This happens in real life because of bugs like falling off the end of _start without making an exit system call, although in those cases the following bytes aren't always all zero: they might be data or ELF metadata, for example.)


    x86 64-bit mode: ndisasm -b64 /dev/zero | head:

    address   machine code      disassembly
    00000000  0000              add [rax],al
    

    x86 32-bit mode (-b32):

    00000000  0000              add [eax],al
    

    x86 16-bit mode: (-b16):

    00000000  0000              add [bx+si],al
    

    AArch32 ARM mode: cd /tmp && dd if=/dev/zero of=zero bs=16 count=1 && arm-none-eabi-objdump -z -D -b binary -marm zero. (Without -z, objdump skips over large blocks of all-zero and shows ...)

    addr   machine code   disassembly
    0:   00000000        andeq   r0, r0, r0
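
The `andeq r0, r0, r0` result falls out of the ARM data-processing layout: the top four bits are the condition code and bits 24:21 select the operation. The Python sketch below pulls those fields out of a 32-bit word. `decode_arm_dp` is a hypothetical helper written for this illustration; it only covers the register form with an immediate shift and ignores the shift fields and the special encodings that share this space.

```python
# Illustrative sketch only: extract the main fields of an ARM (A32)
# data-processing instruction in its register, immediate-shift form.

COND = ["eq", "ne", "cs", "cc", "mi", "pl", "vs", "vc",
        "hi", "ls", "ge", "lt", "gt", "le", "al", "(uncond)"]
DP_OPS = ["and", "eor", "sub", "rsb", "add", "adc", "sbc", "rsc",
          "tst", "teq", "cmp", "cmn", "orr", "mov", "bic", "mvn"]


def decode_arm_dp(word):
    """Return cond/op/register fields; reject words outside this simple form."""
    if (word >> 25) & 0b111 != 0 or (word >> 4) & 1 != 0:
        raise ValueError("not a register, immediate-shift data-processing form")
    return {
        "cond": COND[(word >> 28) & 0xF],   # bits 31:28
        "op": DP_OPS[(word >> 21) & 0xF],   # bits 24:21
        "set_flags": bool((word >> 20) & 1),
        "rn": (word >> 16) & 0xF,           # first operand register
        "rd": (word >> 12) & 0xF,           # destination register
        "rm": word & 0xF,                   # second operand register
    }


f = decode_arm_dp(0x00000000)
print(f"{f['op']}{f['cond']} r{f['rd']}, r{f['rn']}, r{f['rm']}")  # andeq r0, r0, r0
```

Since condition 0000 is EQ, the instruction only executes when the Z flag is set, and even then it ANDs r0 with itself without updating flags.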
    

    ARM Thumb/Thumb2: arm-none-eabi-objdump -z -D -b binary -marm --disassembler-options=force-thumb zero

    0:   0000            movs    r0, r0
    2:   0000            movs    r0, r0
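
In the 16-bit Thumb encoding, a halfword whose top five bits are zero is a left shift by an immediate, and disassemblers print a shift count of zero as `movs`. The sketch below decodes that one format in Python. `decode_thumb_shift_imm` is a name invented for this example, and the `s`-suffixed mnemonics assume the instruction is outside an IT block.

```python
# Illustrative sketch only: decode the Thumb 16-bit "shift by immediate" format
# (bits 15:13 == 000, op in bits 12:11, imm5 in 10:6, Rm in 5:3, Rd in 2:0).

def decode_thumb_shift_imm(halfword):
    op_bits = (halfword >> 11) & 0b11
    if (halfword >> 13) != 0 or op_bits == 0b11:
        raise ValueError("not a shift-by-immediate encoding")
    op = ["lsls", "lsrs", "asrs"][op_bits]
    imm5 = (halfword >> 6) & 0x1F
    rm = (halfword >> 3) & 0x7
    rd = halfword & 0x7
    if op == "lsls" and imm5 == 0:
        # A left shift by zero is shown as a flag-setting register move.
        return f"movs r{rd}, r{rm}"
    if imm5 == 0:
        imm5 = 32  # for lsrs/asrs an encoded 0 means a shift by 32
    return f"{op} r{rd}, r{rm}, #{imm5}"


print(decode_thumb_shift_imm(0x0000))  # movs r0, r0
```

Each zero halfword is a complete instruction, so a 32-bit zero word shows up as two of them, as in the listing.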
    

    AArch64: aarch64-linux-gnu-objdump -z -D -b binary -maarch64 zero

     0:   00000000        .inst   0x00000000 ; undefined
    

    MIPS32: echo .long 0 > zero.S && clang -c -target mips zero.S && llvm-objdump -d zero.o

    zero.o: file format ELF32-mips
    Disassembly of section .text:
       0:       00 00 00 00     nop
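
On MIPS the all-zero word is not a dedicated no-op opcode: it has opcode 0 (SPECIAL) and function 0 (SLL), that is, `sll $zero, $zero, 0`, which writes to the always-zero register and is conventionally shown as `nop`. The Python sketch below splits an R-type word into its fields to show this. The helper names are hypothetical, and only the SLL case is rendered.

```python
# Illustrative sketch only: split a MIPS32 R-type word into fields and render
# the SLL case, of which the all-zero word is the canonical "nop".

def decode_mips_rtype(word):
    return {
        "opcode": (word >> 26) & 0x3F,  # 0 means SPECIAL (R-type)
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,           # 0 means SLL when opcode is SPECIAL
    }


def describe_sll(word):
    f = decode_mips_rtype(word)
    if f["opcode"] != 0 or f["funct"] != 0:
        raise ValueError("not an SLL encoding")
    if word == 0:
        return "nop"  # sll $0, $0, 0
    return f"sll ${f['rd']}, ${f['rt']}, {f['shamt']}"


print(describe_sll(0x00000000))  # nop
```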
    

    PowerPC 32 and 64-bit: -target powerpc and -target powerpc64. I don't know whether any extensions to PowerPC use the 00 00 00 00 instruction encoding for anything, or whether it's still an illegal instruction on modern IBM POWER chips.

    zero.o: file format ELF32-ppc   (or ELF64-ppc64)
    Disassembly of section .text:
       0:       00 00 00 00  <unknown>
    

    IBM S390: clang -c -target systemz zero.S

    zero.o: file format ELF64-s390
    Disassembly of section .text:
       0:       00 00  <unknown>
       2:       00 00  <unknown>
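
The disassembler shows 2-byte units here because S/390 and z/Architecture instructions are variable-length, with the length given by the top two bits of the first byte (00 means 2 bytes, 01 or 10 means 4 bytes, 11 means 6 bytes). A small Python sketch of that rule follows. The function names are invented for this illustration, and it only splits the byte stream without decoding anything.

```python
# Illustrative sketch only: split a byte string into S/390-style instruction
# units using the length encoded in the top two bits of each first byte.

def s390_insn_length(first_byte):
    """Instruction length in bytes, from bits 0-1 of the opcode byte."""
    return {0b00: 2, 0b01: 4, 0b10: 4, 0b11: 6}[first_byte >> 6]


def split_insns(data):
    """Return (offset, raw bytes) pairs, stopping at a truncated tail."""
    out = []
    offset = 0
    while offset < len(data):
        length = s390_insn_length(data[offset])
        if offset + length > len(data):
            break  # incomplete instruction at the end; leave it out
        out.append((offset, data[offset:offset + length]))
        offset += length
    return out


for offset, raw in split_insns(bytes(4)):
    print(f"{offset:x}: {raw.hex(' ')}")
```

For four zero bytes this yields two 2-byte units at offsets 0 and 2, matching the listing.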
    