JMP unexpected behavior in Shellcode when next(skipped) instruction is a variable definition

问题

Purpose: I was trying to take advantage of the RIP mode in x86-64. Even though the assembly performs as expected on its own, the shellcode does not.

The Problem: Concisely what I tried was this,

jmp l1
str1: db "some string"
l1:
   other code
   lea rax, [rel str1]

I used the above at various places, it failed only at certain places and succeeded in other places. I tried to play around and could not find any pattern when it fails. When variable(str1: db instruction) position is after the instruction accessing it, it never failed(in my observations). However, I want to remove nulls, hence I placed the variable definition before accessing it.

Debug finds
On debugging , I found the failed jmp point to some incorrect instruction address. Eg:(in gdb)

(code + 18) jmp [code +27] //jmp pointing incorrectly to in-between 2
(code + 22) ... (this part has label)
(code + 24) some instruction // this is where I intended the jmp
(code + 28) some other instruction

Code This is a sample code, I was trying to spawn a Execve Shell. It is quite large so I have identified the position of the culprit JMP.

global _start
section .text
_start: 
    xor rax,rax
    mov rsi,rax
    mov rdi,rsi
    mov rdx,rdi
    mov r8,rdx
    mov rcx,r8
    mov rbx,rcx
    jmp gg //failing (jumping somewhere unintended)
    p2: db "/bin/sh"        
gg:
    xor rax,rax
    lea rdi, [rel p2]
    mov [rdi+7], byte al //null terminating using 0x00 from rax
    mov [rdi+8], rdi
    mov [rdi+16],rax


    lea rsi,[rdi+8]
    lea rdx,[rdi+16]
    mov al,59
    syscall

EDIT:1 Have modified the code to contain the failing instructions

EDIT:2 Shellcode in C that I used.

#include<stdio.h>
#include<string.h>

unsigned char code[] = \
"\x48\x31\xc0\x48\x89\xc6\x48\x89\xf7\x48\x89\xfa\x49\x89\xd0\x4c\x89\xc1\x48\x89\xcb\xeb\x07\x2f\x62\x69\x6e\x2f\x73\x68\x48\x31\x48\x31\xc0\x48\x8d\x3d\xef\xff\xff\xff\x88\x47\x07\x48\x89\x7f\x08\x48\x89\x47\x10\x48\x8d\x77\x08\x48\x8d\x57\x10\xb0\x3b\x0f\x05";
main()
{

    printf("Shellcode Length:  %d\n", (int)strlen(code));

    int (*ret)() = (int(*)())code;

    ret();

}

EDIT 3 I would get Hexdump by placing the following code would be placed inside a Bash file and running it by passing filename as argument. Took it from ShellStorm.

`for i in $(objdump -d $1 -M intel |grep "^ " |cut -f2); do echo -n '\x'$i`;

回答1:

TL;DR : The method you are using to convert your standalone shell code program shellExec to a shell code exploit string is buggy.

Based on the information given, I suspect the problem is the way in which you are using disassembly output to generate the final byte stream that gets converted into your shell code string. Likely the disassembly output had confusing output and possibly duplicated values. While trying to disassemble data (mixed with the code) it tried to output the shortest encodeable instruction to finish consuming all the data and then discovered you had a JMP target and duplicated some of the bytes as it backed up to re-synchronize. Whatever process was used to convert the disassembly to binary didn't take this kind of issue into account.

Don't use disassembly output to generate the binary file. Generate your standalone executable with the shell code (I believe shellExec is the file in your case) and use tools like OBJCOPY and HEXDUMP to generate the C shell code string:

objcopy -j.text -O binary execShell execShell.bin
hexdump -v -e '"\\""x" 1/1 "%02x" ""' execShell.bin

The objcopy command takes the execShell executable and extracts just the .text section (using the -j.text option) and outputs as binary data to the file execShell.bin. The hexdump command just reformats the binary file and outputs it in a form that can be used in a C string. This process doesn't involve parsing any confusing disassembly output so doesn't suffer the problem you encountered. The output of hexdump should look like:

\x48\x31\xc0\x48\x89\xc6\x48\x89\xf7\x48\x89\xfa\x49\x89\xd0\x4c\x89\xc1\x48\x89\xcb\xeb\x07\x2f\x62\x69\x6e\x2f\x73\x68\x48\x31\xc0\x48\x8d\x3d\xef\xff\xff\xff\x88\x47\x07\x48\x89\x7f\x08\x48\x89\x47\x10\x48\x8d\x77\x08\x48\x8d\x57\x10\xb0\x3b\x0f\x05

This differs slightly from yours which was:

\x48\x31\xc0\x48\x89\xc6\x48\x89\xf7\x48\x89\xfa\x49\x89\xd0\x4c\x89\xc1\x48\x89\xcb\xeb\x07\x2f\x62\x69\x6e\x2f\x73\x68\x48\x31\x48\x31\xc0\x48\x8d\x3d\xef\xff\xff\xff\x88\x47\x07\x48\x89\x7f\x08\x48\x89\x47\x10\x48\x8d\x77\x08\x48\x8d\x57\x10\xb0\x3b\x0f\x05

I've highlighted the difference. After the string of bytes /bin/sh your output introduced an extra \x48\x31 . The extra 2 bytes in your shell code string are responsible for the code not running as expected in the target executable.

回答2:

I could compile your code with nasm after changing the // with ;, I did not try to execute it, though.

nasm -f elf64 a.s -o a

Then I could look at the code with:

objdump -d a

And the part of the code you are asking about looks like (i.e. the jmp)

  12:   48 89 cb                mov    %rcx,%rbx
  15:   eb 07                   jmp    1e <gg>

0000000000000017 <p2>:
  17:   2f                      (bad)  
  18:   62                      (bad)  
  19:   69                      .byte 0x69
  1a:   6e                      outsb  %ds:(%rsi),(%dx)
  1b:   2f                      (bad)  
  1c:   73 68                   jae    86 <gg+0x68>

000000000000001e <gg>:
  1e:   48 31 c0                xor    %rax,%rax

However, the following has a few problems:

    mov [rdi+7], byte al ;null terminating using 0x00 from rax
    mov [rdi+8], rdi
    mov [rdi+16],rax

You are attempting to do a WRITE to READ-ONLY memory. Code can't be modified
The mov rdi/rax may not be aligned
If that code were to succeed, it would overwrite your code at gg:.

Also, it would be a good idea to put a by alignment just before gg: Something like this:

p2: "..."
align   16
gg:
   xor rax,rax

And so since the code is read only, you have to put the zeroes in there by hand.

p2: db "..."
    db 0
    db 0, 0, 0, 0, 0, 0, 0, 0
    db 0, 0, 0, 0, 0, 0, 0, 0

If you know it is aligned, dq will work too.

Note, however, that you do not align p2 either so you cannot be sure (i.e. if your code changes the alignment is very likely to change too.) You would probably want to do:

    align 16
p2: db "..."
    db 0
    dq 0
    dq 0

A final note, the [rel p2] was quite limited. The relative offset in those instructions were limited to -127 and +128 as far as I know. On my 64 bit processor, though, it uses a 32 bits offset. You may have been running in such a problem, depending on your assembler, where it decided that it was too far. In your case, one solution is to put the lea instruction before the jmp over the data. And I would imagine that the compiler should generate an error if the offset overflows. Another possibility would be that somehow something gets optimized and the jmp doesn't get updated properly.

As a side note, the following is not well optimized:

    xor rax,rax
    mov rsi,rax
    mov rdi,rsi
    mov rdx,rdi
    mov r8,rdx

Using the xor for all the registers or reusing rax each time instead of switching would work better (more likely to work in parallel). Right now you tie all the instructions to the previous one (i.e. before you can copy rsi in rdi, you need to copy rax to rsi. Having mov rdi,rax would remove that dependency.) I think that you would not see any difference in this case, but that's something to keep in mind for good optimization.

With the alignments and adding the zeroes in the code, I get:

0000000000000000 <_start>:
   0:   48 31 c0                xor    %rax,%rax
   3:   48 89 c6                mov    %rax,%rsi
   6:   48 89 f7                mov    %rsi,%rdi
   9:   48 89 fa                mov    %rdi,%rdx
   c:   49 89 d0                mov    %rdx,%r8
   f:   4c 89 c1                mov    %r8,%rcx
  12:   48 89 cb                mov    %rcx,%rbx
  15:   eb 29                   jmp    40 <gg>
  17:   90                      nop
  18:   90                      nop
  19:   90                      nop
  1a:   90                      nop
  1b:   90                      nop
  1c:   90                      nop
  1d:   90                      nop
  1e:   90                      nop
  1f:   90                      nop

0000000000000020 <p2>:
  20:   2f                      (bad)  
  21:   62                      (bad)  
  22:   69 6e 2f 73 68 00 00    imul   $0x6873,0x2f(%rsi),%ebp
        ...
  35:   00 00                   add    %al,(%rax)
  37:   00 90 90 90 90 90       add    %dl,-0x6f6f6f70(%rax)
  3d:   90                      nop
  3e:   90                      nop
  3f:   90                      nop

0000000000000040 <gg>:
  40:   48 31 c0                xor    %rax,%rax
  43:   48 8d 3d d6 ff ff ff    lea    -0x2a(%rip),%rdi        # 20 <p2>
  4a:   48 8d 77 08             lea    0x8(%rdi),%rsi
  4e:   48 8d 57 10             lea    0x10(%rdi),%rdx
  52:   b0 3b                   mov    $0x3b,%al
  54:   0f 05                   syscall

Note that when disassembling, if your data represents an instruction it may end up using bytes from your valid code and thus not show you what you would otherwise expect. i.e. it can end up not finding the destination label. Here we're good, though.

来源：https://stackoverflow.com/questions/47949930/jmp-unexpected-behavior-in-shellcode-when-nextskipped-instruction-is-a-variabl

标签

Linux

gcc

assembly

x86-64

shellcode