How can we invoke a system call using sysenter/syscall directly on x86 Linux? Can anybody help? It would be even better if you could also show the code for the amd64 platform.
I know that on x86 we can use
__asm__(
" movl $1, %eax \n"
" movl $0, %ebx \n"
" call *%gs:0x10 \n"
);
to route to sysenter indirectly.
But how can we write code that uses sysenter/syscall directly to issue a system call?
I found some material at http://damocles.blogbus.com/tag/sysenter/ but still find it difficult to figure out.
I'm going to show you how to execute system calls by writing a program that writes Hello World! to standard output using the write() system call. Here's the source of the program without an implementation of the actual system call:
#include <sys/types.h>
ssize_t my_write(int fd, const void *buf, size_t size);
int main(void)
{
const char hello[] = "Hello world!\n";
my_write(1, hello, sizeof(hello) - 1); /* do not write the terminating NUL */
return 0;
}
You can see that I named my custom system call function my_write in order to avoid name clashes with the "normal" write provided by libc. The rest of this answer contains the source of my_write for i386 and amd64.
i386
System calls in i386 Linux are implemented using interrupt vector 0x80 (128), i.e. by executing int 0x80 in your assembly code, having set the parameters accordingly beforehand, of course. It is possible to do the same via SYSENTER, but that instruction is normally executed by the VDSO, which the kernel maps into each running process. Since SYSENTER was never meant as a direct replacement for the int 0x80 API, userland applications don't execute it directly. Instead, when an application needs to access some kernel code, it calls the routine mapped in the VDSO (that's what the call *%gs:0x10 in your code is for), which contains all the code supporting the SYSENTER instruction. There's quite a lot of it because of how the instruction actually works.
If you want to read more about this, have a look at this link. It contains a fairly brief overview of the techniques applied in the kernel and the VDSO.
#define __NR_write 4
ssize_t my_write(int fd, const void *buf, size_t size)
{
ssize_t ret;
asm volatile
(
"int $0x80"
: "=a" (ret)
: "0"(__NR_write), "b"(fd), "c"(buf), "d"(size)
: "cc", "edi", "esi", "memory"
);
return ret;
}
As you can see, using the int 0x80 API is relatively simple. The number of the syscall goes into the eax register, while the parameters needed for the syscall go into ebx, ecx, edx, esi, edi, and ebp, in that order. System call numbers can be obtained by reading the file /usr/include/asm/unistd_32.h. Prototypes and descriptions of the functions are available in the 2nd section of the manual, so in this case write(2). To be on the safe side, I put the remaining GPRs (esi and edi) on the clobber list, as well as cc, since the eflags register is also likely to change. Linux actually preserves every register except eax, which carries the return value, across int 0x80, so those two entries are conservative rather than required. Keep in mind that the clobber list also contains memory, which tells the compiler that the instruction accesses memory not named among the operands (here, the data behind the buf parameter).
amd64
Things look very different on the AMD64 architecture, which sports a new instruction called SYSCALL. It is very different from the original SYSENTER instruction, and definitely much easier to use from userland applications. It really resembles a normal CALL, actually, and adapting the old int 0x80 code to the new SYSCALL is pretty much trivial.
In this case, the number of the system call is still passed in the register rax, but the registers used to hold the arguments have changed completely. They are now used in the following order: rdi, rsi, rdx, r10, r8 and r9. The kernel is allowed to destroy the contents of registers rcx and r11 (SYSCALL uses them to save the return address and rflags, respectively).
#define __NR_write 1
ssize_t my_write(int fd, const void *buf, size_t size)
{
ssize_t ret;
asm volatile
(
"syscall"
: "=a" (ret)
: "0"(__NR_write), "D"(fd), "S"(buf), "d"(size)
: "cc", "rcx", "r11", "memory"
);
return ret;
}
Do notice how practically the only things that needed changing were the register names and the actual instruction used for making the call. This is mostly thanks to the input/output lists provided by gcc's extended inline assembly syntax, which automagically provides the move instructions needed before executing the instruction list.
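Both versions of my_write return the raw value from the kernel, which on failure is a negative error number in the range -4095 to -1 rather than the -1 plus errno that libc's write reports. A minimal sketch of that translation for amd64 follows. The name my_write_checked is made up for this example, and the syscall number comes from the SYS_write constant in <sys/syscall.h> instead of a hand-written #define.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>  /* SYS_write: syscall numbers for the build target */
#include <sys/types.h>

/* Hypothetical helper: raw write(2) on x86-64 with libc-style errors. */
ssize_t my_write_checked(int fd, const void *buf, size_t size)
{
    long ret;

    __asm__ volatile ("syscall"
        : "=a" (ret)
        : "0" ((long)SYS_write), "D" (fd), "S" (buf), "d" (size)
        : "cc", "rcx", "r11", "memory");

    /* The kernel signals failure with a value in [-4095, -1]. */
    if (ret < 0 && ret >= -4095) {
        errno = (int)-ret;
        return -1;
    }
    return ret;
}
```

Writing to an invalid descriptor, for example my_write_checked(-1, "x", 1), then returns -1 with errno set to EBADF, just like the libc function.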
Explicit register variables
Just for completeness, I want to provide an example using GCC explicit register variables.
This mechanism has the following advantages:
- it can represent all registers, including r8, r9 and r10, which are used for system call arguments: How to specify register constraints on the Intel x86_64 register r8 to r15 in GCC inline assembly?
- I'll argue that this syntax is more readable than using the single letter mnemonics such as S -> rsi
Register variables are used for example in glibc 2.29, see: sysdeps/unix/sysv/linux/x86_64/sysdep.h.
Also note that other archs such as ARM have dropped the single letter mnemonics completely, and register variables seem to be the only way to do it there, see for example: How to specify an individual register as constraint in ARM GCC inline assembly?
main_reg.c
#define _XOPEN_SOURCE 700
#include <inttypes.h>
#include <sys/types.h>
ssize_t my_write(int fd, const void *buf, size_t size) {
register int64_t rax __asm__ ("rax") = 1;
register int rdi __asm__ ("rdi") = fd;
register const void *rsi __asm__ ("rsi") = buf;
register size_t rdx __asm__ ("rdx") = size;
__asm__ __volatile__ (
"syscall"
: "+r" (rax)
: "r" (rdi), "r" (rsi), "r" (rdx)
: "cc", "rcx", "r11", "memory"
);
return rax;
}
void my_exit(int exit_status) {
register int64_t rax __asm__ ("rax") = 60;
register int rdi __asm__ ("rdi") = exit_status;
__asm__ __volatile__ (
"syscall"
: "+r" (rax)
: "r" (rdi)
: "cc", "rcx", "r11", "memory"
);
}
void _start(void) {
char msg[] = "hello world\n";
my_exit(my_write(1, msg, sizeof(msg)) != sizeof(msg));
}
Compile and run:
gcc -O3 -std=c99 -ggdb3 -ffreestanding -nostdlib -Wall -Werror \
-pedantic -o main_reg.out main_reg.c
./main_reg.out
echo $?
Output
hello world
0
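Register variables are also how you reach the argument registers that have no constraint letter. The following is a minimal sketch of a six-argument call on x86-64 Linux, exercised with mmap and munmap. The names my_syscall6, my_map_page and my_unmap_page are invented for this example, and the fallback value for MAP_ANONYMOUS is the x86 Linux one, in case strict standard modes hide the macro.

```c
#include <stddef.h>
#include <sys/mman.h>     /* PROT_* and MAP_* flag values */
#include <sys/syscall.h>  /* SYS_mmap, SYS_munmap */

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS 0x20  /* x86 Linux value, used if the header hides it */
#endif

/* Hypothetical helper: raw six-argument system call on x86-64 Linux. */
static long my_syscall6(long nr, long a1, long a2, long a3,
                        long a4, long a5, long a6)
{
    register long rax __asm__ ("rax") = nr;
    register long rdi __asm__ ("rdi") = a1;
    register long rsi __asm__ ("rsi") = a2;
    register long rdx __asm__ ("rdx") = a3;
    register long r10 __asm__ ("r10") = a4;  /* 4th argument: r10, not rcx */
    register long r8  __asm__ ("r8")  = a5;
    register long r9  __asm__ ("r9")  = a6;

    __asm__ __volatile__ (
        "syscall"
        : "+r" (rax)
        : "r" (rdi), "r" (rsi), "r" (rdx), "r" (r10), "r" (r8), "r" (r9)
        : "cc", "rcx", "r11", "memory"
    );
    return rax;
}

/* Map one anonymous page; returns its address or a negative error number. */
long my_map_page(void)
{
    return my_syscall6(SYS_mmap, 0, 4096, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* Unmap a page obtained from my_map_page; returns 0 on success. */
long my_unmap_page(long addr)
{
    return my_syscall6(SYS_munmap, addr, 4096, 0, 0, 0, 0);
}
```

The fourth argument is the interesting one: a normal function call would pass it in rcx, but SYSCALL overwrites rcx, so the kernel expects it in r10 instead.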
For comparison, the following program, analogous to the one in How to invoke a system call via sysenter in inline assembly?, produces equivalent assembly:
main_constraint.c
#define _XOPEN_SOURCE 700
#include <inttypes.h>
#include <sys/types.h>
ssize_t my_write(int fd, const void *buf, size_t size) {
ssize_t ret;
__asm__ __volatile__ (
"syscall"
: "=a" (ret)
: "0" (1), "D" (fd), "S" (buf), "d" (size)
: "cc", "rcx", "r11", "memory"
);
return ret;
}
void my_exit(int exit_status) {
ssize_t ret;
__asm__ __volatile__ (
"syscall"
: "=a" (ret)
: "0" (60), "D" (exit_status)
: "cc", "rcx", "r11", "memory"
);
}
void _start(void) {
char msg[] = "hello world\n";
my_exit(my_write(1, msg, sizeof(msg)) != sizeof(msg));
}
Disassembly of both with:
objdump -d main_reg.out
is almost identical. Here is the main_reg.c one:
Disassembly of section .text:
0000000000001000 <my_write>:
1000: b8 01 00 00 00 mov $0x1,%eax
1005: 0f 05 syscall
1007: c3 retq
1008: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
100f: 00
0000000000001010 <my_exit>:
1010: b8 3c 00 00 00 mov $0x3c,%eax
1015: 0f 05 syscall
1017: c3 retq
1018: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
101f: 00
0000000000001020 <_start>:
1020: c6 44 24 ff 00 movb $0x0,-0x1(%rsp)
1025: bf 01 00 00 00 mov $0x1,%edi
102a: 48 8d 74 24 f3 lea -0xd(%rsp),%rsi
102f: 48 b8 68 65 6c 6c 6f movabs $0x6f77206f6c6c6568,%rax
1036: 20 77 6f
1039: 48 89 44 24 f3 mov %rax,-0xd(%rsp)
103e: ba 0d 00 00 00 mov $0xd,%edx
1043: b8 01 00 00 00 mov $0x1,%eax
1048: c7 44 24 fb 72 6c 64 movl $0xa646c72,-0x5(%rsp)
104f: 0a
1050: 0f 05 syscall
1052: 31 ff xor %edi,%edi
1054: 48 83 f8 0d cmp $0xd,%rax
1058: b8 3c 00 00 00 mov $0x3c,%eax
105d: 40 0f 95 c7 setne %dil
1061: 0f 05 syscall
1063: c3 retq
So we see that GCC inlined those tiny syscall functions, as desired.
my_write and my_exit are the same for both, but _start in main_constraint.c is slightly different:
0000000000001020 <_start>:
1020: c6 44 24 ff 00 movb $0x0,-0x1(%rsp)
1025: 48 8d 74 24 f3 lea -0xd(%rsp),%rsi
102a: ba 0d 00 00 00 mov $0xd,%edx
102f: 48 b8 68 65 6c 6c 6f movabs $0x6f77206f6c6c6568,%rax
1036: 20 77 6f
1039: 48 89 44 24 f3 mov %rax,-0xd(%rsp)
103e: b8 01 00 00 00 mov $0x1,%eax
1043: c7 44 24 fb 72 6c 64 movl $0xa646c72,-0x5(%rsp)
104a: 0a
104b: 89 c7 mov %eax,%edi
104d: 0f 05 syscall
104f: 31 ff xor %edi,%edi
1051: 48 83 f8 0d cmp $0xd,%rax
1055: b8 3c 00 00 00 mov $0x3c,%eax
105a: 40 0f 95 c7 setne %dil
105e: 0f 05 syscall
1060: c3 retq
It is interesting to observe that in this case GCC found a slightly shorter equivalent encoding by picking:
104b: 89 c7 mov %eax,%edi
to set the fd to 1, which equals the 1 from the syscall number, rather than a more direct:
1025: bf 01 00 00 00 mov $0x1,%edi
For an in-depth discussion of the calling conventions, see also: What are the calling conventions for UNIX & Linux system calls on i386 and x86-64
Tested in Ubuntu 18.10, GCC 8.2.0.
Source: https://stackoverflow.com/questions/9506353/how-to-invoke-a-system-call-via-sysenter-in-inline-assembly