How to access the system call from user-space?

Submitted by 牧云@^-^@ on 2019-11-26 16:10:55

You should first understand the role of the Linux kernel, and that applications interact with the kernel only through system calls.

In effect, an application runs on the "virtual machine" provided by the kernel: it runs in user space and can only execute (at the lowest machine level) the machine instructions permitted in user CPU mode, augmented by the instruction (e.g. SYSENTER or INT 0x80 ...) used to make system calls. So, from the point of view of a user-level application, a syscall is an atomic pseudo machine instruction.

The Linux Assembly Howto explains how a syscall can be done at the assembly (i.e. machine instruction) level.

The GNU libc provides C functions corresponding to the syscalls. For example, the open function is a thin piece of glue (i.e. a wrapper) around the syscall numbered __NR_open (it makes the syscall, then updates errno). Applications usually call such C functions in libc instead of making the syscall directly.
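
To make the "syscall, then errno" behaviour concrete, here is a minimal sketch in C. The helper name open_errno is made up for illustration; it relies only on the standard open(2) and close(2) wrappers. Current glibc versions may implement open() on top of the openat syscall, which does not change the errno convention shown here.

```c
#include <errno.h>
#include <fcntl.h>    /* open(), O_RDONLY */
#include <unistd.h>   /* close() */

/* Hypothetical helper: calls the libc open() wrapper on a path.
 * Returns 0 on success, or the errno value the wrapper stored on failure. */
int open_errno(const char *path)
{
    errno = 0;
    int fd = open(path, O_RDONLY);   /* libc wrapper around the syscall */
    if (fd == -1)
        return errno;                /* wrapper returned -1 and set errno */
    close(fd);
    return 0;
}
```

Calling open_errno on a path that does not exist should yield ENOENT, because the wrapper turns the kernel's failure result into -1 plus errno.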

You could use some other libc. For instance, musl libc is somewhat "simpler" and its code is perhaps easier to read. It also wraps the raw syscalls in corresponding C functions.

If you add your own syscall, you had better also implement a similar C function (in your own library). So you should also have a header file for your library.
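
As a rough sketch of what such a wrapper and its header could look like, the block below uses the generic syscall(2) function. Since a self-made syscall number cannot exist on a stock kernel, SYS_gettid is used as a stand-in; the names my_gettid, MY_SYSCALL_NR and mysyscalls.h are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* make sure unistd.h declares syscall() */
#endif
#include <sys/syscall.h>     /* SYS_gettid */
#include <unistd.h>          /* syscall() */

/* --- what would go into your header, e.g. a hypothetical mysyscalls.h --- */
long my_gettid(void);

/* --- what would go into your library source file --- */

/* Stand-in for "your own syscall number" (assumption: a real number
 * is used here so that the sketch runs on an unmodified kernel). */
#define MY_SYSCALL_NR SYS_gettid

long my_gettid(void)
{
    /* syscall(2) places the number and arguments in the right registers,
     * enters the kernel, and converts a failure into -1 plus errno. */
    return syscall(MY_SYSCALL_NR);
}
```

For a real custom syscall you would replace MY_SYSCALL_NR with the number assigned in your patched kernel and pass the arguments it expects.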

See also intro(2) and syscall(2) and syscalls(2) man pages, and the role of VDSO in syscalls.

Notice that syscalls are not C functions. They don't use the call stack (they could even be invoked without any stack). A syscall is basically a number like __NR_open from <asm/unistd.h>, plus a machine instruction such as SYSENTER, plus conventions about which registers hold the arguments before the syscall and which ones hold the result[s] afterwards (including the failure result, used to set errno in the C library wrapping the syscall). The conventions for syscalls are not the calling conventions for C functions in the ABI spec (e.g. the x86-64 psABI). So you need a C wrapper.

First, I would like to give a definition of a system call. A system call is a synchronous, explicit request for a particular kernel service from a user-space application. Synchronous means that the system call is triggered by the instruction sequence being executed. Interrupts are an example of an asynchronous system service request, because they arrive at the kernel independently of the code executing on the processor. Exceptions, in contrast to system calls, are synchronous but implicit requests for kernel services.

A system call consists of four stages:

  1. Passing control to a particular point in the kernel, switching the processor from user mode to kernel mode, and later returning with the processor switched back to user mode.
  2. Specifying the id of the requested kernel service.
  3. Passing the parameters for the requested service.
  4. Capturing the result of the service.

In general, all these actions can be implemented as part of one big library function that performs a number of auxiliary actions before and/or after the actual system call. In this case we can say that the system call is embedded in this function, but the function itself isn't a system call. Alternatively, we can have a tiny function that performs only these four steps and nothing more. In this case we can say that the function is a system call. You can also implement the system call yourself by coding all four stages by hand. Note that in this case you will be forced to use assembly language, because all these steps are entirely architecture-dependent.
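
To illustrate the first case (a library function with a system call embedded in it), here is a small sketch. The function write_all is a made-up example, not part of any libc: it wraps the write syscall with auxiliary work, namely retrying after signals and looping over partial writes.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>   /* write(), ssize_t */

/* Hypothetical helper: writes all len bytes of buf to fd.
 * Returns 0 on success, -1 on error (errno is set by write()). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);   /* the embedded system call */
        if (n < 0) {
            if (errno == EINTR)
                continue;                /* interrupted by a signal: retry */
            return -1;
        }
        p += n;                          /* partial write: advance and loop */
        len -= (size_t)n;
    }
    return 0;
}
```

The function contains a system call, but it is not one itself: a single call to write_all may enter the kernel several times.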

For example, the Linux/i386 environment has the following system call convention:

  1. Passing control from user mode to kernel mode can be done by the software interrupt 0x80 (assembly instruction INT 0x80), by the SYSCALL instruction (AMD) or by the SYSENTER instruction (Intel).
  2. The id of the requested system service is specified by the integer value stored in the EAX register on entry to kernel mode. The kernel service id must be defined in the form __NR_<name>. You can find the generic list of system service ids in the Linux source tree at include/uapi/asm-generic/unistd.h; in recent kernel trees the x86 numbers are generated from the tables under arch/x86/entry/syscalls/.
  3. Up to 6 parameters can be passed through the registers EBX(1), ECX(2), EDX(3), ESI(4), EDI(5), EBP(6). The number in brackets is the position of the parameter.
  4. The kernel returns the status of the service in the EAX register. On failure this is a negated error code, which glibc usually uses to set the errno variable.
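
The four stages can be sketched with GNU C inline assembly. This is an illustration under assumptions, not glibc's code: raw_write is a made-up name, the i386 branch follows the convention listed above, and the x86-64 branch (which this text does not cover) is added only so the sketch also builds on 64-bit machines. Very old compilers may reject the EBX constraint in position-independent i386 code.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_write: the service id for the native ABI */
#include <unistd.h>        /* syscall(), used only by the fallback branch */

/* Hypothetical raw wrapper: returns the kernel's value unchanged, i.e. a
 * byte count on success or a negated errno code (e.g. -EBADF) on failure. */
long raw_write(int fd, const void *buf, size_t len)
{
    long ret;
#if defined(__i386__)
    /* Stage 2: id in EAX. Stage 3: arguments in EBX, ECX, EDX.
     * Stage 1: INT 0x80 enters the kernel. Stage 4: result in EAX. */
    __asm__ volatile ("int $0x80"
                      : "=a"(ret)
                      : "a"(SYS_write), "b"(fd), "c"(buf), "d"(len)
                      : "memory");
#elif defined(__x86_64__)
    /* Assumption beyond the text: x86-64 uses RAX for the id and
     * RDI, RSI, RDX for the first three arguments; the SYSCALL
     * instruction clobbers RCX and R11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"((long)SYS_write), "D"((long)fd), "S"(buf), "d"(len)
                      : "rcx", "r11", "memory");
#else
    /* Other architectures: use syscall(2) and undo its errno translation. */
    ret = syscall(SYS_write, fd, buf, len);
    if (ret == -1)
        ret = -errno;
#endif
    return ret;
}
```

Unlike the libc write() wrapper, raw_write never touches errno on x86: a failed call simply returns a small negative number such as -EBADF.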

In modern versions of Linux there is no _syscall macro any more (as far as I know). Instead, glibc, the main interface library to the Linux kernel, provides a special macro, INTERNAL_SYSCALL, which expands into a small piece of code made up of inline assembler instructions. This code targets a particular hardware platform and implements all stages of the system call, so the macro represents the system call itself. There is also another macro, INLINE_SYSCALL. It provides glibc-style error handling: on a failed system call, -1 is returned and the error number is stored in the errno variable. Both macros are defined in sysdep.h of the glibc package. Note that sysdep.h is part of the glibc sources rather than an installed header, so the snippets below are meant to be built inside the glibc tree.

You can invoke a system call as follows:
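
The difference between the two macros comes down to how the raw kernel result is post-processed. The following sketch shows that conversion in plain C; to_libc_result is a hypothetical name and the code is only loosely modelled on the behaviour described above, not copied from glibc.

```c
#include <errno.h>

/* Hypothetical helper: converts a raw kernel return value into the
 * libc convention of "-1 and errno" on failure. */
long to_libc_result(long raw)
{
    /* Linux reserves the values -4095..-1 for negated error codes. */
    if (raw < 0 && raw >= -4095) {
        errno = (int)-raw;   /* e.g. raw == -ENOENT gives errno == ENOENT */
        return -1;
    }
    return raw;              /* success: pass the value through unchanged */
}
```

A raw result of -ENOENT therefore becomes a return value of -1 with errno set to ENOENT, while a successful result such as a file descriptor is returned as is.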

#include <sysdep.h>

#define __NR_<name> <id>

int my_syscall(void)
{
    return INLINE_SYSCALL(<name>, <argc>, <argv>);
}

where <name> must be replaced by the syscall name, <id> by the number of the desired system service, <argc> by the actual number of parameters (from 0 to 6) and <argv> by the actual parameters separated by commas (and preceded by a comma if any parameters are present).

For example:

#include <sysdep.h>

#define __NR_exit 1

int _exit(int status)
{
    return INLINE_SYSCALL(exit, 1, status); // takes 1 parameter "status"
}

or another example:

#include <sysdep.h>

#define __NR_fork 2 

int _fork(void)
{
    return INLINE_SYSCALL(fork, 0); // takes no parameters
}
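
Outside the glibc source tree, a similar effect can be had with the public syscall(2) function mentioned earlier. The sketch below mirrors the exit example; my_exit and exit_status_via_child are made-up names, and the child-process helper exists only so that the result can be observed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>   /* SYS_exit */
#include <sys/types.h>     /* pid_t */
#include <sys/wait.h>      /* waitpid(), WIFEXITED, WEXITSTATUS */
#include <unistd.h>        /* syscall(), fork(), _exit() */

/* Hypothetical counterpart of the exit example, built on syscall(2). */
void my_exit(int status)
{
    syscall(SYS_exit, status);   /* does not return on success */
}

/* Hypothetical helper: runs my_exit() in a child process and returns
 * the exit status seen by the parent, or -1 if something went wrong. */
int exit_status_via_child(int status)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        my_exit(status);         /* child terminates here */
        _exit(127);              /* only reached if the raw call failed */
    }

    int wstatus;
    if (waitpid(pid, &wstatus, 0) != pid || !WIFEXITED(wstatus))
        return -1;
    return WEXITSTATUS(wstatus);
}
```

Note that the exit syscall ends only the calling thread; in the single-threaded child used here that is the whole process.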

Minimal runnable assembly example

hello_world.asm:

section .rodata
    hello_world db "hello world", 10
    hello_world_len equ $ - hello_world
section .text
    global _start
    _start:
        mov eax, 4               ; syscall number: write
        mov ebx, 1               ; stdout
        mov ecx, hello_world     ; buffer
        mov edx, hello_world_len
        int 0x80                 ; make the call
        mov eax, 1               ; syscall number: exit
        mov ebx, 0               ; exit status
        int 0x80

Compile and run:

nasm -w+all -f elf32 -o hello_world.o hello_world.asm
ld -m elf_i386 -o hello_world hello_world.o
./hello_world

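From the code, it is easy to deduce that EAX holds the syscall number (4 for write, 1 for exit), that EBX, ECX and EDX hold the first three arguments, and that int 0x80 hands control to the kernel.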

Of course, assembly will get tedious quickly, and you will soon want to use the C wrappers provided by glibc / POSIX whenever you can, or the SYSCALL macro when you can't.
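
For comparison, the write call from hello_world.asm might look like this when it goes through the libc wrapper. The function name say_hello is made up for this sketch.

```c
#include <unistd.h>   /* write(), STDOUT_FILENO */

/* Hypothetical C counterpart of the write call in hello_world.asm.
 * Returns the number of bytes written, or -1 on error. */
long say_hello(int fd)
{
    static const char msg[] = "hello world\n";
    /* sizeof msg counts the trailing NUL byte, so subtract 1 */
    return (long)write(fd, msg, sizeof msg - 1);
}
```

Calling say_hello(STDOUT_FILENO) from main prints the same line as the assembly program, with the register setup left to the C library.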
