Inspired by this question
How can I force GDB to disassemble?
and related to this one
What is INT 21h?
How does an actually system call happen under linux? what happens when the call is performed, until the actual kernel routine is invoked ?
Assuming we're talking about x86:
- The ID of the system call is deposited into the EAX register
- Any arguments required by the system call are deposited into the locations dictated by the system call. For example, some system calls expect their argument to reside in the EBX register. Others may expect their argument to be sitting on the top of the stack.
INT 0x80interrupt is invoked.
- The Linux kernel services the system call identified by the ID in the EAX register, depositing any results in pre-determined locations.
- The calling code makes use of any results.
I may be a bit rusty at this, it's been a few years...
The given answers are correct but I would like to add that there are more mechanisms to enter kernel mode. Every recent kernel maps the "vsyscall" page in every process' address space. It contains little more than the most efficient syscall trap method.
For example on a regular 32 bit system it could contain:
0xffffe000: int $0x80 0xffffe002: ret
But on my 64-bitsystem I have access to the way more efficient method using the syscall/sysenter instructions
0xffffe000: push %ecx 0xffffe001: push %edx 0xffffe002: push %ebp 0xffffe003: mov %esp,%ebp 0xffffe005: sysenter 0xffffe007: nop 0xffffe008: nop 0xffffe009: nop 0xffffe00a: nop 0xffffe00b: nop 0xffffe00c: nop 0xffffe00d: nop 0xffffe00e: jmp 0xffffe003 0xffffe010: pop %ebp 0xffffe011: pop %edx 0xffffe012: pop %ecx 0xffffe013: ret
This vsyscall page also maps some systemcalls that can be done without a context switch. I know certain gettimeofday, time and getcpu are mapped there, but I imagine getpid could fit in there just as well.
This is already answered at
How is the system call in Linux implemented?
Probably did not match with this question because of the differing "syscall" term usage.
Basically, its very simple: Somewhere in memory lies a table where each syscall number and the address of the corresponding handler is stored (see http://lxr.linux.no/linux+v2.6.30/arch/x86/kernel/syscall_table_32.S for the x86 version)
The INT 0x80 interrupt handler then just takes the arguments out of the registers, puts them on the (kernel) stack, and calls the appropriate syscall handler.