How do unix signals work?


星月不相逢 2020-12-13 10:34

How do signals work in unix? I went through W.R. Stevens but was unable to understand. Please help me.

4 answers

失恋的感觉 (OP)

2020-12-13 11:11
Some issues that none of the above statements address are multi-core systems, receiving a signal while running in kernel space, receiving a signal while sleeping in kernel space, system call restarting, and signal handler latency.

Here are a couple of issues to consider:
- What if the kernel knows that a signal needs to be delivered to process X, which is running on CPU_X, but learns about it while running on CPU_Y (CPU_X != CPU_Y)? The kernel then needs to stop a process that is running on a different core.
- What if the process is running in kernel space while receiving a signal? Every time a process makes a system call it enters kernel space and tinkers with data structures and memory allocations in kernel space. Does all of this hacking take place in kernel space too?
- What if the process is sleeping in kernel space waiting for some other event? (read, write, signal, poll, mutex are just some options).
Answers:
- If the process is running on another CPU, the kernel uses cross-CPU communication (an inter-processor interrupt) to interrupt the other CPU and pass it a message. The other CPU saves state in hardware, jumps into the kernel, and delivers the signal there. This keeps the signal handler on the CPU where the process is already running; executing it on another CPU would break cache locality.
- If the process is running in kernel space it is not interrupted. Instead, the kernel records that this process has received a signal. When the process exits kernel space (at the end of each system call), the kernel sets up the trampoline to execute the signal handler.
- If the process, while running in kernel space and after having received a signal, reaches a sleep function, that sleep function (this is common to all interruptible sleep functions within the kernel) checks whether the process has a signal pending. If so, it does not put the process to sleep. Instead, the kernel code undoes what it did on the way down into the kernel and exits to user space, setting up a trampoline to execute the signal handler and then restart the system call. You can control which signals interrupt system calls and which do not with the siginterrupt(3) library function. You can decide whether system calls are restartable for a certain signal when you register its handler using sigaction(2) with the SA_RESTART flag. If a system call is cut off by a signal and is not restarted automatically, it fails with errno set to EINTR (interrupted), and you must handle that case. You can also look at the restart_syscall(2) system call for more details.
- If the process is already sleeping/waiting in kernel space (actually all sleeping/waiting is always in kernel space), it is woken from the sleep, the kernel code cleans up after itself, and the process jumps to the signal handler on return to user space, after which the system call is automatically restarted if the user so desired (very similar to the previous explanation of what happens if the process is running in kernel space).
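The difference between an interrupted and a restarted system call described in the answers above can be sketched in C. This is a minimal illustration, not a definitive implementation: the names interrupted_read, on_alarm and wake_fd are made up for this example, and it assumes a Linux-like system where read(2) on a pipe is restartable. The handler writes one byte into the pipe so that a restarted read(2) has something to return.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper state: write end of the pipe, used by the handler. */
static int wake_fd = -1;

static void on_alarm(int signo)
{
    (void)signo;
    int saved_errno = errno;            /* handlers should preserve errno */
    char byte = 'x';
    ssize_t r = write(wake_fd, &byte, 1); /* write(2) is async-signal-safe */
    (void)r;
    errno = saved_errno;
}

/* Blocks in read(2) on an empty pipe and lets SIGALRM interrupt it.
 * use_restart != 0 installs the handler with SA_RESTART.
 * Returns the result of read(2) (or -2 on setup failure) and stores
 * the errno observed right after the call in *err. */
int interrupted_read(int use_restart, int *err)
{
    assert(err != NULL);

    int fds[2];
    if (pipe(fds) != 0) { *err = errno; return -2; }
    wake_fd = fds[1];

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = use_restart ? SA_RESTART : 0;
    if (sigaction(SIGALRM, &sa, NULL) != 0) { *err = errno; return -2; }

    /* One-shot timer: SIGALRM arrives about 100 ms from now,
     * by which time the process is asleep inside read(2). */
    struct itimerval tv = { .it_value = { .tv_sec = 0, .tv_usec = 100000 } };
    if (setitimer(ITIMER_REAL, &tv, NULL) != 0) { *err = errno; return -2; }

    char buf;
    errno = 0;
    ssize_t n = read(fds[0], &buf, 1);
    *err = errno;

    close(fds[0]);
    close(fds[1]);
    return (int)n;
}
```

Under these assumptions, interrupted_read(0, &err) should return -1 with err set to EINTR, because the handler runs and the call is not restarted. interrupted_read(1, &err) should return 1, because the kernel restarts read(2) after the handler has put a byte in the pipe.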
A few notes about why all of this is so complex:
- You cannot just stop a process running in kernel space, since kernel code allocates memory, modifies data structures and more. If you just take the control away you will corrupt the kernel state and cause a machine hang. The kernel code must be notified in a controlled way that it must stop running, return to user space and allow user space to handle the signal. This is done via the return value of all (well, almost all) sleeping functions in the kernel. Kernel programmers are expected to treat those return values with respect and act accordingly.
- Signals are asynchronous. This means that they should be delivered as soon as possible. Imagine a process that has only one thread, went to sleep for an hour, and is delivered a signal. The sleep is inside the kernel. So you expect the kernel code to wake up, clean up after itself, return to user space and execute the signal handler, possibly restarting the system call after the signal handler has finished. You certainly do not expect that process to execute the signal handler only an hour later. After the handler runs, you expect the sleep to resume. Great trouble is taken by the user space and kernel people to allow just that.
- All in all, signals are like interrupt handlers but for user space. This is a good analogy but not a perfect one. While interrupts are generated by hardware, only some signals originate from hardware; most are generated by software (a signal about a child process dying, a signal from another process using the kill(2) syscall and more).
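The sleeping-process scenario from the notes above can be tried out in miniature. The sketch below is illustrative only: sleep_resuming and demo_interrupted_sleep are hypothetical names, and the 50 ms and 200 ms durations are arbitrary. On Linux, nanosleep(2) is one of the calls that is not restarted automatically after a handler runs, so here the caller resumes the sleep itself using the remaining time that nanosleep(2) reports.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Counts how many times the handler ran. */
static volatile sig_atomic_t handler_runs = 0;

static void on_alarm(int signo)
{
    (void)signo;
    handler_runs++;
}

/* Sleeps for ms milliseconds in total, resuming after every interruption.
 * Returns how many times nanosleep(2) was cut short by a signal,
 * or -1 on any other error. */
int sleep_resuming(long ms)
{
    assert(ms >= 0);
    struct timespec req = { .tv_sec = ms / 1000,
                            .tv_nsec = (ms % 1000) * 1000000L };
    struct timespec rem;
    int interruptions = 0;

    while (nanosleep(&req, &rem) != 0) {
        if (errno != EINTR)
            return -1;
        interruptions++;
        req = rem;              /* continue with the unslept remainder */
    }
    return interruptions;
}

/* Arms a one-shot SIGALRM about 50 ms from now, then sleeps 200 ms. */
int demo_interrupted_sleep(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    struct itimerval tv = { .it_value = { .tv_sec = 0, .tv_usec = 50000 } };
    if (setitimer(ITIMER_REAL, &tv, NULL) != 0)
        return -1;

    return sleep_resuming(200);
}
```

The handler runs roughly 50 ms into the sleep rather than after it completes, and the loop then sleeps for the remaining time. With no other signals arriving, demo_interrupted_sleep() should report exactly one interruption.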
So what is the latency of signal handling?
- If some other process is running when you get a signal, then it is up to the kernel scheduler to decide whether to let the other process finish its time slice before delivering the signal. If you are on a regular Linux/Unix system this means that you could be delayed by one or more time slices before you get the signal (which means milliseconds, which are equivalent to eternity).
- When you get a signal, if your process is high-priority or other processes already got their time slice you will get the signal quite fast. If you are running in user space you will get it "immediately", if you are running in kernel space you will shortly reach a sleep function or return from kernel in which case when you return to user space your signal handler will be called. That is usually a short time since not a lot of time is spent in the kernel.
- If you are sleeping in the kernel, and nothing else is above your priority or needs to run, the kernel thread handling your system call is woken up, cleans up after all the stuff it did on the way down into the kernel, goes back to user space and executes your signal handler. This doesn't take too long (we're talking microseconds here).
- If you are running a real-time version of Linux and your process has the highest real-time priority, then you will get the signal very soon after it is triggered. We're talking 50 microseconds or even better (depends on other factors that I cannot go into).
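To get a rough feel for the "running in user space" case above, a process can signal itself and timestamp the handler. This is a sketch under stated assumptions, not a benchmark: self_signal_latency_ns and on_usr1 are hypothetical names, SIGUSR1 is assumed to be unblocked, and raise(3) is used instead of kill(2) because it does not return until the handler has run. It measures only the delivery path of the already-running process, not scheduler delays or the real-time case.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <time.h>

static struct timespec handled_at;          /* filled in by the handler */
static volatile sig_atomic_t handled = 0;

static void on_usr1(int signo)
{
    (void)signo;
    /* clock_gettime() is on the POSIX list of async-signal-safe functions */
    clock_gettime(CLOCK_MONOTONIC, &handled_at);
    handled = 1;
}

/* Returns the nanoseconds between calling raise() and the handler
 * taking its timestamp, or -1 if setup fails. */
long long self_signal_latency_ns(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    handled = 0;
    struct timespec sent_at;
    clock_gettime(CLOCK_MONOTONIC, &sent_at);

    if (raise(SIGUSR1) != 0)
        return -1;
    assert(handled);    /* raise() returns only after the handler has run */

    return (long long)(handled_at.tv_sec - sent_at.tv_sec) * 1000000000LL
         + (handled_at.tv_nsec - sent_at.tv_nsec);
}
```

The number this returns varies by machine and load, so treat it as an order-of-magnitude indication only.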