In my application, there is a io-thread, that is dedicated for
To make the picture presented by the accepted answer complete, the following basic fact should be mentioned: both select() and pselect() may fail with errno set to EINTR, as stated in their man pages:
EINTR A signal was caught; see signal(7).
This "caught" means that the signal should be recognized as "occurred during the system call execution":
1. If a non-masked signal occurs during select/pselect execution, then select/pselect will exit.
2. If a non-masked signal occurs before select/pselect has been called, this will not have any effect and select/pselect will continue waiting, potentially forever.
So if a signal occurs during select/pselect execution we are OK: the call will be interrupted, and we can then test the reason for the exit, discover that it was EINTR, and exit the loop.
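A minimal sketch of this "we are OK" case, assuming a single-threaded program in which SIGALRM is not blocked. The names select_interrupted_by_signal and on_alarm are hypothetical, and alarm() is used only to make a signal arrive while select() is sleeping:

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/* The handler only has to exist so that the signal counts as "caught". */
static void on_alarm(int signo) { (void)signo; }

/* Returns 1 if select() failed with EINTR, 0 otherwise. */
int select_interrupted_by_signal(void)
{
    int fds[2];
    int rc = pipe(fds);              /* nothing is ever written, so select() must wait */
    assert(rc == 0);

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGALRM, &sa, NULL);
    assert(rc == 0);

    alarm(1);                        /* SIGALRM arrives about one second into the wait */

    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fds[0], &readfds);
    struct timeval timeout = { 5, 0 };   /* safety net so the demo cannot hang */

    int result = select(fds[0] + 1, &readfds, NULL, NULL, &timeout);
    int interrupted = (result == -1 && errno == EINTR);   /* test the reason for the exit */

    close(fds[0]);
    close(fds[1]);
    return interrupted;
}
```

Calling select_interrupted_by_signal() from main() should return 1 after roughly one second, since the signal lands while select() is waiting.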
The real threat is a signal occurring outside of select/pselect execution; then we may hang in the system call forever. Any attempt to discover this "outsider" signal by naive means:
if (was_a_signal) {
...
}
will fail since no matter how close this test will be to the call of select/pselect there is always a possibility that the signal will occur just after the test and before the call to select/pselect.
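The failure can be reproduced on purpose. In this sketch (the names early_signal_is_missed and note_signal are hypothetical, and SIGUSR1 is assumed to be unblocked in a single-threaded program) the signal is raised right after the naive test and right before select(), so select() sleeps through its whole timeout even though the flag is already set:

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() under strict -std= modes */
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>

static volatile sig_atomic_t was_a_signal = 0;

static void note_signal(int signo) { (void)signo; was_a_signal = 1; }

/* Returns 1 if select() timed out although the signal had already been handled. */
int early_signal_is_missed(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = note_signal;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    if (was_a_signal)        /* the naive test: nothing has been seen yet */
        return 0;

    raise(SIGUSR1);          /* the signal lands after the test, before select() */

    /* With no timeout this call would wait forever; one second keeps the demo finite. */
    struct timeval timeout = { 1, 0 };
    int result = select(0, NULL, NULL, NULL, &timeout);

    return result == 0 && was_a_signal == 1;
}
```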
Then, if the only place to catch the signal is during select/pselect execution we should invent some kind of "wine funnel" so all "wine splashes" (signals), even outside of "bottle neck" (select/pselect execution period) will eventually come to the "bottle neck".
But how can you deceive the system call and make it "think" that the signal has occurred during its execution when in reality it has occurred before?
Easy. Here is our "wine funnel": you just block the signal of interest and by that cause it (if it has occurred at all) to wait outside of the process "for the door to be opened", and you "open the door" (unmask the signal) only when you're prepared "to welcome the guest" (select/pselect is running). Then the "arrived" signal will be recognized as "just occurred" and will interrupt the execution of the system call.
Of course, "opening the door" is the most critical part of the plan: it cannot be done by the usual means (first unmask, then call select/pselect). The only possibility is to do both actions (unmask and system call) at once (atomically). This is what pselect() is capable of but select() is not.
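The "funnel" can be sketched as a small test function. This is an illustration under assumptions, not code from the accepted answer: the names early_signal_interrupts_pselect, on_usr1 and guest_arrived are made up, SIGUSR1 stands in for the signal of interest, and the program is assumed to be single-threaded (multi-threaded code would use pthread_sigmask instead of sigprocmask):

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() and pselect() under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>

static volatile sig_atomic_t guest_arrived = 0;

static void on_usr1(int signo) { (void)signo; guest_arrived = 1; }

/* Returns 1 if a signal raised BEFORE pselect() still interrupted it. */
int early_signal_interrupts_pselect(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    /* Close the door: block SIGUSR1 and remember the previous mask. */
    sigset_t blocked, previous, wait_mask;
    sigemptyset(&blocked);
    sigaddset(&blocked, SIGUSR1);
    rc = sigprocmask(SIG_BLOCK, &blocked, &previous);
    assert(rc == 0);

    raise(SIGUSR1);                      /* arrives "too early" and stays pending */
    int handled_early = guest_arrived;   /* still 0: the handler has not run */

    /* Open the door only inside pselect(): its mask has SIGUSR1 unblocked. */
    wait_mask = previous;
    sigdelset(&wait_mask, SIGUSR1);
    struct timespec timeout = { 5, 0 };  /* safety net so the demo cannot hang */
    int result = pselect(0, NULL, NULL, NULL, &timeout, &wait_mask);
    int interrupted = (result == -1 && errno == EINTR);

    sigprocmask(SIG_SETMASK, &previous, NULL);   /* restore the caller's mask */
    return !handled_early && interrupted && guest_arrived;
}
```

The pending signal is delivered as soon as pselect() swaps in wait_mask, so the call fails with EINTR almost immediately instead of waiting for the timeout.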
The accepted answer is not correct vis-a-vis the difference between select and pselect. It does describe well how a race condition between a signal handler and select can arise, but it is incorrect in how it uses pselect to solve the problem. It misses the main point about pselect, which is that it waits for EITHER the file descriptor or the signal to become ready. pselect returns when either of these is ready. Select ONLY waits on the file descriptor. Select ignores signals. See this blog post for a good working example: https://www.linuxprogrammingblog.com/code-examples/using-pselect-to-avoid-a-signal-race
Between (p)select and (p)poll is a rather subtle difference:
For select, you have to initialize and populate the ugly fd_set bitmaps every time before you call select, because select modifies them in place in a "destructive" fashion. (poll distinguishes between the .events and .revents members in struct pollfd.)
After selecting, the entire bitmap is often scanned (by people/code) for events even if most of the fds are not even watched.
Third, the bitmap can only deal with fds whose number is less than a certain limit (contemporary implementations: somewhere between 1024..4096), which rules it out in programs where high fds can easily be attained (notwithstanding that such programs are likely to already use epoll instead).
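The first and third points can be seen in a few lines. This is only a sketch with hypothetical function names; it assumes fd is a readable-end descriptor with no data waiting (for example, the read end of a fresh pipe):

```c
#include <assert.h>
#include <poll.h>
#include <sys/select.h>
#include <unistd.h>

/* Returns 1 if select() wiped the watched fd out of the set because it was not ready. */
int select_overwrites_input_set(int fd)
{
    assert(fd >= 0 && fd < FD_SETSIZE);   /* the bitmap cannot hold larger fds */

    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    struct timeval no_wait = { 0, 0 };    /* return immediately */
    int ready = select(fd + 1, &readfds, NULL, NULL, &no_wait);

    /* The same set is both input and output, so it must be rebuilt before the next call. */
    return ready == 0 && !FD_ISSET(fd, &readfds);
}

/* Returns 1 if poll() left the request in .events untouched and reported via .revents. */
int poll_preserves_request(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    int ready = poll(&pfd, 1, 0);         /* timeout 0: return immediately */

    /* .events still says what we asked for, so pfd can be reused as is. */
    return ready == 0 && pfd.events == POLLIN && pfd.revents == 0;
}
```

Passing the read end of an empty pipe to both functions should return 1 from each: select() has cleared the bit, while poll() has left the request intact.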
I'd suggest starting the comparison with select() vs poll(). Linux also provides both pselect() and ppoll(); and the extra const sigset_t * argument to pselect() and ppoll() (vs select() and poll()) has the same effect on each "p-variant", as it were. If you are not using signals, you have no race to protect against, so the base question is really about efficiency and ease of programming.
Meanwhile there's already a stackoverflow.com answer here: what are the differences between poll and select.
As for the race: once you start using signals (for whatever reason), you will learn that in general, a signal handler should just set a variable of type volatile sig_atomic_t to indicate that the signal has been detected. The fundamental reason for this is that many library calls are not re-entrant, and a signal can be delivered while you're "in the middle of" such a routine. For instance, simply printing a message to a stream-style data structure such as stdout (C) or cout (C++) can lead to re-entrancy issues.
Suppose you have code that uses a volatile sig_atomic_t flag variable, perhaps to catch SIGINT, something like this (see also http://pubs.opengroup.org/onlinepubs/007904975/functions/sigaction.html):
volatile sig_atomic_t got_interrupted = 0;

void caught_signal(int unused) {
    got_interrupted = 1;
}
...
struct sigaction sa;
sa.sa_handler = caught_signal;
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_RESTART;
if (sigaction(SIGINT, &sa, NULL) == -1) ... handle error ...
...
Now, in the main body of your code, you might want to "run until interrupted":
while (!got_interrupted) {
    ... do some work ...
}
This is fine up until you start needing to make calls that wait for some input/output, such as select or poll. The "wait" action needs to wait for that I/O, but it also needs to wait for a SIGINT interrupt. If you just write:
while (!got_interrupted) {
    ... do some work ...
    result = select(...); /* or result = poll(...) */
}
then it's possible that the interrupt will happen just before you call select() or poll(), rather than afterward. In this case, you did get interrupted, and the variable got_interrupted gets set, but after that, you start waiting. You should have checked the got_interrupted variable before you started waiting, not after.
You can try writing:
while (!got_interrupted) {
    ... do some work ...
    if (!got_interrupted)
        result = select(...); /* or result = poll(...) */
}
This shrinks the "race window", because now you'll detect the interrupt if it happens while you're in the "do some work" code; but there is still a race, because the interrupt can happen right after you test the variable, but right before the select-or-poll.
The solution is to make the "test, then wait" sequence "atomic", using the signal-blocking properties of sigprocmask (or, in POSIX threaded code, pthread_sigmask):
sigset_t mask, omask;
...
while (!got_interrupted) {
    ... do some work ...
    /* begin critical section, test got_interrupted atomically */
    sigemptyset(&mask);
    sigaddset(&mask, SIGINT);
    if (sigprocmask(SIG_BLOCK, &mask, &omask))
        ... handle error ...
    if (got_interrupted) {
        sigprocmask(SIG_SETMASK, &omask, NULL); /* restore old signal mask */
        break;
    }
    result = pselect(..., &omask); /* or ppoll() etc */
    sigprocmask(SIG_SETMASK, &omask, NULL);
    /* end critical section */
}
(The above code is actually not that great; it's structured for illustration rather than efficiency. It's more efficient to do the signal mask manipulation slightly differently, and to place the "got interrupted" tests differently.)
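One possible restructuring along those lines, offered as a sketch rather than as the intended "more efficient" version: block SIGINT once before the loop, so the handler can only ever run inside pselect(), and the loop condition becomes the only test needed. The function name run_until_interrupted, the separate wait_mask, the one-second timeout and the empty descriptor sets are all assumptions made to keep the example self-contained; the trade-off is that the "do some work" part now runs with SIGINT blocked.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() and pselect() under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>

static volatile sig_atomic_t got_interrupted = 0;

static void caught_signal(int unused) { (void)unused; got_interrupted = 1; }

/* Loops until SIGINT is caught. Returns the number of pselect() calls made, or -1 on error. */
int run_until_interrupted(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = caught_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, NULL) == -1)
        return -1;

    /* Block SIGINT once, outside the loop, instead of on every iteration. */
    sigset_t mask, omask, wait_mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGINT);
    if (sigprocmask(SIG_BLOCK, &mask, &omask) == -1)
        return -1;

    /* Mask used only while waiting: the old mask with SIGINT guaranteed unblocked. */
    wait_mask = omask;
    sigdelset(&wait_mask, SIGINT);

    int waits = 0;
    while (!got_interrupted) {
        /* ... do some work; a SIGINT arriving here stays pending ... */

        struct timespec timeout = { 1, 0 };   /* stands in for real I/O readiness */
        int result = pselect(0, NULL, NULL, NULL, &timeout, &wait_mask);
        waits++;
        if (result == -1 && errno != EINTR)
            break;                            /* a real error, not an interruption */
    }

    sigprocmask(SIG_SETMASK, &omask, NULL);   /* restore the old signal mask */
    return waits;
}
```

If a SIGINT is already pending when the function is entered, the first pselect() call is interrupted at once and the function returns 1.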
Until you actually start needing to catch SIGINT, though, you need only compare select() and poll() (and if you start needing large numbers of descriptors, some of the event-based stuff like epoll() is more efficient than either one).