Question
I am curious how epoll_wait() learns that a socket registered with epoll_ctl() is ready for reading or writing.
I believe that glibc magically handles it.
Then, is there a document describing how the following events can be triggered for a socket?
- EPOLLPRI
- EPOLLRDNORM
- EPOLLRDBAND
- EPOLLWRNORM
- EPOLLWRBAND
- EPOLLMSG
- EPOLLERR
- EPOLLHUP
- EPOLLRDHUP
Answer 1:
The most glaring problem with the epoll documentation is its failure to state in "bold caps" that epoll events are, in fact, identical to poll(2) events. Indeed, on the kernel side epoll handles its events in terms of the older poll event names:
#define POLLIN 0x0001 // EPOLLIN
#define POLLPRI 0x0002 // EPOLLPRI
#define POLLOUT 0x0004 // EPOLLOUT
#define POLLERR 0x0008 // EPOLLERR
#define POLLHUP 0x0010 // EPOLLHUP
#define POLLNVAL 0x0020 // unused in epoll
#define POLLRDNORM 0x0040 // EPOLLRDNORM
#define POLLRDBAND 0x0080 // EPOLLRDBAND
#define POLLWRNORM 0x0100 // EPOLLWRNORM
#define POLLWRBAND 0x0200 // EPOLLWRBAND
#define POLLMSG 0x0400 // EPOLLMSG
#define POLLREMOVE 0x1000 // unused in epoll
#define POLLRDHUP 0x2000 // EPOLLRDHUP
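The correspondence in the listing above can be checked from user space. The following is only a sketch: the helper name count_flag_mismatches and the table layout are made up for illustration, and the POLL* numbers are copied from the listing rather than taken from a header (some architectures define a few POLL* values differently).

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/epoll.h>

/* POLL* values copied from the listing above, each paired with the
 * EPOLL* flag that <sys/epoll.h> exposes under the matching name. */
static const struct {
    const char *name;
    unsigned poll_value;
    unsigned epoll_value;
} flag_table[] = {
    { "IN",     0x0001, EPOLLIN     },
    { "PRI",    0x0002, EPOLLPRI    },
    { "OUT",    0x0004, EPOLLOUT    },
    { "ERR",    0x0008, EPOLLERR    },
    { "HUP",    0x0010, EPOLLHUP    },
    { "RDNORM", 0x0040, EPOLLRDNORM },
    { "RDBAND", 0x0080, EPOLLRDBAND },
    { "WRNORM", 0x0100, EPOLLWRNORM },
    { "WRBAND", 0x0200, EPOLLWRBAND },
    { "MSG",    0x0400, EPOLLMSG    },
    { "RDHUP",  0x2000, EPOLLRDHUP  },
};

/* Hypothetical helper: prints every pair and returns how many differ. */
int count_flag_mismatches(void)
{
    int mismatches = 0;
    for (size_t i = 0; i < sizeof flag_table / sizeof flag_table[0]; i++) {
        printf("%-6s poll=0x%04x epoll=0x%04x\n", flag_table[i].name,
               flag_table[i].poll_value, flag_table[i].epoll_value);
        if (flag_table[i].poll_value != flag_table[i].epoll_value)
            mismatches++;
    }
    return mismatches;
}
```

Calling count_flag_mismatches() from a small main() should report no differences if the headers on your box agree with the listing.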
Then, a brief inspection of the kernel source reveals that:
- EPOLLIN and EPOLLRDNORM are identical (epoll returns EPOLLIN | EPOLLRDNORM when data is available for reading from the file descriptor).
- EPOLLOUT and EPOLLWRNORM are identical (epoll returns EPOLLOUT | EPOLLWRNORM when buffer space is available for writing).
- EPOLLRDBAND and EPOLLWRBAND signal the availability of out-of-band data on the descriptor (on some sockets this is the data sent with the MSG_OOB flag).
- EPOLLPRI is a modifier flag and always augments some other event (such as EPOLLERR). Its use is subsystem dependent, as it may mean somewhat different things depending on what purpose the associated file descriptor serves.
- EPOLLMSG appears to be unused by the kernel and to serve no purpose.
- EPOLLRDHUP signals that the peer has shut down the writing half of its connection, so nothing more can be read from it, but it may still receive data (handy to establish that no more request data is coming in).
- EPOLLHUP signals that the peer has closed its side of the channel.
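A few of the points above can be observed directly. The sketch below assumes Linux and an AF_UNIX stream socket pair (other socket types may report different masks); probe_events and socketpair_event_demo are hypothetical helper names, not part of any API. Note that epoll only reports the flags you asked for, plus EPOLLERR and EPOLLHUP, which are always reported.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: registers fd on a fresh epoll instance with the
 * given interest mask and returns the events reported right now
 * (0 if nothing is ready or a call fails). */
uint32_t probe_events(int fd, uint32_t interest)
{
    struct epoll_event ev = { .events = interest, .data = { .fd = fd } };
    struct epoll_event out = { 0 };
    uint32_t ready = 0;
    int efd = epoll_create1(0);

    if (efd < 0)
        return 0;
    if (epoll_ctl(efd, EPOLL_CTL_ADD, fd, &ev) == 0 &&
        epoll_wait(efd, &out, 1, 0) == 1)
        ready = out.events;
    close(efd);
    return ready;
}

/* Walks one end of a socket pair through three states and records the
 * event mask seen on the other end after each step. Returns 0 on success. */
int socketpair_event_demo(uint32_t masks[3])
{
    const uint32_t interest = EPOLLIN | EPOLLRDNORM | EPOLLRDHUP;
    int sv[2];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (write(sv[1], "x", 1) != 1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    masks[0] = probe_events(sv[0], interest); /* data is pending */
    shutdown(sv[1], SHUT_WR);
    masks[1] = probe_events(sv[0], interest); /* peer shut down its writing half */
    close(sv[1]);
    masks[2] = probe_events(sv[0], interest); /* peer closed completely */
    close(sv[0]);
    return 0;
}
```

On a typical Linux box one would expect masks[0] to carry EPOLLIN and EPOLLRDNORM together, masks[1] to add EPOLLRDHUP without EPOLLHUP, and masks[2] to include EPOLLHUP even though it was never requested.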
Answer 2:
All the critical work for epoll is done in the kernel; the user-space API is just an interface. The previous thread "Why exactly does ePoll scale better than Poll?" covers how the kernel implements epoll in nice detail.
As for a document describing the events and how they are triggered, the epoll_ctl(2) man page covers each event, for example:
EPOLLIN
The associated file is available for read(2) operations.
EPOLLOUT
The associated file is available for write(2) operations.
For a better description of EPOLLET you need to read the epoll(7) man page.
This is a complete example of how to use epoll.
You use epoll_ctl to request the events you wish to receive. To request EPOLLIN with edge-triggered notification (EPOLLET), the example does this:
event.events = EPOLLIN | EPOLLET;
s = epoll_ctl (efd, EPOLL_CTL_ADD, infd, &event);
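For context, here is a minimal sketch of what usually surrounds that call when EPOLLET is used. It assumes the descriptor should be made non-blocking and drained until read() reports EAGAIN, as epoll(7) recommends for edge-triggered use; add_edge_triggered and drain_fd are hypothetical names, not taken from the example referred to above.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: makes fd non-blocking and registers it on efd
 * for edge-triggered read notifications. Returns 0 on success, -1 on error. */
int add_edge_triggered(int efd, int fd)
{
    struct epoll_event event = { 0 };
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;
    event.data.fd = fd;
    event.events = EPOLLIN | EPOLLET;
    return epoll_ctl(efd, EPOLL_CTL_ADD, fd, &event);
}

/* Hypothetical helper: reads until the descriptor would block (or hits
 * end of file). Returns the number of bytes consumed, or -1 on error. */
ssize_t drain_fd(int fd)
{
    char buf[4096];
    ssize_t total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;
            continue;
        }
        if (n == 0)
            return total;          /* end of file */
        if (errno == EINTR)
            continue;              /* interrupted, retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;          /* nothing left to read for now */
        return -1;
    }
}
```

After epoll_wait() reports EPOLLIN for the descriptor, calling drain_fd() on it empties the buffer; with EPOLLET, no further notification should arrive until new data shows up.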
Source: https://stackoverflow.com/questions/18103093/how-does-a-socket-event-get-propagated-converted-to-epoll