Why exactly does ePoll scale better than Poll?

后端 未结 2 1219
佛祖请我去吃肉
佛祖请我去吃肉 2021-02-13 22:57

Short question but for me its difficult to understand.

Why exactly does ePoll scale better than Poll?

2条回答
  •  慢半拍i
    慢半拍i (楼主)
    2021-02-13 23:38

    The poll system call needs to copy your list of file descriptors to the kernel each time. This happens only once with epoll_ctl, but not every time you call epoll_wait.

    Also, epoll_wait is O(1) in respect of the number of descriptors watched1, which means it does not matter whether you wait on one descriptor or on 5,000 or 50,000 descriptors. poll, while being more efficient than select, still has to walk over the list every time (i.e. it is O(N) in respect of number of descriptors).

    And lastly, epoll can in addition to the "normal" mode work in "edge triggered" mode, which means the kernel does not need keep track of how much data you've read after you've been signalled readiness. This mode is more difficult to grasp, but somewhat more efficient.


    1As correctly pointed out by David Schwartz, epoll_wait is of course still O(N) in respect of events that occur. There is hardly a way it could be any different, with any interface. If N events happen on a descriptor that is watched, then the application needs to get N notifications, and needs to do N "things" in order to react on what's happening.
    This is again slightly, but not fundamentally different in edge triggered mode, where you actually get M events with M <= N. In edge triggered mode, when the same event (say, POLLIN) happens several times, you will probably get fewer notifications, possibly only a single one. However, this doesn't change much about the big-O notation as such.

    However, epoll_wait is irrespective of the number of descriptors watched. Under the assumption that it is used in the intended, "normal" way (that is, many descriptors, few events), this is what really matters, and here it is indeed O(1).

    As an analogy, you can think of a hash table. A hash table accesses its content in O(1), but one could argue that calculating the hash is actually O(N) in respect of the key length. This is technically absolutely correct, and there probably exist cases where this is a problem, however, for most people, this just doesn't matter.

提交回复
热议问题