Fastest (most Pythonic) way to consume an iterator

孤城傲影 2020-12-15 20:48

I am curious what the fastest way to consume an iterator would be, and the most Pythonic way.

For example, say that I want to create an iterator with the map

2 answers
  •  余生分开走
    2020-12-15 21:18

    If you only care about CPython, a zero-length deque (collections.deque(it, maxlen=0)) is the fastest way, as demonstrated in user2357112's answer.[1] The same result has been demonstrated on 2.7 and 3.2, on 32- and 64-bit builds, on Windows and Linux, and so on.

    But that relies on an optimization in CPython's C implementation of deque. Other implementations may have no such optimization, which means they end up calling an append for each element.

    In PyPy in particular, there is no such optimization in the source,[2] and the JIT cannot optimize that no-op append out. (It's hard to see how it could avoid at least a guard test each time through the loop.) Of course, that should be nothing compared to the cost of looping in Python… right? But looping in Python is blazing fast in PyPy, almost as fast as a C loop in CPython, so the per-item append actually makes a huge difference.
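
    To picture what a deque without the fast path ends up doing, here is a rough pure-Python model. NaiveZeroDeque is a hypothetical name invented for illustration; it is not PyPy's actual W_Deque code, only a sketch of "one no-op append call per element".

```python
class NaiveZeroDeque:
    """Hypothetical model of a maxlen=0 deque with no fast path (illustration only)."""

    def __init__(self, iterable=()):
        self.extend(iterable)

    def append(self, item):
        # With maxlen=0 nothing is ever stored, so append is a no-op,
        # but the call itself still happens once per element.
        pass

    def extend(self, iterable):
        for item in iterable:
            self.append(item)


it = iter(range(5))
NaiveZeroDeque(it)                # drains the iterator as a side effect
assert next(it, None) is None     # nothing left to yield
```

    CPython's C implementation skips the append entirely when maxlen is 0, which is the optimization the answer refers to.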

    Comparing the times (using the same tests as in user2357112's answer):[3]

              for      deque
    CPython   19.7us   12.7us
    PyPy       1.37us  23.3us
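
    The original numbers came from IPython's %timeit. A comparable measurement can be sketched with the standard timeit module. The input size N and the repeat counts here are assumptions, not the parameters of the original test, so absolute numbers will differ.

```python
import collections
import timeit


def consume_for(it):
    # Plain loop: the portable approach.
    for _ in it:
        pass


def consume_deque(it):
    # Zero-length deque: every item is discarded.
    collections.deque(it, maxlen=0)


N = 1000  # assumed input size, chosen arbitrarily


def best_time_us(func, number=1000, repeat=5):
    # Build a fresh iterator on every call, since an iterator can be consumed only once.
    timer = timeit.Timer(lambda: func(iter(range(N))))
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e6


if __name__ == "__main__":
    for name, func in [("for", consume_for), ("deque", consume_deque)]:
        print(f"{name:>6}: {best_time_us(func):.2f}us")
```

    Running the same script under each interpreter gives a like-for-like comparison; note that the lambda and the iter(range(N)) call add a small constant overhead to both columns.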
    

    There are no 3.x versions of the other major interpreters, and I don't have IPython for any of them, but a quick test with Jython shows similar effects.

    So, the fastest portable implementation is something like:

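    import sys

    if sys.implementation.name == 'cpython':
        import collections
        def consume(it):
            # CPython's deque has a C fast path for maxlen=0 that just drains the iterator.
            return collections.deque(it, maxlen=0)
    else:
        def consume(it):
            # Elsewhere (e.g. PyPy), a plain loop avoids a per-item append call.
            for _ in it:
                pass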
    

    This of course gives me 12.7us in CPython, and 1.41us in PyPy.
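
    As a usage sketch, here is consume applied to a lazy map whose only purpose is its side effects. The seen list and the choice of range(5) are made up for illustration; they are not from the question.

```python
import collections
import sys

if sys.implementation.name == "cpython":
    def consume(it):
        collections.deque(it, maxlen=0)
else:
    def consume(it):
        for _ in it:
            pass

seen = []  # hypothetical side-effect target

# map() is lazy in Python 3: nothing is appended until the iterator is consumed.
lazy = map(seen.append, range(5))
assert seen == []

consume(lazy)
assert seen == [0, 1, 2, 3, 4]
assert next(lazy, None) is None  # the iterator is now exhausted
```

    If the results of the mapped calls are not needed, a plain for loop over the inputs is usually clearer than map plus consume; the helper matters mainly when an iterator is handed to you already built.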


    1. Of course you could write a custom C extension, but it's only going to be faster by a tiny constant term—you can avoid the constructor call and the test before jumping to the fast path, but once you get into that loop, you have to do exactly what it's doing.

    2. Tracing through PyPy source is always fun… but I think it ends up in the W_Deque class, which is part of the builtin _collections module.

    3. CPython 3.6.4; PyPy 5.10.1/3.5.3; both from the respective standard 64-bit macOS installers.
