Fastest (most Pythonic) way to consume an iterator

孤城傲影 2020-12-15 20:48

I am curious what the fastest, and the most Pythonic, way to consume an iterator would be.

For example, say that I want to create an iterator with the map

2 answers
  •  Happy的楠姐
    2020-12-15 21:12

    While you shouldn't be creating a map object just for side effects, there is in fact a standard recipe for consuming iterators in the itertools docs:

    import collections
    from itertools import islice

    def consume(iterator, n=None):
        "Advance the iterator n-steps ahead. If n is None, consume entirely."
        # Use functions that consume iterators at C speed.
        if n is None:
            # feed the entire iterator into a zero-length deque
            collections.deque(iterator, maxlen=0)
        else:
            # advance to the empty slice starting at position n
            next(islice(iterator, n, n), None)
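
    As a quick usage sketch of the recipe above (the variable names `it`, `first_after_skip`, and `remaining` are made up for this illustration), skipping the first three items and then draining the rest might look like this:

```python
import collections
from itertools import islice


def consume(iterator, n=None):
    """Advance the iterator n steps ahead. If n is None, consume entirely."""
    if n is None:
        # Feed the entire iterator into a zero-length deque.
        collections.deque(iterator, maxlen=0)
    else:
        # Advance to the empty slice starting at position n.
        next(islice(iterator, n, n), None)


it = iter(range(10))

consume(it, 3)                 # skips 0, 1 and 2
first_after_skip = next(it)    # the iterator resumes at 3

consume(it)                    # no n given: drains everything that is left
remaining = list(it)           # nothing remains

print(first_after_skip)  # 3
print(remaining)         # []
```

    Note that `consume` must be given an actual iterator. Passing a plain iterable such as a list would not advance anything the caller can observe afterwards.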
    

    For just the "consume entirely" case, this can be simplified to

    def consume(iterator):
        collections.deque(iterator, maxlen=0)
    

    Using collections.deque this way avoids storing all the elements (because maxlen=0) and iterates at C speed, without bytecode interpretation overhead. There's even a dedicated fast path in the deque implementation for using a maxlen=0 deque to consume an iterator.
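
    To make the "stores nothing" point concrete, here is a minimal sketch that mirrors the question's map-based setup (it is shown only for illustration, not as a recommendation, and the names `seen`, `lazy_calls`, and `sink` are hypothetical). It checks that the map is lazy, that the deque drives it to exhaustion, and that the deque itself stays empty:

```python
import collections

seen = []                                # records the side effects, for illustration only
lazy_calls = map(seen.append, range(5))  # map is lazy: nothing has run yet
assert seen == []

# A maxlen=0 deque pulls every item from the iterator and discards it.
sink = collections.deque(lazy_calls, maxlen=0)

print(seen)       # [0, 1, 2, 3, 4]
print(len(sink))  # 0
```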

    Timing:

    In [1]: import collections
    
    In [2]: x = range(1000)
    
    In [3]: %%timeit
       ...: i = iter(x)
       ...: for _ in i:
       ...:     pass
       ...: 
    16.5 µs ± 829 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    
    In [4]: %%timeit
       ...: i = iter(x)
       ...: collections.deque(i, maxlen=0)
       ...: 
    12 µs ± 566 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    

    Of course, this is all based on CPython. The entire nature of interpreter overhead is very different on other Python implementations, and the maxlen=0 fast path is specific to CPython. See abarnert's answer for other Python implementations.
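
    To check how the two approaches compare on a particular interpreter, the IPython session above can be approximated with the standard `timeit` module. This is only a sketch: the helper names `loop_consume` and `deque_consume` and the run counts are arbitrary choices for this example, and wrapping each approach in a function adds a small call overhead to both measurements.

```python
import collections
import timeit

x = range(1000)


def loop_consume():
    # Plain Python loop: each iteration goes through the interpreter.
    for _ in iter(x):
        pass


def deque_consume():
    # Zero-length deque: the iterator is drained inside deque's constructor.
    collections.deque(iter(x), maxlen=0)


runs = 2000
# Take the best of several repeats to reduce noise from other processes.
loop_time = min(timeit.repeat(loop_consume, number=runs, repeat=5)) / runs
deque_time = min(timeit.repeat(deque_consume, number=runs, repeat=5)) / runs

print(f"for loop: {loop_time * 1e6:.1f} µs per call")
print(f"deque:    {deque_time * 1e6:.1f} µs per call")
```

    The absolute numbers will vary with the machine and the Python version, so only the relative difference on the same setup is meaningful.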
