How not to miss the next element after itertools.takewhile()

醉酒成梦 2020-12-15 06:18

Say we wish to process an iterator and want to handle it in chunks.
The logic for each chunk depends on previously calculated chunks, so groupby() does not help.
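
To make the problem in the title concrete, here is a minimal sketch of how takewhile() swallows the element that ends a chunk. The names `it`, `first` and `rest` and the `i < 5` predicate are made up for illustration.

```python
from itertools import takewhile

it = iter(range(10))

# takewhile() has to read 5 from the iterator to learn that the chunk is over.
first = list(takewhile(lambda i: i < 5, it))

# 5 was consumed by that look-ahead and then discarded, so it is missing here.
rest = list(it)

print(first)  # [0, 1, 2, 3, 4]
print(rest)   # [6, 7, 8, 9]
```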

2 answers
  • 2020-12-15 06:41

    Assume the predicate returned by GetNewChunkLogic() reports True throughout the first chunk and False afterward.
    The following snippet

    1. solves the 'additional next step' problem of takewhile.
    2. is elegant because you don't have to implement the back-one-step logic.

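    from itertools import filterfalse, tee

    def partition(pred, iterable):
        'Use a predicate to partition entries into true entries and false entries'
        # partition(is_odd, range(10)) -->  1 3 5 7 9 and 0 2 4 6 8
        t1, t2 = tee(iterable)
        return filter(pred, t1), filterfalse(pred, t2)

    # NOTE: this loop still needs a stop condition for an exhausted input.
    while True:
        head, tail = partition(GetNewChunkLogic(), myIterator)
        process(head)  # consuming head reads the whole input; tee() buffers it for tail
        myIterator = tail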
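
    As a quick check of the recipe above, the following sketch runs partition() once with a made-up predicate (`i < 5`) and shows that the boundary element is not lost. The names `head_items` and `tail_items` are illustrative only. Note that both filter() and filterfalse() call the predicate on every element, so this assumes the predicate depends only on the element itself.

```python
from itertools import filterfalse, tee

def partition(pred, iterable):
    """Split an iterable into (items where pred is true, items where it is false)."""
    t1, t2 = tee(iterable)
    return filter(pred, t1), filterfalse(pred, t2)

head, tail = partition(lambda i: i < 5, iter(range(10)))

# Draining head walks the whole input; tee() keeps a copy for tail.
head_items = list(head)
tail_items = list(tail)

print(head_items)  # [0, 1, 2, 3, 4]
print(tail_items)  # [5, 6, 7, 8, 9]
```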
    

    However, the most elegant way is to turn your GetNewChunkLogic into a generator and drop the while loop.
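
    One possible reading of the generator idea is sketched below. It is not the asker's code: `iter_chunks`, `make_predicate` and the `limits` values are hypothetical stand-ins, and the sketch assumes that the element which ends one chunk always opens the next one.

```python
def iter_chunks(iterable, make_predicate):
    """Yield lists of consecutive items; each chunk gets a fresh predicate.

    make_predicate is a hypothetical stand-in for GetNewChunkLogic: it is
    called once per chunk and returns a one-argument predicate.
    """
    it = iter(iterable)
    pending = []  # the item that ended the previous chunk, if any
    while True:
        pred = make_predicate()
        # Assumption: the item that ended a chunk always opens the next one,
        # which guarantees progress even if the new predicate rejects it.
        chunk, pending = pending, []
        for item in it:
            if pred(item):
                chunk.append(item)
            else:
                pending = [item]  # keep the boundary item instead of losing it
                break
        else:
            # Source exhausted: emit what is left and stop.
            if chunk:
                yield chunk
            return
        if chunk:
            yield chunk

# Made-up per-chunk upper bounds, standing in for state from earlier chunks.
limits = iter([3, 7, 100])

def make_predicate():
    limit = next(limits)
    return lambda i: i < limit

chunks = list(iter_chunks(range(10), make_predicate))
print(chunks)  # [[0, 1, 2], [3, 4, 5, 6], [7, 8, 9]]
```

    Because the generator holds the boundary element itself, neither a caller-side while loop nor a step-back wrapper is needed.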

  • 2020-12-15 06:53

    takewhile() indeed needs to look at the next element to determine when to toggle behaviour.

    You could use a wrapper that tracks the last seen element, and that can be 'reset' to back up one element:

    _sentinel = object()  # marks "no value stored"

    class OneStepBuffered(object):
        """Iterator wrapper that can back up by one element."""
        def __init__(self, it):
            self._it = iter(it)
            self._last = _sentinel  # last element read from the source
            self._next = _sentinel  # element pushed back by step_back()
        def __iter__(self):
            return self
        def __next__(self):
            # Serve the pushed-back element first, if there is one.
            if self._next is not _sentinel:
                next_val, self._next = self._next, _sentinel
                return next_val
            try:
                self._last = next(self._it)
                return self._last
            except StopIteration:
                self._last = self._next = _sentinel
                raise
        next = __next__  # Python 2 compatibility
        def step_back(self):
            # Only the most recently read source element can be restored.
            if self._last is _sentinel:
                raise ValueError("Can't back up a step")
            self._next, self._last = self._last, _sentinel
    

    Wrap your iterator in this one before using it with takewhile():

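    import itertools

    myIterator = OneStepBuffered(myIterator)
    while True:
        chunk = itertools.takewhile(getNewChunkLogic(), myIterator)
        process(chunk)  # must consume the chunk completely
        try:
            myIterator.step_back()  # un-read the element that ended the chunk
        except ValueError:
            break  # nothing to step back to, e.g. the input is exhausted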
    

    Demo:

    >>> from itertools import takewhile
    >>> test_list = range(10)
    >>> iterator = OneStepBuffered(test_list)
    >>> list(takewhile(lambda i: i < 5, iterator))
    [0, 1, 2, 3, 4]
    >>> iterator.step_back()
    >>> list(iterator)
    [5, 6, 7, 8, 9]
    