Say we wish to process an iterator and want to handle it by chunks.
The logic per chunk depends on previously-calculated chunks, so groupby()
does not help.
Given the callable GetNewChunkLogic()
will report True
along first chunk and False
afterward.
The following snippet
takewhile
.def partition(pred, iterable):
'Use a predicate to partition entries into true entries and false entries'
# partition(is_odd, range(10)) --> 1 3 5 7 9 and 0 2 4 6 8
t1, t2 = tee(iterable)
return filter(pred, t1), filterfalse(pred, t2)
while True:
head, tail = partition(GetNewChunkLogic(), myIterator)
process(head)
myIterator = tail
However, the most elegant way is to modify your GetNewChunkLogic
into a generator and remove the while
loop.
takewhile()
indeed needs to look at the next element to determine when to toggle behaviour.
You could use a wrapper that tracks the last seen element, and that can be 'reset' to back up one element:
_sentinel = object()
class OneStepBuffered(object):
def __init__(self, it):
self._it = iter(it)
self._last = _sentinel
self._next = _sentinel
def __iter__(self):
return self
def __next__(self):
if self._next is not _sentinel:
next_val, self._next = self._next, _sentinel
return next_val
try:
self._last = next(self._it)
return self._last
except StopIteration:
self._last = self._next = _sentinel
raise
next = __next__ # Python 2 compatibility
def step_back(self):
if self._last is _sentinel:
raise ValueError("Can't back up a step")
self._next, self._last = self._last, _sentinel
Wrap your iterator in this one before using it with takewhile()
:
myIterator = OneStepBuffered(myIterator)
while True:
chunk = itertools.takewhile(getNewChunkLogic(), myIterator)
process(chunk)
myIterator.step_back()
Demo:
>>> from itertools import takewhile
>>> test_list = range(10)
>>> iterator = OneStepBuffered(test_list)
>>> list(takewhile(lambda i: i < 5, iterator))
[0, 1, 2, 3, 4]
>>> iterator.step_back()
>>> list(iterator)
[5, 6, 7, 8, 9]