How do I process the elements of a sequence in batches, idiomatically?
For example, with the sequence "abcdef" and a batch size of 2, I would like to do something
I've got an alternative approach that works for iterables that don't have a known length.
def groupsgen(seq, size):
    it = iter(seq)
    while True:
        values = ()
        for n in xrange(size):
            values += (it.next(),)
        yield values
It works by iterating over the sequence (or other iterator) in groups of size, collecting the values in a tuple. At the end of each group, it yields the tuple.
When the iterator runs out of values, it produces a StopIteration exception which is then propagated up, indicating that groupsgen is out of values.
It assumes that the values come in sets of size (sets of 2, 3, etc). If not, any values left over are just discarded.
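On Python 3.7 and later, a StopIteration that escapes a generator body is converted into a RuntimeError (PEP 479), so the generator above would need a small change there. Here is a minimal Python 3 sketch of the same idea using itertools.islice. The name groupsgen3 is made up for this example, size is assumed to be at least 1, and leftover values are still discarded as described above.

```python
from itertools import islice

def groupsgen3(seq, size):
    """Yield tuples of exactly `size` items; leftover items are discarded."""
    it = iter(seq)
    while True:
        # islice takes up to `size` items and simply stops early when the
        # iterator is exhausted, so no StopIteration leaks out of the generator.
        values = tuple(islice(it, size))
        if len(values) < size:
            # Iterator ran dry, or only a partial group remained: stop cleanly.
            return
        yield values

print(list(groupsgen3("abcdefg", 2)))
```

The trailing "g" is dropped because it does not fill a complete group.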
And then there's always the documentation.
# Python 2 itertools (izip was removed in Python 3)
from itertools import chain, izip, repeat, tee

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    try:
        b.next()
    except StopIteration:
        pass
    return izip(a, b)

def grouper(n, iterable, padvalue=None):
    "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
    return izip(*[chain(iterable, repeat(padvalue, n-1))]*n)
Note: these produce tuples instead of substrings when given a string as input.
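If substrings are wanted for string input, the tuples can be joined back together. Below is a small sketch in Python 3, where izip became the built-in zip. The grouper here is a port of the padded recipe, and grouper_str is a hypothetical helper name, not something itertools provides.

```python
from itertools import chain, repeat

def grouper(n, iterable, padvalue=None):
    """Python 3 port of the padded grouper recipe (zip instead of izip)."""
    return zip(*[chain(iterable, repeat(padvalue, n - 1))] * n)

def grouper_str(n, text, padvalue=""):
    """Hypothetical helper: join each tuple of characters back into a string."""
    return ("".join(group) for group in grouper(n, text, padvalue))

print(list(grouper_str(3, "abcdefg", "x")))
print(list(grouper_str(2, "abcdef")))
```

With the default empty-string pad value, a short final group just comes out as a shorter string.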
Given
from __future__ import print_function # python 2.x
seq = "abcdef"
n = 2
Code
while seq:
    print("{}".format(seq[:n]), end="\n")
    seq = seq[n:]
Output
ab
cd
ef
Here is a solution that yields a series of iterators, each of which iterates over up to n items.
def groupiter(thing, n):
    def countiter(nextthing, thingiter, n):
        yield nextthing
        for _ in range(n - 1):
            try:
                nextitem = next(thingiter)
            except StopIteration:
                return
            yield nextitem
    thingiter = iter(thing)
    while True:
        try:
            nextthing = next(thingiter)
        except StopIteration:
            return
        yield countiter(nextthing, thingiter, n)
I use it as follows:
table = list(range(250))
for group in groupiter(table, 16):
    print(' '.join('0x{:02X},'.format(x) for x in group))
Note that it can handle the length of the object not being a multiple of n.
The itertools doc has a recipe for this:
from itertools import izip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)
Usage:
>>> l = [1,2,3,4,5,6,7,8,9]
>>> [z for z in grouper(l, 3)]
[(1, 2, 3), (4, 5, 6), (7, 8, 9)]
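In Python 3 the function is named zip_longest. If the padding is unwanted, one option is to fill with a private sentinel object and strip it afterwards, so the last group is simply shorter. This is only a sketch under that assumption; grouper_nofill and _SENTINEL are names invented for this example.

```python
from itertools import zip_longest

_SENTINEL = object()  # hypothetical marker that cannot collide with real data

def grouper_nofill(iterable, n):
    """Yield tuples of up to n items; the last tuple may be shorter."""
    # The same iterator repeated n times, as in the recipe above.
    args = [iter(iterable)] * n
    for group in zip_longest(*args, fillvalue=_SENTINEL):
        # Drop the fill markers that pad out the final group.
        yield tuple(x for x in group if x is not _SENTINEL)

print(list(grouper_nofill([1, 2, 3, 4, 5, 6, 7], 3)))
```

On Python 3.12 and later, itertools.batched(iterable, n) should cover the same use case directly.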
You can create the following generator:
def chunks(seq, size):
    # Start and end indices of each full chunk; a trailing partial chunk is dropped.
    a = range(0, len(seq), size)
    b = range(size, len(seq) + 1, size)
    for i, j in zip(a, b):
        yield seq[i:j]
and use it like this:
for i in chunks('abcdef', 2):
    print(i)
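The chunks generator pairs start indices with end indices, so a trailing chunk shorter than size is silently dropped (for example the "g" in "abcdefg"). If the tail should be kept, a single range over the start indices is enough, because slicing past the end of a sequence is safe. A sketch, with chunks_keep_tail as an illustrative name:

```python
def chunks_keep_tail(seq, size):
    """Yield successive slices of seq; the final slice may be shorter than size."""
    for start in range(0, len(seq), size):
        # A slice that runs past the end is truncated rather than raising.
        yield seq[start:start + size]

for chunk in chunks_keep_tail("abcdefg", 2):
    print(chunk)
```

Like the original, this needs len() and slicing, so it works for sequences such as strings and lists but not for arbitrary iterators.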