How do I process the elements of a sequence in batches, idiomatically?
For example, with the sequence "abcdef" and a batch size of 2, I would like to do something with each pair in turn: "ab", then "cd", then "ef".
I am sure someone is going to come up with something more "Pythonic", but how about:
for y in range(0, len(x), 2):
    print("%s%s" % (x[y], x[y+1]))
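Note that this only works if you know that len(x) % 2 == 0; otherwise x[y+1] raises an IndexError on the last iteration.
A more general way, for any batch size, would be (inspired by this answer):
for batch in zip(*(seq[i::size] for i in range(size))):
    print(batch)  # tuple of individual values; zip drops a trailing partial batch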
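To make the behaviour of the zip-based approach concrete when the length is not a multiple of the batch size, here is a small sketch. The sample values and the names `strict` and `padded` are only illustrative; `itertools.zip_longest` is shown as one possible way to keep the leftover items, padded with a fill value, and is not part of the answer above.

```python
from itertools import zip_longest

seq = "abcdefg"   # 7 items, deliberately not a multiple of size
size = 3

# zip stops as soon as the shortest slice runs out, so 'g' is dropped.
strict = list(zip(*(seq[i::size] for i in range(size))))

# zip_longest keeps going and pads the short slices with fillvalue.
padded = list(zip_longest(*(seq[i::size] for i in range(size)), fillvalue=None))

print(strict)  # [('a', 'b', 'c'), ('d', 'e', 'f')]
print(padded)  # [('a', 'b', 'c'), ('d', 'e', 'f'), ('g', None, None)]
```

Both variants rely on slicing, so they need a real sequence (string, list, tuple) rather than an arbitrary iterator.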
A generator function would be neat:
def batch_gen(data, batch_size):
    for i in range(0, len(data), batch_size):
        yield data[i:i+batch_size]
Example use:
a = "abcdef"
for i in batch_gen(a, 2):
    print(i)
prints:
ab
cd
ef
s = 'abcdefgh'
for e in (s[i:i+2] for i in range(0, len(s), 2)):
    print(e)
>>> a = "abcdef"
>>> size = 2
>>> [a[x:x+size] for x in range(0, len(a), size)]
['ab', 'cd', 'ef']
..or, not as a list comprehension:
a = "abcdef"
size = 2
output = []
for x in range(0, len(a), size):
    output.append(a[x:x+size])
Or, as a generator, which would be best if used multiple times (for a one-use thing, the list comprehension is probably "best"):
def chunker(thelist, segsize):
    for x in range(0, len(thelist), segsize):
        yield thelist[x:x+segsize]
..and its usage:
>>> for seg in chunker(a, 2):
...     print(seg)
...
ab
cd
ef
Apart from two answers, I saw a lot of premature materialization of the batches, and subscripting (which does not work for arbitrary iterables such as generators). Hence I came up with this alternative:
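def iter_x_and_n(iterable, x, n):
    # Yield x, then up to n more items pulled from the shared iterator.
    yield x
    try:
        for _ in range(n):
            yield next(iterable)
    except StopIteration:
        pass  # source exhausted: the last batch is simply shorter

def batched(iterable, n):
    if n < 1:
        raise ValueError("Can not create batches of size %d, number must be strictly positive" % n)
    iterable = iter(iterable)
    # Each batch is a lazy generator; consume it before asking for the next one.
    try:
        for x in iterable:
            yield iter_x_and_n(iterable, x, n - 1)
    except StopIteration:
        pass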
It beats me that there is no one-liner or few-liner solution for this (to the best of my knowledge). The key issue is that both the outer generator and the inner generator need to handle StopIteration correctly. The outer generator should only yield a batch if there is at least one element left. The intuitive way to check this is to call next(...) and catch the StopIteration.
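The "call next(...) and catch StopIteration" check described above can be sketched more compactly with the standard library. This is only one possible variant, not the code from the answer: the name `lazy_batches` is made up for illustration, and `itertools.chain` plus `itertools.islice` stand in for the hand-written inner generator.

```python
from itertools import chain, islice

def lazy_batches(iterable, n):
    """Yield lazy batches of up to n items from any iterable (illustrative sketch)."""
    if n < 1:
        raise ValueError("batch size must be strictly positive, got %d" % n)
    it = iter(iterable)
    while True:
        try:
            first = next(it)   # peek: is there at least one element left?
        except StopIteration:
            return             # nothing left, so do not yield an empty batch
        # Glue the peeked element back in front of up to n-1 further items.
        yield chain([first], islice(it, n - 1))

# Works on a generator, which cannot be sliced or passed to len().
squares = (i * i for i in range(7))
result = [list(batch) for batch in lazy_batches(squares, 3)]
print(result)  # [[0, 1, 4], [9, 16, 25], [36]]
```

As with the answer's version, every batch shares the underlying iterator, so each batch has to be consumed before the next one is requested; otherwise the items end up in the wrong batches.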