问题
I'm looking for a way to "page through" a Python iterator. That is, I would like to wrap a given iterator iter and page_size with another iterator that would would return the items from iter as a series of "pages". Each page would itself be an iterator with up to page_size iterations.
I looked through itertools and the closest thing I saw is itertools.islice. In some ways, what I'd like is the opposite of itertools.chain -- instead of chaining a series of iterators together into one iterator, I'd like to break an iterator up into a series of smaller iterators. I was expecting to find a paging function in itertools but couldn't locate one.
I came up with the following pager class and demonstration.
class pager(object):
"""
takes the iterable iter and page_size to create an iterator that "pages through" iter. That is, pager returns a series of page iterators,
each returning up to page_size items from iter.
"""
def __init__(self,iter, page_size):
self.iter = iter
self.page_size = page_size
def __iter__(self):
return self
def next(self):
# if self.iter has not been exhausted, return the next slice
# I'm using a technique from
# https://stackoverflow.com/questions/1264319/need-to-add-an-element-at-the-start-of-an-iterator-in-python
# to check for iterator completion by cloning self.iter into 3 copies:
# 1) self.iter gets advanced to the next page
# 2) peek is used to check on whether self.iter is done
# 3) iter_for_return is to create an independent page of the iterator to be used by caller of pager
self.iter, peek, iter_for_return = itertools.tee(self.iter, 3)
try:
next_v = next(peek)
except StopIteration: # catch the exception and then raise it
raise StopIteration
else:
# consume the page from the iterator so that the next page is up in the next iteration
# is there a better way to do this?
#
for i in itertools.islice(self.iter,self.page_size): pass
return itertools.islice(iter_for_return,self.page_size)
iterator_size = 10
page_size = 3
my_pager = pager(xrange(iterator_size),page_size)
# skip a page, then print out rest, and then show the first page
page1 = my_pager.next()
for page in my_pager:
for i in page:
print i
print "----"
print "skipped first page: " , list(page1)
I'm looking for some feedback and have the following questions:
- Is there a pager already in itertools that serves a pager that I'm overlooking?
- Cloning self.iter 3 times seems kludgy to me. One clone is to check whether self.iter has any more items. I decided to go with a technique Alex Martelli suggested (aware that he wrote of a wrapping technique). The second clone was to enable the returned page to be independent of the internal iterator (self.iter). Is there a way to avoid making 3 clones?
- Is there a better way to deal with the StopIteration exception beside catching it and then raising it again? I am tempted to not catch it at all and let it bubble up.
Thanks! -Raymond
回答1:
Why aren't you using this?
def grouper( page_size, iterable ):
page= []
for item in iterable:
page.append( item )
if len(page) == page_size:
yield page
page= []
yield page
"Each page would itself be an iterator with up to page_size" items. Each page is a simple list of items, which is iterable. You could use yield iter(page)
to yield the iterator instead of the object, but I don't see how that improves anything.
It throws a standard StopIteration
at the end.
What more would you want?
回答2:
Look at grouper()
in the itertools recipes.
回答3:
I'd do it like this:
def pager(iterable, page_size):
args = [iter(iterable)] * page_size
fillvalue = object()
for group in izip_longest(fillvalue=fillvalue, *args):
yield (elem for elem in group if elem is not fillvalue)
That way, None
can be a legitimate value that the iterator spits out. Only the single object fillvalue
filtered out, and it cannot possibly be an element of the iterable.
回答4:
Based on the pointer to the itertools recipe for grouper(), I came up with the following adaption of grouper() to mimic Pager. I wanted to filter out any None results and wanted to return an iterator rather than a tuple (though I suspect that there might be little advantage in doing this conversion)
# based on http://docs.python.org/library/itertools.html#recipes
def grouper2(n, iterable, fillvalue=None):
args = [iter(iterable)] * n
for item in izip_longest(fillvalue=fillvalue, *args):
yield iter(filter(None,item))
I'd welcome feedback on how what I can do to improve this code.
回答5:
def group_by(iterable, size):
"""Group an iterable into lists that don't exceed the size given.
>>> group_by([1,2,3,4,5], 2)
[[1, 2], [3, 4], [5]]
"""
sublist = []
for index, item in enumerate(iterable):
if index > 0 and index % size == 0:
yield sublist
sublist = []
sublist.append(item)
if sublist:
yield sublist
来源:https://stackoverflow.com/questions/2348317/how-to-write-a-pager-for-python-iterators