In Python, is there any difference between creating a generator object through a generator expression versus using the yield statement?
Using yield is nice if the expression is more complicated than just nested loops. Among other things, you can return a special first or special last value. Consider:
def Generator(x):
    for i in range(x):  # use xrange(x) on Python 2
        yield i
    yield None  # special last value
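To make the "special last value" point concrete, here is a minimal sketch. The name with_sentinel is made up for illustration, and the itertools.chain line is just one possible way to get the same result from a generator expression:

```python
from itertools import chain


def with_sentinel(n):
    """Yield 0..n-1, then None as a special last value."""
    for i in range(n):
        yield i
    yield None  # extra statement after the loop; easy with yield


# Generator function: the trailing value is part of the function body.
from_function = list(with_sentinel(3))

# Generator expression: it cannot emit the trailing value on its own,
# so it has to be chained with a one-element list.
from_expression = list(chain((i for i in range(3)), [None]))

print(from_function)
print(from_function == from_expression)
```

Both variants produce the same sequence; the yield version simply keeps the extra value next to the loop it belongs to.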
Yes there is a difference.
For the generator expression (x for var in expr), iter(expr) is called when the expression is created.
When using def and yield to create a generator, as in:
def my_generator():
    for var in expr:
        yield x

g = my_generator()
iter(expr) is not yet called. It will be called only when iterating on g (and might not be called at all).
Taking this iterator as an example:
from __future__ import print_function

class CountDown(object):
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        print("ITER")
        return self

    def __next__(self):
        if self.n == 0:
            raise StopIteration()
        self.n -= 1
        return self.n

    next = __next__  # for python2
This code:
g1 = (i ** 2 for i in CountDown(3))  # immediately prints "ITER"
print("Go!")
for x in g1:
    print(x)
while:
def my_generator():
    for i in CountDown(3):
        yield i ** 2

g2 = my_generator()
print("Go!")
for x in g2:  # "ITER" is only printed here
    print(x)
Since most iterables do very little work in __iter__, it is easy to miss this behavior. A real-world example would be Django's QuerySet, which fetches data in __iter__: data = (f(x) for x in qs) might take a lot of time, while def g(): for x in qs: yield f(x) followed by data = g() would return immediately.
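The deferred variant can be sketched without Django. FakeQuerySet below is a hypothetical stand-in (it is not Django's API) whose __iter__ counts how often the "expensive" fetch runs; f and g mirror the names used above:

```python
class FakeQuerySet:
    """Hypothetical stand-in for a lazy query object; not Django's API."""

    def __init__(self, rows):
        self._rows = rows
        self.fetch_count = 0

    def __iter__(self):
        # Pretend this is the slow part, e.g. a database round trip.
        self.fetch_count += 1
        return iter(list(self._rows))


def f(x):
    return x * 10


def g(qs):
    # Nothing in this body runs until the first next() call on the generator.
    for x in qs:
        yield f(x)


qs = FakeQuerySet([1, 2, 3])
data = g(qs)                      # returns immediately
fetches_before = qs.fetch_count   # no fetch has happened yet
result = list(data)               # the fetch happens during iteration
fetches_after = qs.fetch_count

print(fetches_before, result, fetches_after)
```

If data is never iterated, the fetch never happens at all, which is the "might not be called at all" case mentioned earlier.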
For more info and the formal definition refer to PEP 289 -- Generator Expressions.
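One related detail from PEP 289: only the outermost iterable expression of a generator expression is evaluated when the expression is created; the remaining for clauses are evaluated once the generator runs. A small sketch, where outer, inner and log are illustrative names only:

```python
log = []


def outer():
    log.append("outer evaluated")
    return [1, 2]


def inner():
    log.append("inner evaluated")
    return [10, 20]


gen = (a + b for a in outer() for b in inner())

# Only outer() has been called so far.
log_at_creation = list(log)

values = list(gen)  # inner() is called now, once per item from outer()

print(log_at_creation)
print(values)
print(log)
```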