问题
I'm using python and I'm trying to find a way to elegantly chain multiple generators together. An example of the problem would be having for instance a root generator which provides some kind of data and each value being passed to its "children" like a cascade which in turn may modify the object they receive. I could go for this route:
for x in gen1:
gen2(x)
gen3(x)
But it's ugly and not elegant. I was thinking about a more functional way of doing things.
回答1:
You could turn the generators into coroutines so they can send()
and receive values from one another (using (yield)
expressions). This will give each one a chance to change the values they receive, and/or pass them along to the next generator/coroutine (or ignore them completely).
Note in the sample code below, I use a decorator named coroutine
to "prime" the generator/coroutine functions. This causes them execute as far as just before their first yield
expression/statement. It's a slightly modified version of one shown in this youtube video of a very illuminating talk titled A Curious Course on Coroutines and Concurrency that Dave Beazley gave at PyCon 2009.
As you should be able to see from the output produced, the data values are being processed by each pipeline that was configured via a single send()
to the head coroutine, which then effectively "multiplexes" it down each pipeline. Since each sub-coroutine also does this, it would be possible to set up an elaborate "tree" of processes.
import sys
def coroutine(func):
""" Decorator to "prime" generators used as coroutines. """
def start(*args,**kwargs):
cr = func(*args,**kwargs) # Create coroutine generator function.
next(cr) # Advance to just before its first yield.
return cr
return start
def pipe(name, value, divisor, coroutines):
""" Utility function to send values to list of coroutines. """
print(' {}: {} is divisible by {}'.format(name, value, divisor))
for cr in coroutines:
cr.send(value)
def this_func_name():
""" Helper function that returns name of function calling it. """
frame = sys._getframe(1)
return frame.f_code.co_name
@coroutine
def gen1(*coroutines):
while True:
value = (yield) # Receive values sent here via "send()".
if value % 2 == 0: # Only pipe even values.
pipe(this_func_name(), value, 2, coroutines)
@coroutine
def gen2(*coroutines):
while True:
value = (yield) # Receive values sent here via "send()".
if value % 4 == 0: # Only pipe values divisible by 4.
pipe(this_func_name(), value, 4, coroutines)
@coroutine
def gen3(*coroutines):
while True:
value = (yield) # Receive values sent here via "send()".
if value % 6 == 0: # Only pipe values divisible by 6.
pipe(this_func_name(), value, 6, coroutines)
# Create and link together some coroutine pipelines.
g3 = gen3()
g2 = gen2()
g1 = gen1(g2, g3)
# Send values through both pipelines (g1 -> g2, and g1 -> g3) of coroutines.
for value in range(17):
print('piping {}'.format(value))
g1.send(value)
Output:
piping 0
gen1: 0 is divisible by 2
gen2: 0 is divisible by 4
gen3: 0 is divisible by 6
piping 1
piping 2
gen1: 2 is divisible by 2
piping 3
piping 4
gen1: 4 is divisible by 2
gen2: 4 is divisible by 4
piping 5
piping 6
gen1: 6 is divisible by 2
gen3: 6 is divisible by 6
piping 7
piping 8
gen1: 8 is divisible by 2
gen2: 8 is divisible by 4
piping 9
piping 10
gen1: 10 is divisible by 2
piping 11
piping 12
gen1: 12 is divisible by 2
gen2: 12 is divisible by 4
gen3: 12 is divisible by 6
piping 13
piping 14
gen1: 14 is divisible by 2
piping 15
piping 16
gen1: 16 is divisible by 2
gen2: 16 is divisible by 4
回答2:
A pipeline might look more like this:
for x in gen3(gen2(gen1())):
print x
For example:
for i, x in enumerate(range(10)):
print i, x
There isn't a way to fork (or "tee") a pipeline in Python. If you want multiple pipelines, you'll have to duplicate them: gen2(gen1())
and gen3(gen1())
.
回答3:
Dave Beazley gave this example in a talk he did in 2008. The goal was to sum how many bytes of data were transferred in an Apache web server log. Assuming a log format such as:
81.107.39.38 - ... "GET /ply/ HTTP/1.1" 200 7587
81.107.39.38 - ... "GET /favicon.ico HTTP/1.1" 404 133
81.107.39.38 - ... "GET /admin HTTP/1.1" 403 -
A traditional (non-generator) solution might look like this:
with open("access-log") as wwwlog:
total = 0
for line in wwwlog:
bytes_as_str = line.rsplit(None,1)[1]
if bytes_as_str != '-':
total += int(bytes_as_str)
print("Total: {}".format(total))
A generator pipeline using generator expressions for this could be visualized as:
access-log => wwwlog => bytecolumn => bytes => sum() => total
And could look like:
with open("access-log") as wwwlog:
bytecolumn = (line.rsplit(None,1)[1] for line in wwwlog)
bytes = (int(x) for x in bytecolumn if x != '-')
print("Total: {}".format(sum(bytes)))
Dave Beazley's slides and more examples are available on his website. His later presentations elucidate this further.
It's hard to say much more without knowing exactly what you're trying to do so we can assess if a custom generator is even needed for each thing you're doing (generator expressions / comprehensions could work fine for many things without the need to declare generator functions).
回答4:
Here is a succinct example to throw in the mix:
def negate_nums(g):
for x in g:
yield -x
def square_nums(g):
for x in g:
yield x ** 2
def half_num(g):
for x in g:
yield x / 2.0
def compose_gens(first_gen,*rest_gens):
newg = first_gen(compose_gens(*rest_gens)) if rest_gens else first_gen
return newg
for x in compose_gens(negate_nums,square_nums,half_num,range(10)):
print(x)
Here you are composing generators, so that they are called right to left in the final compose_gens
call. You could change it to a pipeline by reversing the args.
来源:https://stackoverflow.com/questions/45924087/how-do-i-form-multiple-pipelines-with-generators