I am using Python 2 subprocess with threads to take standard input, process it with binaries A, B, and C…
This scenario is particularly well suited for a pipeline, where parallelism is implicitly managed by the OS. Since you are after a one-script solution, here it is:
#! /usr/bin/python2
import sys
import subprocess
import pipes

# Define these as needed
def produceA(input, output):
    output.write(input.read())

def produceB(input, output):
    output.write(input.read())

def produceC(input, output):
    output.write(input.read())

# Magic starts here
COMMAND = "{me} prepare_A | A - | {me} A_to_B | B - | {me} B_to_C | C -"

def bootstrap(input, output):
    """Prepares and runs the pipeline."""
    me = "./{}".format(pipes.quote(__file__))
    subprocess.call(
        COMMAND.format(me=me),
        stdin=input, stdout=output, shell=True, bufsize=-1
    )

if __name__ == '__main__':
    ACTIONS = {
        "prepare_A": produceA,
        "A_to_B": produceB,
        "B_to_C": produceC
    }
    action = ACTIONS[sys.argv[1]] if len(sys.argv) > 1 else bootstrap
    action(sys.stdin, sys.stdout)
This script will either set up the full pipeline or run one of the produce functions, depending on the command given as its first argument.
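Note that the produce stubs above slurp the whole stream into memory at once; if the intermediate data is large, a line-by-line variant keeps memory usage flat. A minimal sketch (the upper-casing here is only a placeholder transformation, not part of the original):

```python
def produceA(input, output):
    # Process one line at a time instead of buffering the whole
    # stream; upper-casing stands in for the real prepare_A step.
    for line in input:
        output.write(line.upper())
```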
Make it executable and run it without arguments:
./A_to_C.py < A.txt > C.txt
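If you ever want to drop shell=True, the same pipeline can be wired up manually with subprocess.Popen. This is only a sketch of the idea, not part of the script above (run_pipeline is a name I made up):

```python
import subprocess

def run_pipeline(stages, stdin, stdout):
    """Chain each stage's stdout into the next stage's stdin,
    mirroring what the shell does for "a | b | c"."""
    procs = []
    prev = stdin
    for i, argv in enumerate(stages):
        last = (i == len(stages) - 1)
        proc = subprocess.Popen(
            argv,
            stdin=prev,
            stdout=stdout if last else subprocess.PIPE,
        )
        if prev is not stdin:
            # Close our copy of the upstream pipe so the upstream
            # process sees SIGPIPE if the downstream one exits early.
            prev.close()
        prev = proc.stdout
        procs.append(proc)
    # Wait for every stage and collect the exit codes.
    return [proc.wait() for proc in procs]
```

You would call it as, say, run_pipeline([["A", "-"], ["B", "-"], ["C", "-"]], sys.stdin, sys.stdout), which again lets the OS run all stages in parallel.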
Note: since you appear to be using Python 2.6, this solution targets Python 2.x. It should also run fine in Python 3.x, except that the quote function moved from pipes to the shlex module in Python 3.3.
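If you want the same script to run under both interpreter lines, one common approach is a guarded import (a small compatibility sketch):

```python
try:
    from shlex import quote  # Python 3.3+
except ImportError:
    from pipes import quote  # Python 2.x

# quote() escapes a string for safe use in a shell command line,
# e.g. quote("my file.py") yields "'my file.py'".
```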