How to speed up communication with subprocesses

佛祖请我去吃肉 2021-01-02 02:11

I am using the Python 2 subprocess module with threading threads to take standard input and process it with binaries A, B, and C.

5 answers
  •  春和景丽
    2021-01-02 02:23

    This scenario is particularly well suited for a pipeline, where parallelism is implicitly managed by the OS. Since you are after a one-script solution, here you are:

    #! /usr/bin/python2
    
    import sys
    import subprocess
    import pipes
    
    # Define these as needed
    
    def produceA(input, output):
        output.write(input.read())
    
    def produceB(input, output):
        output.write(input.read())
    
    def produceC(input, output):
        output.write(input.read())
    
    # Magic starts here
    
    COMMAND = "{me} prepare_A | A - | {me} A_to_B | B - | {me} B_to_C | C -"
    
    
    def bootstrap(input, output):
        """Prepares and runs the pipeline."""
        # "{0}" rather than "{}": auto-numbered fields need Python 2.7+
        me = "./{0}".format(pipes.quote(__file__))
        subprocess.call(
            COMMAND.format(me=me), 
            stdin=input, stdout=output, shell=True, bufsize=-1
        )
    
    
    if __name__ == '__main__':
        ACTIONS = {
            "prepare_A": produceA,
               "A_to_B": produceB,
               "B_to_C": produceC
        }
    
        action = ACTIONS[sys.argv[1]] if len(sys.argv) > 1 else bootstrap
    
        action(sys.stdin, sys.stdout)
    

    This script will either set up the pipeline or run one of the produce functions, depending on the command given as its first argument. With no argument it bootstraps the whole pipeline.

    Make it executable and run it without arguments:

    ./A_to_C.py < A.txt > C.txt
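
    The same kind of pipeline can also be wired up without shell=True by chaining subprocess.Popen objects. The following is only a minimal sketch of that idea, not part of the script above: run_pipeline is a hypothetical helper name, and each stage is assumed to read standard input and write standard output.

```python
import subprocess


def run_pipeline(commands, stdin=None, stdout=None):
    """Run commands as a chain: each stage's stdout feeds the next stage's stdin.

    commands is a list of argv lists; stdin and stdout are file objects
    (or None to inherit them from the parent process).
    Returns the list of exit codes, one per stage.
    """
    procs = []
    upstream = stdin
    for index, argv in enumerate(commands):
        is_last = index == len(commands) - 1
        proc = subprocess.Popen(
            argv,
            stdin=upstream,
            stdout=stdout if is_last else subprocess.PIPE,
        )
        if procs:
            # Close the parent's copy of the upstream pipe so that only the
            # downstream stage holds its read end.
            upstream.close()
        upstream = proc.stdout
        procs.append(proc)
    # All stages run concurrently; the OS schedules them and handles buffering.
    return [proc.wait() for proc in procs]
```

    Assuming A, B and C are on PATH and accept - for standard input, as in COMMAND above, a call such as run_pipeline([["A", "-"], ["B", "-"], ["C", "-"]], sys.stdin, sys.stdout) would connect the three binaries. The Python conversion steps would have to be added as extra stages in the list.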
    

    Note: it seems that you are using Python 2.6, so this solution targets Python 2.x. It should also run in Python 3.x once the shebang line is adjusted, with one caveat: the quote function has lived in shlex since Python 3.3, and the pipes module was removed entirely in Python 3.13.
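
    One way to keep the quoting step portable across both Python lines is a fallback import. This is a sketch only; build_command is a hypothetical helper name that mirrors what bootstrap does with COMMAND.

```python
try:
    from shlex import quote  # Python 3.3 and later
except ImportError:
    from pipes import quote  # Python 2


def build_command(template, script_path):
    """Fill the {me} placeholder with a shell-safe path to this script."""
    me = "./{0}".format(quote(script_path))
    return template.format(me=me)
```

    With this in place, bootstrap could pass build_command(COMMAND, __file__) to subprocess.call instead of formatting the string inline.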
