Python subprocess in parallel

后端 未结 4 1382
时光取名叫无心
时光取名叫无心 2020-11-29 02:17

I want to run many processes in parallel with ability to take stdout in any time. How should I do it? Do I need to run thread for each subprocess.Popen() call,

4条回答
  •  温柔的废话
    2020-11-29 02:47

    You can do it in a single thread.

    Suppose you have a script that prints lines at random times:

    #!/usr/bin/env python
    #file: child.py
    import os
    import random
    import sys
    import time
    
    for i in range(10):
        print("%2d %s %s" % (int(sys.argv[1]), os.getpid(), i))
        sys.stdout.flush()
        time.sleep(random.random())
    

    And you'd like to collect the output as soon as it becomes available, you could use select on POSIX systems as @zigg suggested:

    #!/usr/bin/env python
    from __future__ import print_function
    from select     import select
    from subprocess import Popen, PIPE
    
    # start several subprocesses
    processes = [Popen(['./child.py', str(i)], stdout=PIPE,
                       bufsize=1, close_fds=True,
                       universal_newlines=True)
                 for i in range(5)]
    
    # read output
    timeout = 0.1 # seconds
    while processes:
        # remove finished processes from the list (O(N**2))
        for p in processes[:]:
            if p.poll() is not None: # process ended
                print(p.stdout.read(), end='') # read the rest
                p.stdout.close()
                processes.remove(p)
    
        # wait until there is something to read
        rlist = select([p.stdout for p in processes], [],[], timeout)[0]
    
        # read a line from each process that has output ready
        for f in rlist:
            print(f.readline(), end='') #NOTE: it can block
    

    A more portable solution (that should work on Windows, Linux, OSX) can use reader threads for each process, see Non-blocking read on a subprocess.PIPE in python.

    Here's os.pipe()-based solution that works on Unix and Windows:

    #!/usr/bin/env python
    from __future__ import print_function
    import io
    import os
    import sys
    from subprocess import Popen
    
    ON_POSIX = 'posix' in sys.builtin_module_names
    
    # create a pipe to get data
    input_fd, output_fd = os.pipe()
    
    # start several subprocesses
    processes = [Popen([sys.executable, 'child.py', str(i)], stdout=output_fd,
                       close_fds=ON_POSIX) # close input_fd in children
                 for i in range(5)]
    os.close(output_fd) # close unused end of the pipe
    
    # read output line by line as soon as it is available
    with io.open(input_fd, 'r', buffering=1) as file:
        for line in file:
            print(line, end='')
    #
    for p in processes:
        p.wait()
    

提交回复
热议问题