Python multiprocessing: How can I RELIABLY redirect stdout from a child process?

不知归路 2020-12-08 14:27

NB. I have seen "Log output of multiprocessing.Process" - unfortunately, it doesn't answer this question.

I am creating a child process (on Windows) via multiprocessing.

5 Answers
  • 2020-12-08 14:37

    Here is a simple and straightforward way to capture stdout for a multiprocessing.Process:

    import io
    import sys
    from multiprocessing import Process

    import app  # your application module


    def run_app(some_param):
        # Reopen the child's stdout file descriptor unbuffered, with
        # write_through=True so every write reaches the parent immediately.
        sys.stdout = io.TextIOWrapper(open(sys.stdout.fileno(), 'wb', 0), write_through=True)
        app.run()

    if __name__ == '__main__':  # required on Windows, where processes are spawned
        app_process = Process(target=run_app, args=('some_param',))
        app_process.start()
        # Use app_process.terminate() for Python < 3.7; kill() was added in 3.7.
        app_process.kill()
    
  • 2020-12-08 14:43

    I don't think you have a better option than redirecting a subprocess to a file, as you mentioned in your comment.

    The way console stdin/out/err handles work on Windows is that each process is born with its std handles already defined. You can change them with SetStdHandle. When you modify Python's sys.stdout you only change where Python prints things, not where other DLLs are printing. Part of the CRT in your DLL uses GetStdHandle to find out where to print. If you want, you can do whatever piping you want with the Windows API, either in your DLL or in your Python script with pywin32, though I do think it'll be simpler with subprocess; see the sketch below.
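
    As a minimal sketch of that subprocess approach (child_script.py is a hypothetical stand-in for whatever script loads your DLL), you can hand the child a real file as its stdout handle at the OS level, so even C-level writes from the DLL land in the file:

    import subprocess
    import sys

    # Open a log file and make it the child's stdout/stderr at the OS level,
    # so output from non-Python code (DLLs, the CRT) is captured too.
    with open('child_output.log', 'wb') as log:
        proc = subprocess.Popen(
            [sys.executable, 'child_script.py'],
            stdout=log,
            stderr=subprocess.STDOUT,  # merge stderr into the same file
        )
        proc.wait()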

  • The solution you suggest is a good one: create your processes manually such that you have explicit access to their stdout/stderr file handles. You can then create a socket to communicate with the sub-process and use multiprocessing.connection over that socket (multiprocessing.Pipe creates the same type of connection object, so this should give you all the same IPC functionality).

    Here's a two-file example.

    master.py:

    import errno
    import multiprocessing.connection
    import socket
    import subprocess
    import sys

    ## Listen for a connection from the remote process (and find a free port number)
    port = 10000
    while True:
        try:
            ## authkey must be a byte string in Python 3
            l = multiprocessing.connection.Listener(('localhost', port), authkey=b'secret')
            break
        except socket.error as ex:
            ## EADDRINUSE (98 on Linux, WSAEADDRINUSE on Windows): port taken, try the next one
            if ex.errno not in (errno.EADDRINUSE, getattr(errno, 'WSAEADDRINUSE', None)):
                raise
            port += 1

    proc = subprocess.Popen((sys.executable, "subproc.py", str(port)),
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    ## open a connection to the remote process
    conn = l.accept()
    conn.send([1, "asd", None])
    print(proc.stdout.readline())
    

    subproc.py:

    import multiprocessing.connection
    import sys

    port = int(sys.argv[1])
    conn = multiprocessing.connection.Client(('localhost', port), authkey=b'secret')
    
    while True:
        try:
            obj = conn.recv()
            print("received: %s\n" % str(obj))
            sys.stdout.flush()
        except EOFError:  ## connection closed
            break
    

    You may also want to see the first answer to this question to get non-blocking reads from the subprocess; a sketch of the usual thread-plus-queue approach follows.
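
    That non-blocking read is usually done with a background thread draining the pipe into a queue. A minimal sketch, continuing from master.py above (proc is the Popen object created there):

    import queue
    import threading

    def enqueue_output(stream, q):
        # Drain the subprocess pipe on a background thread so the main
        # thread never blocks on readline().
        for line in iter(stream.readline, b''):
            q.put(line)
        stream.close()

    out_queue = queue.Queue()
    reader = threading.Thread(target=enqueue_output, args=(proc.stdout, out_queue))
    reader.daemon = True
    reader.start()

    try:
        line = out_queue.get(timeout=0.1)  # returns promptly even if no output yet
    except queue.Empty:
        line = None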

  • 2020-12-08 14:46

    In my situation I changed sys.stdout.write to write to a PySide QTextEdit. I couldn't read from sys.stdout, and I didn't know how to change sys.stdout to be readable, so I created two Pipes: one for stdout and the other for stderr. In the separate process I redirect sys.stdout and sys.stderr to the child connection of the multiprocessing Pipe. On the main process I created two threads that read the stdout and stderr parent pipes and redirect the pipe data to sys.stdout and sys.stderr.

    import sys
    import contextlib
    import threading
    import multiprocessing as mp
    import multiprocessing.queues
    from queue import Empty
    import time
    
    
    class PipeProcess(mp.Process):
        """Process to pipe the output of the sub process and redirect it to this sys.stdout and sys.stderr.
    
        Note:
            The use_queue = True argument will pass data between processes using Queues instead of Pipes. Queues will
            give you the full output and read all of the data from the Queue. A pipe is more efficient, but may not
            redirect all of the output back to the main process.
        """
        def __init__(self, group=None, target=None, name=None, args=tuple(), kwargs={}, *_, daemon=None,
                     use_pipe=None, use_queue=None):
            self.read_out_th = None
            self.read_err_th = None
            self.pipe_target = target
            self.pipe_alive = mp.Event()
    
            if use_pipe or (use_pipe is None and not use_queue):  # Default
                self.parent_stdout, self.child_stdout = mp.Pipe(False)
                self.parent_stderr, self.child_stderr = mp.Pipe(False)
            else:
                self.parent_stdout = self.child_stdout = mp.Queue()
                self.parent_stderr = self.child_stderr = mp.Queue()
    
            args = (self.child_stdout, self.child_stderr, target) + tuple(args)
            target = self.run_pipe_out_target
    
            super(PipeProcess, self).__init__(group=group, target=target, name=name, args=args, kwargs=kwargs,
                                              daemon=daemon)
    
        def start(self):
            """Start the multiprocess and reading thread."""
            self.pipe_alive.set()
            super(PipeProcess, self).start()
    
            self.read_out_th = threading.Thread(target=self.read_pipe_out,
                                                args=(self.pipe_alive, self.parent_stdout, sys.stdout))
            self.read_err_th = threading.Thread(target=self.read_pipe_out,
                                                args=(self.pipe_alive, self.parent_stderr, sys.stderr))
            self.read_out_th.daemon = True
            self.read_err_th.daemon = True
            self.read_out_th.start()
            self.read_err_th.start()
    
        @classmethod
        def run_pipe_out_target(cls, pipe_stdout, pipe_stderr, pipe_target, *args, **kwargs):
            """The real multiprocessing target to redirect stdout and stderr to a pipe or queue."""
            sys.stdout.write = cls.redirect_write(pipe_stdout)  # , sys.__stdout__)  # Is redirected in main process
            sys.stderr.write = cls.redirect_write(pipe_stderr)  # , sys.__stderr__)  # Is redirected in main process
    
            pipe_target(*args, **kwargs)
    
        @staticmethod
        def redirect_write(child, out=None):
            """Create a function to write out a pipe and write out an additional out."""
            if isinstance(child, mp.queues.Queue):
                send = child.put
            else:
                send = child.send_bytes  # No need to pickle with child_conn.send(data)
    
            def write(data, *args):
                try:
                    if isinstance(data, str):
                        data = data.encode('utf-8')
    
                    send(data)
                    if out is not None:
                        out.write(data)
                except Exception:
                    # Ignore writes that race with pipe/queue shutdown
                    pass
            return write
    
        @classmethod
        def read_pipe_out(cls, pipe_alive, pipe_out, out):
            if isinstance(pipe_out, mp.queues.Queue):
                # Queue has better functionality to get all of the data
                def recv():
                    return pipe_out.get(timeout=0.5)
    
                def is_alive():
                    return pipe_alive.is_set() or pipe_out.qsize() > 0
            else:
                # Pipe is more efficient
                recv = pipe_out.recv_bytes  # No need to unpickle with data = pipe_out.recv()
                is_alive = pipe_alive.is_set
    
            # Loop through reading and redirecting data
            while is_alive():
                try:
                    data = recv()
                    if isinstance(data, bytes):
                        data = data.decode('utf-8')
                    out.write(data)
                except EOFError:
                    break
                except Empty:
                    pass
                except Exception:
                    pass
    
        def join(self, *args):
            # Wait for process to finish (unless a timeout was given)
            super(PipeProcess, self).join(*args)
    
            # Trigger to stop the threads
            self.pipe_alive.clear()
    
            # Pipes must close to prevent blocking and waiting on recv forever
            if not isinstance(self.parent_stdout, mp.queues.Queue):
                with contextlib.suppress(Exception):
                    self.parent_stdout.close()
                with contextlib.suppress(Exception):
                    self.parent_stderr.close()

            # Wait for the reader threads to finish
            with contextlib.suppress(Exception):
                self.read_out_th.join()
            with contextlib.suppress(Exception):
                self.read_err_th.join()
    
    def run_long_print():
        for i in range(1000):
            print(i)
            print(i, file=sys.stderr)
    
        print('finished')
    
    
    if __name__ == '__main__':
        # Example test write (My case was a QTextEdit)
        out = open('stdout.log', 'w')
        err = open('stderr.log', 'w')
    
        # Overwrite the write function and not the actual stdout object to prove this works
        sys.stdout.write = out.write
        sys.stderr.write = err.write
    
        # Create a process that uses pipes to read multiprocess output back into sys.stdout.write
        proc = PipeProcess(target=run_long_print, use_queue=True)  # If use_pipe=True Pipe may not write out all values
        # proc.daemon = True  # If daemon and use_queue Not all output may be redirected to stdout
        proc.start()
    
        # time.sleep(5)  # Not needed unless use_pipe or daemon and all of stdout/stderr is desired
    
        # Close the process
        proc.join()  # For some odd reason this blocks forever when use_queue=False
    
        # Close the output files for this test
        out.close()
        err.close()
    
  • 2020-12-08 14:48

    I assume I'm off base and missing something, but for what it's worth, here is what came to mind when I read your question.

    If you can intercept all of the stdout and stderr (I got that impression from your question), then why not add or wrap that capture functionality around each of your processes? Then send whatever is captured through a queue to a consumer that can do what it likes with all of the output. A minimal sketch of that idea follows.
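
    Here is a minimal sketch of that wrap-and-queue idea; capture_wrapper and worker are hypothetical names, and note that redirect_stdout only catches Python-level writes, not output from C extensions:

    import io
    from contextlib import redirect_stdout, redirect_stderr
    from multiprocessing import Process, Queue

    def capture_wrapper(out_queue, target, *args, **kwargs):
        # Capture everything the target prints and ship it back to the
        # parent through a queue once the target returns.
        buf = io.StringIO()
        with redirect_stdout(buf), redirect_stderr(buf):
            target(*args, **kwargs)
        out_queue.put(buf.getvalue())

    def worker():
        print("hello from the child")

    if __name__ == '__main__':
        q = Queue()
        p = Process(target=capture_wrapper, args=(q, worker))
        p.start()
        captured = q.get()  # fetch before join() to avoid blocking on a full queue
        p.join()
        print("captured:", captured, end='')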
