How to capture streaming output in Python from subprocess.communicate()

执念已碎 2021-01-06 11:21

Currently, I have something like this:

self.process = subprocess.Popen(self.cmd, stdout=subprocess.PIPE)
out, err = self.process.communicate()
2 answers
  •  粉色の甜心
    2021-01-06 11:41

    I can think of a few solutions.

    #1: You can just go into the source to grab the code for communicate, copy and paste it, adding in code that prints each line as it comes in as well as buffering things up. (If it's possible for your own stdout to block because of, say, a deadlocked parent, you can use a queue.Queue or something instead.) This is obviously a bit hacky, but it's pretty easy, and will be safe.

    But really, communicate is complicated because it needs to be fully general, and handle cases you don't. All you need here is the central trick: throw threads at the problem. A dedicated reader thread that doesn't do anything slow or blocking between read calls is all you need.

    Something like this:

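    import subprocess
    import sys
    import threading

    # universal_newlines=True yields str lines rather than bytes on Python 3,
    # which sys.stdout.write() requires
    self.process = subprocess.Popen(self.cmd, stdout=subprocess.PIPE,
                                    universal_newlines=True)
    lines = []
    def reader():
        # Save each line and echo it as soon as it arrives
        for line in self.process.stdout:
            lines.append(line)
            sys.stdout.write(line)
    t = threading.Thread(target=reader)
    t.start()
    self.process.wait()
    t.join()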
    

    You may need some error handling in the reader thread. And I'm not 100% sure you can safely use readline here. But this will either work, or be close.
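
    If writing to your own stdout can block, one way to keep the reader thread fast is to have it only enqueue lines and do the printing in another thread. The following is a minimal sketch, assuming Python 3; run_and_tee and its echo parameter are hypothetical names used for illustration, not part of subprocess:

```python
import queue
import subprocess
import sys
import threading


def run_and_tee(cmd, echo=None):
    """Hypothetical helper: echo stdout lines as they arrive.

    Returns (returncode, list_of_lines).
    """
    echo = sys.stdout if echo is None else echo
    process = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, universal_newlines=True
    )
    q = queue.Queue()

    def reader():
        # Only read and enqueue, so this thread never waits on our own stdout.
        try:
            for line in process.stdout:
                q.put(line)
        finally:
            q.put(None)  # sentinel: end of output (or the reader failed)

    t = threading.Thread(target=reader, daemon=True)
    t.start()

    lines = []
    while True:
        line = q.get()
        if line is None:
            break
        lines.append(line)
        echo.write(line)
        echo.flush()

    t.join()
    returncode = process.wait()
    process.stdout.close()
    return returncode, lines
```

    Called as run_and_tee([sys.executable, "-c", "print('hi')"]), this should echo the child's output while it runs and hand back the exit code together with the captured lines. The child may still buffer its own output when it writes to a pipe, which this code cannot control.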

    #2: Or you can create a wrapper class that takes a file object and tees to stdout/stderr every time anyone reads from it. Then create the pipes manually, and pass in wrapped pipes, instead of using the automagic PIPE. This has the exact same issues as #1 (meaning either no issues, or you need to use a Queue or something if sys.stdout.write can block).

    Something like this:

    class TeeReader(object):
        def __init__(self, input_file, tee_file):
            self.input_file = input_file
            self.tee_file = tee_file
        def read(self, size=-1):
            ret = self.input_file.read(size)
            if ret:
                self.tee_file.write(ret)
            return ret
    

    In other words, it wraps a file object (or something that acts like one), and acts like a file object. (When you use PIPE, process.stdout is a real file object on Unix, but may just be something that acts like one on Windows.) Any other methods you need to delegate to input_file can probably be delegated directly, without any extra wrapping. Either try this, see which missing methods make communicate raise AttributeError, and code those explicitly, or do the usual __getattr__ trick to delegate everything. PS, if you're worried about this "file object" idea meaning disk storage, read the Wikipedia article "Everything is a file".
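
    The __getattr__ delegation mentioned above might look like the following. This is only a sketch: it repeats the TeeReader class with one extra method, and the in-memory io.StringIO objects are stand-ins chosen for the demo, not real pipes.

```python
import io


class TeeReader:
    """Same TeeReader as above, plus blanket delegation to input_file."""

    def __init__(self, input_file, tee_file):
        self.input_file = input_file
        self.tee_file = tee_file

    def read(self, size=-1):
        ret = self.input_file.read(size)
        if ret:
            self.tee_file.write(ret)
        return ret

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal lookup fails, so read()
        # above still wins; everything else is forwarded to input_file.
        return getattr(self.input_file, name)


# In-memory files stand in for a pipe and for stdout in this demo.
source = io.StringIO("hello\nworld\n")
sink = io.StringIO()
tee = TeeReader(source, sink)
first = tee.read(5)      # goes through read(), so it is copied to sink
rest = tee.readline()    # delegated to source, so it is NOT copied to sink
```

    Note the trade-off: delegated methods such as readline bypass the tee entirely, so any method whose data should also be echoed still needs its own explicit wrapper.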

    #3: Finally, you can grab one of the "async subprocess" modules on PyPI or included in Twisted or other async frameworks and use that. (This makes it possible to avoid the deadlock problems, but it's not guaranteed; you still have to make sure to service the pipes properly.)
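
    As one concrete instance of the async approach, the standard library's asyncio (a swapped-in choice, not one of the PyPI modules or Twisted mentioned above) can read the pipe line by line. This is a minimal sketch, assuming Python 3.7 or later; run_and_tee_async and its echo parameter are hypothetical names:

```python
import asyncio
import sys


async def run_and_tee_async(cmd, echo=None):
    """Hypothetical helper: echo stdout lines as they arrive.

    Returns (returncode, list_of_lines).
    """
    echo = sys.stdout if echo is None else echo
    process = await asyncio.create_subprocess_exec(
        *cmd, stdout=asyncio.subprocess.PIPE
    )
    lines = []
    while True:
        raw = await process.stdout.readline()
        if not raw:  # empty bytes means the child closed its stdout
            break
        line = raw.decode()
        lines.append(line)
        echo.write(line)
        echo.flush()
    returncode = await process.wait()
    return returncode, lines
```

    It would be driven with something like asyncio.run(run_and_tee_async([sys.executable, "-c", "print('hi')"])). The pipe is still serviced by this one loop, so anything slow inside the loop delays reading, which is the caveat in the parenthetical above.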
