Python read from subprocess stdout and stderr separately while preserving order


Here's a solution based on selectors that preserves order and streams variable-length chunks (even single characters).

The trick is to use read1() instead of read(): read1() returns whatever data is currently available after at most one read on the underlying raw stream, whereas read() with no size argument blocks until EOF.

import selectors
import subprocess
import sys

p = subprocess.Popen(
    "python random_out.py", shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE
)

sel = selectors.DefaultSelector()
sel.register(p.stdout, selectors.EVENT_READ)
sel.register(p.stderr, selectors.EVENT_READ)

# Keep draining whichever pipe has data until both have hit EOF
while sel.get_map():
    for key, _ in sel.select():
        data = key.fileobj.read1().decode()
        if not data:
            # EOF on this stream; stop watching it, but keep draining the other
            sel.unregister(key.fileobj)
            continue
        if key.fileobj is p.stdout:
            print(data, end="")
        else:
            print(data, end="", file=sys.stderr)

If you want a test program, use this (save it as random_out.py).

import sys
from time import sleep


for i in range(10):
    print(f" x{i} ", file=sys.stderr, end="")
    sleep(0.1)
    print(f" y{i} ", end="")
    sleep(0.1)
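
With the selectors loop above you should see the pairs arrive in order, roughly ` x0  y0  x1  y1 ...`, with the x's going to stderr and the y's to stdout; the 0.1-second sleeps make the interleaving slow enough to observe reliably.
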
jfs

The code in your question may deadlock if the child process produces enough output on stderr (~100KB on my Linux machine).
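
For reference, here is a minimal sketch of the pattern that can deadlock (the inline child command is just a stand-in that floods stderr; running this will likely hang):

from subprocess import Popen, PIPE

# Stand-in child that writes more to stderr than an OS pipe buffer holds
p = Popen(['python', '-c', 'import sys; sys.stderr.write("x" * 200000)'],
          stdout=PIPE, stderr=PIPE)

out = p.stdout.read()  # parent blocks waiting for stdout EOF...
err = p.stderr.read()  # ...while the child blocks writing to the full,
                       # undrained stderr pipe: deadlock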

There is a communicate() method that allows you to read from both stdout and stderr separately:

from subprocess import Popen, PIPE

process = Popen(command, stdout=PIPE, stderr=PIPE)
output, err = process.communicate()
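
On Python 3 you can additionally request text mode and guard against a hung child with a timeout; a minimal sketch (the command list is a placeholder):

from subprocess import Popen, PIPE, TimeoutExpired

process = Popen(['python', '--version'], stdout=PIPE, stderr=PIPE,
                text=True)  # decodes both streams to str (Python 3.7+;
                            # use universal_newlines=True on older versions)
try:
    output, err = process.communicate(timeout=30)
except TimeoutExpired:
    process.kill()                       # pattern from the subprocess docs:
    output, err = process.communicate()  # kill, then collect what was written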

If you need to read the streams while the child process is still running, then the portable solution is to use threads (not tested):

from subprocess import Popen, PIPE
from threading import Thread
from queue import Queue  # Python 2: from Queue import Queue

def reader(pipe, queue):
    try:
        with pipe:
            for line in iter(pipe.readline, b''):
                queue.put((pipe, line))
    finally:
        queue.put(None)  # signal that this pipe has been exhausted

# note: no bufsize=1 here; line buffering isn't supported for binary
# pipes on Python 3
process = Popen(command, stdout=PIPE, stderr=PIPE)
q = Queue()
Thread(target=reader, args=[process.stdout, q]).start()
Thread(target=reader, args=[process.stderr, q]).start()
for _ in range(2):
    for source, line in iter(q.get, None):
        print("%s: %s" % (source, line.decode()), end="")

The order in which a process writes data to different pipes is lost once the data has been written: from the reading side there is no way to tell whether stdout was written before stderr.

You can try to read data simultaneously from multiple file descriptors in a non-blocking way as soon as data is available, but this only minimizes the probability that the order is incorrect.

This program should demonstrate this:

#!/usr/bin/env python

import os
import select
import subprocess

testapps = {
    'slow': '''
import os
import time
os.write(1, b'aaa')
time.sleep(0.01)
os.write(2, b'bbb')
time.sleep(0.01)
os.write(1, b'ccc')
''',
    'fast': '''
import os
os.write(1, b'aaa')
os.write(2, b'bbb')
os.write(1, b'ccc')
''',
    'fast2': '''
import os
os.write(1, b'aaa')
os.write(2, b'bbbbbbbbbbbbbbb')
os.write(1, b'ccc')
'''
}

def readfds(fds, maxread):
    # Yield (fd, chunk) pairs as data becomes available on any of the fds,
    # dropping an fd once it reaches EOF
    while fds:
        fdsin, _, _ = select.select(fds, [], [])
        for fd in fdsin:
            s = os.read(fd, maxread)
            if not s:
                fds.remove(fd)
                continue
            yield fd, s

def readfromapp(app, rounds=10, maxread=1024):
    with open('testapp.py', 'w') as f:
        f.write(testapps[app])

    results = {}
    for i in range(rounds):
        p = subprocess.Popen(['python', 'testapp.py'],
                             stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        data = b''
        for fd, s in readfds([p.stdout.fileno(), p.stderr.fileno()], maxread):
            data += s
        results[data] = results.get(data, 0) + 1

    print('running %i rounds %s with maxread=%i' % (rounds, app, maxread))
    for data, count in sorted(results.items()):
        print('%03i x %s' % (count, data.decode()))


print()
print("=> if output is produced slowly this should work as wished")
print("   and should return: aaabbbccc")
readfromapp('slow',  rounds=100, maxread=1024)

print()
print("=> now mostly aaacccbbb is returned, not as it should be")
readfromapp('fast',  rounds=100, maxread=1024)

print()
print("=> you could try to read data one by one, and return")
print("   e.g. a whole line only when LF is read")
print("   (b's should be finished before c's)")
readfromapp('fast',  rounds=100, maxread=1)

print()
print("=> but even this won't work ...")
readfromapp('fast2', rounds=100, maxread=1)

and outputs something like this:

=> if output is produced slowly this should work as wished
   and should return: aaabbbccc
running 100 rounds slow with maxread=1024
100 x aaabbbccc

=> now mostly aaacccbbb is returned, not as it should be
running 100 rounds fast with maxread=1024
006 x aaabbbccc
094 x aaacccbbb

=> you could try to read data one by one, and return
   e.g. a whole line only when LF is read
   (b's should be finished before c's)
running 100 rounds fast with maxread=1
003 x aaabbbccc
003 x aababcbcc
094 x abababccc

=> but even this won't work ...
running 100 rounds fast2 with maxread=1
003 x aaabbbbbbbbbbbbbbbccc
001 x aaacbcbcbbbbbbbbbbbbb
008 x aababcbcbcbbbbbbbbbbb
088 x abababcbcbcbbbbbbbbbb

I wrote something to do this a long time ago. I haven't yet ported it to Python 3, but it shouldn't be too difficult (patches accepted!)

If you run it standalone, you will see a lot of the different options. In any case, it allows you to distinguish stdout from stderr.

I know this question is very old, but this answer may help others who stumble upon this page while researching a solution to a similar situation, so I'm posting it anyway.

I've built a simple Python snippet that merges any number of pipes into a single one. Of course, as stated above, the order cannot be guaranteed, but this is as close as I think you can get in Python.

It spawns a thread for each of the pipes, reads them line by line, and puts the lines into a queue (which is FIFO). The generator then loops over the queue, yielding each line.

import threading, queue

def merge_pipes(**named_pipes):
    r'''
    Merges multiple pipes from subprocess.Popen (maybe other sources as well).
    The keyword argument keys will be used in the output to identify the source
    of the line.

    Example:
    p = subprocess.Popen(['some', 'call'],
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    outputs = {'out': log.info, 'err': log.warn}
    for name, line in merge_pipes(out=p.stdout, err=p.stderr):
        outputs[name](line)

    This will output stdout to the info logger, and stderr to the warning logger
    '''

    # Message type constants. These could also be placed outside of the
    # function; they live here so the function is fully self-contained.
    PIPE_OUTPUT = 1
    PIPE_CLOSED = 2

    # Queue that all reader threads write into
    output = queue.Queue()

    # Run body for the reader threads started below. It could easily be
    # moved outside of merge_pipes, but keeping it here makes the function
    # self-contained.
    def pipe_reader(name, pipe):
        r"""
        Reads a single pipe into the queue, line by line.
        """
        try:
            # subprocess pipes are binary by default, so the EOF sentinel
            # is b'' (an '' sentinel would never match and loop forever)
            for line in iter(pipe.readline, b''):
                output.put((PIPE_OUTPUT, name, line.rstrip()))
        finally:
            output.put((PIPE_CLOSED, name))

    # Start a reader thread for each pipe
    for name, pipe in named_pipes.items():
        t = threading.Thread(target=pipe_reader, args=(name, pipe))
        t.daemon = True
        t.start()

    # Count down the still-open pipes; once every reader has reported
    # PIPE_CLOSED we are done. (Counting "pipe opened" messages instead
    # would be racy: one pipe could open and close again before another
    # reader had reported in.)
    pipe_count = len(named_pipes)

    # Read the queue in order, blocking if there's no data
    while pipe_count > 0:
        data = output.get()
        if data[0] == PIPE_CLOSED:
            pipe_count -= 1
        else:
            yield data[1:]

This works for me (on Windows): https://github.com/waszil/subpiper

from subpiper import subpiper

def my_stdout_callback(line: str):
    print(f'STDOUT: {line}')

def my_stderr_callback(line: str):
    print(f'STDERR: {line}')

my_additional_path_list = [r'c:\important_location']

retcode = subpiper(cmd='echo magic',
                   stdout_callback=my_stdout_callback,
                   stderr_callback=my_stderr_callback,
                   add_path_list=my_additional_path_list)

According to Python's docs:

Popen.stdout If the stdout argument was PIPE, this attribute is a file object that provides output from the child process. Otherwise, it is None.

Popen.stderr If the stderr argument was PIPE, this attribute is a file object that provides error output from the child process. Otherwise, it is None.

The sample below does what you want. (Note that it reads all of stdout before touching stderr; as mentioned above, this can deadlock if the child fills the stderr pipe buffer first.)

test.py

print "I'm stdout"

raise Exception("I'm Error")

printer.py

import subprocess

p = subprocess.Popen(['python', 'test.py'],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)

print("Normal")
for line in p.stdout.readlines():
    print(line.rstrip().decode())

print("Error")
for line in p.stderr.readlines():
    print(line.rstrip().decode())

Output:

Normal
I'm stdout

Error
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    raise Exception("I'm Error")
Exception: I'm Error