How to capture inputs and outputs of a child process?

Submitted by 让人想犯罪 __ on 2020-07-08 04:58:40

Question


I'm trying to write a program that takes an executable name as an argument, runs the executable, and reports the inputs and outputs of that run. For example, consider a child program named "circle". The following would be the desired run of my program:

$ python3 capture_io.py ./circle
Enter radius of circle: 10
Area: 314.158997
[('output', 'Enter radius of circle: '), ('input',  '10\n'), ('output', 'Area: 314.158997\n')]

I decided to use the pexpect module for this job. It has a method called interact which lets the user interact with the child program as seen above. It also takes two optional parameters: output_filter and input_filter. From the documentation:

The output_filter will be passed all the output from the child process. The input_filter will be passed all the keyboard input from the user.

So this is the code I wrote:

capture_io.py

import sys
import pexpect

_stdios = []


def read(data):
    _stdios.append(("output", data.decode("utf8")))
    return data


def write(data):
    _stdios.append(("input", data.decode("utf8")))
    return data


def capture_io(argv):
    _stdios.clear()
    child = pexpect.spawn(argv[0], argv[1:])
    child.interact(input_filter=write, output_filter=read)
    child.wait()
    return _stdios


if __name__ == '__main__':
    stdios_of_child = capture_io(sys.argv[1:])
    print(stdios_of_child)

circle.c

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char* argv[]) {
    float radius, area;

    printf("Enter radius of circle: ");
    scanf("%f", &radius);

    if (radius < 0) {
        fprintf(stderr, "Negative radius values are not allowed.\n");
        exit(1);
    }

    area = 3.14159 * radius * radius;
    printf("Area: %f\n", area);
    return 0;
}

Which produces the following output:

$ python3 capture_io.py ./circle
Enter radius of circle: 10
Area: 314.158997
[('output', 'Enter radius of circle: '), ('input', '1'), ('output', '1'), ('input', '0'), ('output', '0'), ('input', '\r'), ('output', '\r\n'), ('output', 'Area: 314.158997\r\n')]

As you can see from the output, the input_filter function runs on every key press (and every echoed character is logged as output too), which leaves me with a mess. Is it possible to change this behaviour so that my input_filter runs only when Enter is pressed?

Or more generally, what would be the best way to achieve my goal (with or without pexpect)?


Answer 1:


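I don't think you'll be able to do that easily. However, buffering the data yourself until a carriage return arrives should work for you:

output_buffer = b''

def read(data):
    # The filter receives bytes, so the buffer must be bytes as well.
    global output_buffer
    output_buffer += data
    # Log the accumulated bytes once a carriage return shows up.
    if b'\r' in data:
        _stdios.append(("output", output_buffer.decode("utf8")))
        output_buffer = b''
    return data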
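
The same buffering idea can be packaged so that it does not rely on a module-level global. The following is only a sketch: make_line_filter and its parameters are hypothetical names, not part of pexpect, and it does not interpret backspaces or other editing keys.

```python
def make_line_filter(kind, log, encoding="utf8"):
    """Return a filter that logs (kind, text) once per completed line."""
    pending = bytearray()

    def line_filter(data):
        pending.extend(data)
        # Treat either a carriage return or a newline as the end of a line.
        if data.endswith((b"\r", b"\n")):
            log.append((kind, pending.decode(encoding)))
            pending.clear()
        return data  # pass the bytes through unchanged

    return line_filter


# Simulate the key presses from the question: '1', '0', Enter.
log = []
on_input = make_line_filter("input", log)
for key in (b"1", b"0", b"\r"):
    on_input(key)
print(log)
```

Such a filter could be passed as input_filter=make_line_filter("input", _stdios). The echoed characters would still reach the output filter one by one, so this addresses only the input side.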




Answer 2:


When I started to write a helper, I realized that the main problem is this: the input should be logged line-buffered, so that backspace and other editing are applied before the input reaches the program, but the output should be unbuffered, so that a prompt that is not terminated by a newline still gets logged.

To capture the output for logging, a pipe is needed, but when stdout is a pipe rather than a terminal, the C library switches it to full buffering, so the prompt would not appear until the buffer is flushed. A pseudoterminal is the well-known fix (the pexpect module is built around one), but a terminal carries both the input and the output, and we want to unbuffer only the output.
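
The pipe-versus-pseudoterminal difference can be observed directly. The sketch below is an illustration, not part of the original program: it starts a small Python child (standing in for ./circle) and asks it whether its stdout is a terminal, once through a pipe and once through a pty. The helper names are made up for this example.

```python
import os
import pty
import subprocess
import sys

# Child process that reports whether its own stdout is a terminal.
CHILD = [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"]


def stdout_is_tty_via_pipe():
    result = subprocess.run(CHILD, stdin=subprocess.DEVNULL,
                            stdout=subprocess.PIPE)
    return result.stdout.decode().strip()


def stdout_is_tty_via_pty():
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(CHILD, stdin=subprocess.DEVNULL, stdout=slave_fd)
    os.close(slave_fd)  # the child keeps its own copy
    chunks = []
    while True:
        try:
            data = os.read(master_fd, 1024)
        except OSError:
            break  # Linux raises EIO once the slave side is closed
        if not data:
            break  # other systems signal EOF with an empty read
        chunks.append(data)
    os.close(master_fd)
    proc.wait()
    return b"".join(chunks).decode().strip()


if __name__ == "__main__":
    print("pipe:", stdout_is_tty_via_pipe())
    print("pty: ", stdout_is_tty_via_pty())
```

A C program makes the same kind of check when it chooses its default stdout buffering, which is why the prompt from circle is held back when its output goes to a pipe.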

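Fortunately there is the stdbuf utility. On Linux it alters the buffering that the C library sets up in dynamically linked executables, so it is not universally usable: statically linked programs, and programs that configure their own buffering, are unaffected.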

I have modified a Python bidirectional copy program to log the data it copies. Combined with stdbuf, it produces the desired output.

import select
import os

STDIN = 0
STDOUT = 1

BUFSIZE = 4096

def main(cmd):
    ipipe_r, ipipe_w = os.pipe()
    opipe_r, opipe_w = os.pipe()
    if os.fork():
        # parent
        os.close(ipipe_r)
        os.close(opipe_w)
        fdlist_r = [STDIN, opipe_r]
        while True:
            ready_r, _, _ = select.select(fdlist_r, [], []) 
            if STDIN in ready_r:
                # STDIN -> program
                data = os.read(STDIN, BUFSIZE)
                if data:
                    yield('in', data)   # optional: convert to str
                    os.write(ipipe_w, data)
                else:
                    # send EOF
                    fdlist_r.remove(STDIN)
                    os.close(ipipe_w)
            if opipe_r in ready_r:
                # program -> STDOUT
                data = os.read(opipe_r, BUFSIZE)
                if not data:
                    # got EOF
                    break
                yield('out', data)
                os.write(STDOUT, data)
        os.wait()
    else:
        # child
        os.close(ipipe_w)
        os.close(opipe_r)
        os.dup2(ipipe_r, STDIN)
        os.dup2(opipe_w, STDOUT)
        # cmd[0] is both the program to look up on PATH and the child's argv[0]
        os.execvp(cmd[0], cmd)
        # not reached
        os._exit(127)

if __name__ == '__main__':
    log = list(main(['stdbuf', '-o0', './circle']))
    print(log)

It prints:

[('out', b'Enter radius of circle: '), ('in', b'12\n'), ('out', b'Area: 452.388947\n')]
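
Because the child's output is unbuffered, one logical message may arrive as several consecutive 'out' chunks. A possible post-processing step, sketched here with a hypothetical helper named merge_log, joins adjacent chunks of the same direction and then performs the optional conversion to str mentioned in the code comment. Decoding happens after merging so that a multi-byte character split across two reads is not broken.

```python
def merge_log(log, encoding="utf-8"):
    """Join consecutive chunks with the same direction, then decode to str."""
    merged = []
    for kind, data in log:
        if merged and merged[-1][0] == kind:
            # Same direction as the previous entry: append the bytes to it.
            merged[-1] = (kind, merged[-1][1] + data)
        else:
            merged.append((kind, data))
    return [(kind, data.decode(encoding)) for kind, data in merged]


# Made-up sample log in which the prompt arrived in two separate reads.
raw = [('out', b'Enter radius '), ('out', b'of circle: '),
       ('in', b'12\n'), ('out', b'Area: 452.388947\n')]
print(merge_log(raw))
```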



Answer 3:


Is it possible to change this behaviour so that my input_filter will run only when Enter is pressed?

Yes, you can do it by inheriting from pexpect.spawn and overriding the interact method. I will come to that soon.

As VPfB pointed out in their answer, you can't simply use a pipe, and I think it's worth mentioning that this issue is also addressed in pexpect's documentation, in the "Why not just use a pipe (popen())?" section.

You can use Python's pty module to solve this problem naturally. Unfortunately, I don't remember where I found the code, so I can't give credit, but here it is:

import os
import pty
import sys
import termios
import tty

_stdios = []

def _read(fd):
    data = os.read(fd, 1024)
    _stdios.append(("output", data.decode("utf8")))
    return data


def _stdin_read(fd):
    data = os.read(fd, 1024)
    _stdios.append(("input", data.decode("utf8")))
    return data


def _spawn(argv):
    pid, master_fd = pty.fork()
    if pid == pty.CHILD:
        os.execlp(argv[0], *argv)
    try:
        mode = tty.tcgetattr(pty.STDIN_FILENO)
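        # Raw mode is applied to the pty, not to our own stdin, so the
        # user's terminal keeps its normal line editing.
        tty.setraw(master_fd, termios.TCSANOW)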
        restore = True
    except tty.error:
        restore = False

    try:
        pty._copy(master_fd, _read, _stdin_read)
    except OSError:
        if restore:
            tty.tcsetattr(pty.STDIN_FILENO, tty.TCSAFLUSH, mode)

    os.close(master_fd)
    return os.waitpid(pid, 0)[1]


def capture_io_and_return_code(argv):
    _stdios.clear()
    return_code = _spawn(argv)
    return _stdios, return_code >> 8


if __name__ == '__main__':
    stdios, ret = capture_io_and_return_code(sys.argv[1:])
    print(stdios)
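
To see what tty.setraw changes, the terminal flags can be inspected before and after the call. The sketch below is an illustration under assumptions: it opens a fresh pty with pty.openpty() and applies raw mode to the slave end, whereas the code above applies it to master_fd; local_flags and describe are hypothetical helper names.

```python
import os
import pty
import termios
import tty

LFLAG = 3  # index of the local-mode flags in the list returned by tcgetattr


def local_flags(fd):
    return termios.tcgetattr(fd)[LFLAG]


def describe(flags):
    # ECHO: typed characters are echoed back.
    # ICANON: input is delivered line by line, after editing.
    return {"echo": bool(flags & termios.ECHO),
            "canonical": bool(flags & termios.ICANON)}


master_fd, slave_fd = pty.openpty()
before = describe(local_flags(slave_fd))
tty.setraw(slave_fd, termios.TCSANOW)
after = describe(local_flags(slave_fd))
os.close(master_fd)
os.close(slave_fd)

print("before:", before)
print("after: ", after)
```

With echo and canonical mode switched off on the pty, the child's side no longer echoes input or waits for line editing, while the user's real terminal, which is left untouched, still delivers whole lines.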

If you look at the source code of interact, you can see that it does almost the same thing as our implementation. The only difference is this line: tty.setraw(self.STDIN_FILENO), which puts the user's terminal into raw mode. So you can inherit from pexpect.spawn and implement a custom interact method:

import sys
import termios
import tty
import pexpect

_stdios = []

def read(data):
    _stdios.append(("output", data.decode("utf8")))
    return data


def write(data):
    _stdios.append(("input", data.decode("utf8")))
    return data


class CustomSpawn(pexpect.spawn):
    def interact(self, escape_character=chr(29),
                 input_filter=None, output_filter=None):
        self.write_to_stdout(self.buffer)
        self.stdout.flush()
        self._buffer = self.buffer_type()
        mode = tty.tcgetattr(self.STDIN_FILENO)
        tty.setraw(self.child_fd, termios.TCSANOW)  # Changed this line according to our needs
        if escape_character is not None and pexpect.PY3:
            escape_character = escape_character.encode('latin-1')
        try:
            # We are calling a private (name-mangled) method, which is generally considered bad practice.
            self._spawn__interact_copy(escape_character, input_filter, output_filter)
        finally:
            tty.tcsetattr(self.STDIN_FILENO, tty.TCSAFLUSH, mode)


def capture_io_and_return_code(argv):
    _stdios.clear()
    child = CustomSpawn(argv[0], argv[1:])
    child.interact(input_filter=write, output_filter=read)
    child.wait()
    return _stdios, child.status >> 8


if __name__ == '__main__':
    stdios, ret = capture_io_and_return_code(sys.argv[1:])
    print(stdios)

The drawback of the second solution is that we access a private method of the parent class and unavoidably repeat about 15 lines of code just to change one line. So you may prefer the pure pty solution.
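
Both versions compute the exit code with status >> 8, which is only meaningful when the child exited normally. A more defensive decoding might look like the sketch below; exit_code_from_status is a hypothetical helper, and os.waitstatus_to_exitcode requires Python 3.9 or later.

```python
import os


def exit_code_from_status(status):
    """Translate a wait status into an exit code (negative for signals)."""
    if hasattr(os, "waitstatus_to_exitcode"):  # Python 3.9+
        return os.waitstatus_to_exitcode(status)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        return -os.WTERMSIG(status)
    raise ValueError("unexpected wait status: %r" % status)


# circle exits with code 1 on a negative radius; on Linux the wait status
# for that is 1 << 8.
print(exit_code_from_status(1 << 8))
```

In the pexpect version, the exitstatus and signalstatus attributes that pexpect sets after wait() carry the same information without any manual decoding.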



Source: https://stackoverflow.com/questions/62288531/how-to-capture-inputs-and-outputs-of-a-child-process
