Very large input and piping using subprocess.Popen

北恋 2020-12-25 15:40

I have a pretty simple problem. I have a large file that goes through three steps: a decoding step using an external program, some processing in Python, and then recoding usi

5 Answers
  •  粉色の甜心
    2020-12-25 16:09

    However, all the data are buffered to memory ...

    Are you using subprocess.Popen.communicate()? By design, this function will wait for the process to finish, all the while accumulating the data in a buffer, and then return it to you. As you've pointed out, this is problematic if dealing with very large files.

    If you want to process the data while it is generated, you will need to write a loop using the poll() and .stdout.read() methods, then write that output to another socket/file/etc.
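
    A minimal sketch of such a loop, under some assumptions: the function name stream_process_output, the chunk_size default, and the sink argument are illustrative, not part of subprocess. Instead of testing poll() on every iteration, this version relies on read(n) returning an empty bytes object once the child closes its stdout, and only then calls wait() to collect the exit code.

```python
import subprocess
import sys


def stream_process_output(cmd, sink, chunk_size=64 * 1024):
    """Run cmd and copy its stdout to sink in bounded chunks.

    Hypothetical helper: cmd is an argument list, sink is any object
    with a write(bytes) method (an open binary file, a socket file, ...).
    Returns (exit_code, total_bytes_copied).
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    total = 0
    while True:
        # At most chunk_size bytes are held in memory at a time.
        chunk = proc.stdout.read(chunk_size)
        if not chunk:
            # Empty bytes means end of file: the child closed its stdout.
            break
        sink.write(chunk)
        total += len(chunk)
    proc.stdout.close()
    return proc.wait(), total


if __name__ == "__main__":
    # Demo child: a Python one-liner standing in for the real decoder.
    demo_cmd = [sys.executable, "-c", "print('hello from the child')"]
    stream_process_output(demo_cmd, sys.stdout.buffer)
```

    Because the parent drains the pipe continuously, the child is never left blocked on a full pipe buffer, and memory use stays bounded by chunk_size regardless of the input size.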

    Do be sure to heed the warnings in the documentation against doing this, as it can easily result in a deadlock (the parent process waits for the child process to generate data, while the child is in turn waiting for the parent process to empty the pipe buffer).
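
    One way to reduce that risk for the decode / process / recode workflow in the question is to make sure the parent handles only one pipe per child. The sketch below assumes the final output can go straight to a file; run_pipeline, transform, and the command arguments are hypothetical names, and the real decoder and encoder command lines are not known here.

```python
import subprocess


def run_pipeline(decode_cmd, encode_cmd, transform, out_path,
                 chunk_size=64 * 1024):
    """Stream decoder output through transform() into the encoder.

    Hypothetical helper: decode_cmd and encode_cmd are argument lists,
    transform maps a bytes chunk to a bytes chunk.
    Returns (decoder_exit_code, encoder_exit_code).
    """
    with open(out_path, "wb") as out_file:
        # The parent only reads from the decoder ...
        decoder = subprocess.Popen(decode_cmd, stdout=subprocess.PIPE)
        # ... and only writes to the encoder, whose output goes directly
        # to the file, so there is no second pipe the parent must drain.
        encoder = subprocess.Popen(encode_cmd, stdin=subprocess.PIPE,
                                   stdout=out_file)
        while True:
            chunk = decoder.stdout.read(chunk_size)
            if not chunk:
                break  # decoder closed its stdout
            encoder.stdin.write(transform(chunk))
        decoder.stdout.close()
        encoder.stdin.close()  # signals end of input to the encoder
        return decoder.wait(), encoder.wait()
```

    The design choice here is that a write to the encoder may block, but only until the encoder consumes data, and the encoder itself never waits on the parent because its output is a file. If the encoder's output had to come back to the parent through a pipe as well, a separate reader thread (or communicate(), for small data) would be needed instead.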
