Python 2.7 Popen: what does `close_fds` do?

Question

I have a web server in Python (2.7) that uses Popen to delegate some work to a child process:

import tempfile
from subprocess import Popen, PIPE

url_arg = "http://localhost/index.html?someparam=somevalue"
call = ('phantomjs', 'some/phantom/script.js', url_arg)

# Temporary files collect the child's stdout and stderr.
imageB64data = tempfile.TemporaryFile()
errordata = tempfile.TemporaryFile()

p = Popen(call, stdout=imageB64data, stderr=errordata, stdin=PIPE)
p.communicate(input="")

I am seeing intermittent issues where after some number of these Popens have occurred (roughly 64), the process runs out of file descriptors and is unable to function -- it becomes completely unresponsive and all threads seem to block forever if they attempt to open any files or sockets.

(Possibly relevant: the phantomjs child process loads a URL that calls back into the server that spawned it.)

Based on this Python bug report, I believe I need to set close_fds=True on all Popen calls from inside my server process in order to stop leaking file descriptors. However, I am unfamiliar with the machinery around exec-ing subprocesses and the inheritance of file descriptors, so much of the Popen documentation and the notes in the aforementioned bug report are unclear to me.

It sounds like it would actually close all open file descriptors (which includes active request sockets, log file handles, etc.) in my process before executing the subprocess. This sounds like it would be strictly better than leaking the sockets, but would still result in errors.

However, in practice, when I use close_fds=True during a web request, it seems to work fine and thus far I have been unable to construct a scenario where it actually closes any other request sockets, database requests, etc.

The docs state:

If close_fds is true, all file descriptors except 0, 1 and 2 will be closed before the child process is executed.

So my question is: is it "safe" and "correct" to pass close_fds=True to Popen in a multithreaded Python web server? Or should I expect this to have side effects if other requests are doing file/socket IO at the same time?

Answer 1:

I tried the following test with the subprocess32 backport of Python 3.2/3.3's subprocess:

import tempfile
import subprocess32 as subprocess

fp = open('test.txt', 'w')

fp.write("some stuff")

echoed = tempfile.TemporaryFile()
p = subprocess.Popen(("echo", "this", "stuff"), stdout=echoed, close_fds=True)
p.wait()
echoed.seek(0)

fp.write("whatevs")
fp.write(echoed.read())
fp.close()

and I got the expected result in test.txt: some stuffwhatevsthis stuff (followed by the newline that echo prints).

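So the "close" in close_fds does not mean that open files (sockets, etc.) in the parent process become unusable after a child process is executed. The descriptors are closed only in the child, after the fork and before the exec; the parent's own descriptors are left untouched.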

Also worth noting: subprocess32 defaults close_fds=True on POSIX systems, AFAICT. This implies to me that it is not as dangerous as it sounds.
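
To make the parent/child distinction concrete, here is a minimal sketch that asks a child process whether a descriptor it might have inherited is open. The names CHILD, fd_state_in_child and demo are made up for this illustration, and the child is simply another Python interpreter rather than phantomjs. The set_inheritable call only exists on Python 3.4+, where new descriptors are non-inheritable by default; on 2.7 descriptors are inherited without it.

```python
import os
import subprocess
import sys

# Script run by the child: report whether the descriptor number
# given on the command line is open in the child process.
CHILD = (
    "import os, sys\n"
    "try:\n"
    "    os.fstat(int(sys.argv[1]))\n"
    "    print('open')\n"
    "except OSError:\n"
    "    print('closed')\n"
)


def fd_state_in_child(fd, close_fds):
    """Spawn a child and return 'open' or 'closed' for fd as the child sees it."""
    out = subprocess.check_output(
        [sys.executable, "-c", CHILD, str(fd)], close_fds=close_fds)
    return out.decode().strip()


def demo():
    r, w = os.pipe()
    # Python 3.4+ creates non-inheritable descriptors (PEP 446), so mark
    # this one inheritable to reproduce the Python 2.7 behaviour.
    if hasattr(os, "set_inheritable"):
        os.set_inheritable(r, True)

    without_flag = fd_state_in_child(r, close_fds=False)
    with_flag = fd_state_in_child(r, close_fds=True)

    # The parent's descriptors still work after both children have run.
    os.write(w, b"ping")
    parent_data = os.read(r, 4)
    os.close(r)
    os.close(w)
    return without_flag, with_flag, parent_data


if __name__ == "__main__":
    print(demo())
```

On a POSIX system the first child should see the descriptor as open and the second as closed, while the parent can keep using the pipe either way.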

Answer 2:

I suspect that close_fds solves the problem of file descriptors leaking to subprocesses. Imagine opening a file, and then running some task using subprocess. Without close_fds, the file descriptor is copied to the subprocess, so even if the parent process closes the file, the file remains open due to the subprocess. Now, let's say we want to delete the directory with the file in another thread using shutil.rmtree. On a regular filesystem, this should not be an issue. The directory is just removed as expected. However, when the file resides on NFS, the following happens: First, Python will try to delete the file. Since the file is still in use, it gets renamed to .nfsXXX instead, where XXX is a long hexadecimal number. Next, Python will try to delete the directory, but that has become impossible because the .nfsXXX file still resides in it.
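
If only one particular file must not leak into children, a narrower option than close_fds=True is to mark that descriptor close-on-exec. This is a sketch of that alternative, not something the answer above proposes; set_cloexec and is_cloexec are hypothetical helper names, and the fcntl module is POSIX-only.

```python
import fcntl
import os


def set_cloexec(fd):
    """Mark fd close-on-exec so exec'd children do not inherit it."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


def is_cloexec(fd):
    """Return True if fd has the close-on-exec flag set."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


if __name__ == "__main__":
    r, w = os.pipe()
    set_cloexec(r)
    print(is_cloexec(r))
    os.close(r)
    os.close(w)
```

The flag has to be set on every descriptor that should stay private, so in a multithreaded server close_fds=True is usually the simpler way to cover descriptors opened by other threads.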

Source: https://stackoverflow.com/questions/44950592/python-2-7-popen-what-does-close-fds-do

Tags

python

exec

file-descriptor