Question
I have this code:
import os

pid = os.fork()
if pid == 0:
    os.environ['HOME'] = "rep1"
    external_function()
else:
    os.environ['HOME'] = "rep2"
    external_function()
and this code:
import os
from multiprocessing import Process, Pipe

def f(conn):
    os.environ['HOME'] = "rep1"
    external_function()
    conn.send(some_data)
    conn.close()

if __name__ == '__main__':
    os.environ['HOME'] = "rep2"
    external_function()
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    print parent_conn.recv()
    p.join()
The external_function initializes an external program by creating the necessary sub-directories inside the directory named by the HOME environment variable. It does this work only once per process.
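A minimal sketch of the kind of thing external_function does (the sub-directory names and the once-per-process guard below are only illustrative, not my real code):

import os

_initialized = False  # illustrative once-per-process guard

def external_function():
    # Sketch only: create the sub-directories the external program needs
    # under whatever directory HOME points to, once per process.
    global _initialized
    if _initialized:
        return
    home = os.environ['HOME']
    for sub in ('config', 'cache'):  # assumed sub-directory names
        path = os.path.join(home, sub)
        if not os.path.isdir(path):
            os.makedirs(path)
    _initialized = True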
With the first example, which uses os.fork(), the directories are created as expected. But with the second example, which uses multiprocessing, only the directories in rep2 get created.
Why isn't the second example creating directories in both rep1 and rep2?
Answer 1:
The answer you are looking for is addressed in detail here, together with an explanation of how the behaviour differs between operating systems.
One big issue is that the fork system call does not exist on Windows, so you cannot use that method when running a Windows OS. multiprocessing is a higher-level interface for executing part of the currently running program; like forking, it creates a copy of your process's current state. In other words, it takes care of the forking for you. So, where it is available, you can consider fork() the lower-level interface to forking a program, and the multiprocessing library the higher-level one.
Hope this helps.
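A quick way to see that relationship (a sketch, assuming Python 3.4+ for get_start_method; on Linux the default start method is 'fork', while Windows only has 'spawn'):

import multiprocessing
import os

def child():
    # With the 'fork' start method the child inherits a copy of the
    # parent's state, including the HOME value set just before start().
    print(os.environ.get('HOME'))

if __name__ == '__main__':
    # 'fork' on Linux; 'spawn' on Windows (and on macOS since Python 3.8).
    print(multiprocessing.get_start_method())
    os.environ['HOME'] = "rep_parent"
    p = multiprocessing.Process(target=child)
    p.start()
    p.join()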
Answer 2:
To answer your question directly: there must be some side effect of external_function that makes the results differ when the code runs in series rather than at the same time. That comes from how you set up your code, not from any difference between os.fork and multiprocessing.Process on systems where os.fork is supported.
The only real differences between os.fork and multiprocessing.Process are portability and library overhead: os.fork is not supported on Windows, and the multiprocessing framework is there to make multiprocessing.Process work anyway. Where it is supported, os.fork is what multiprocessing.Process calls underneath, as this answer backs up.
The important distinction, then, is that os.fork copies everything in the current process using Unix's fork, so at the moment of forking both processes are identical apart from their PIDs. On Windows this is emulated by rerunning all the setup code before the if __name__ == '__main__': guard, which is roughly the same as creating a subprocess with the subprocess library.
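To make that emulation visible, here is a small sketch (assuming Python 3.4+, where the 'spawn' start method can also be selected on Unix): every child re-imports the main module, so module-level code runs again, which is exactly why the if __name__ == '__main__': guard matters.

import multiprocessing
import os

# Module-level code: under the 'spawn' start method this line runs again
# in every child process, because the child re-imports the module.
print('module-level code ran in pid %d' % os.getpid())

def child():
    pass

if __name__ == '__main__':
    multiprocessing.set_start_method('spawn')  # Windows-like behaviour
    p = multiprocessing.Process(target=child)
    p.start()
    p.join()
    # The message above is printed twice: once by the parent,
    # once by the spawned child.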
For you, the two code snippets you provide do fairly different things: in the second snippet you call external_function in the main process before you start the new process, so the two calls run in series, although in separate processes. Also, the Pipe is unnecessary, as it reproduces no functionality from the first snippet.
In Unix, the code snippets:
import os

pid = os.fork()
if pid == 0:
    os.environ['HOME'] = "rep1"
    external_function()
else:
    os.environ['HOME'] = "rep2"
    external_function()
and:
import os
from multiprocessing import Process

def f():
    os.environ['HOME'] = "rep1"
    external_function()

if __name__ == '__main__':
    p = Process(target=f)
    p.start()
    os.environ['HOME'] = "rep2"
    external_function()
    p.join()
should do exactly the same thing, with only a little extra overhead from the multiprocessing library.
Without further information, we can't figure out what the issue is. If you can provide code that demonstrates the issue, that would help us help you.
Source: https://stackoverflow.com/questions/24041935/difference-in-behavior-between-os-fork-and-multiprocessing-process