Trying to understand multiprocessing with main in Python

Submitted by 核能气质少年 on 2019-12-10 10:04:18

Question


Using the code below I am getting strange output:

import sys
import time
from multiprocessing import Process

now = time.time()
print time.strftime("%Y%m%d %H:%M:%S", time.localtime(now))

fr= [1,2,3]
for row in fr:
    print 3

print 1

def worker():
    print 'worker line'
    time.sleep(1)
    sys.exit(1)

def main():
    print 'start worker'
    Process(target=worker, args=()).start()
    print 'main line'

if __name__ == "__main__":
    start_time = time.time()
    main()
    end_time = time.time()
    duration = end_time - start_time
    print "Duration: %s" % duration

The output is:

20120324 20:35:53
3
3
3
1
start worker
main line
Duration: 0.0
20120324 20:35:53
3
3
3
1
worker line 

I was thinking I would get this:

20120324 20:35:53
3
3
3
1
start worker
worker line
main line
Duration: 1.0

Why is this run twice? I am using Python 2.7 on 64-bit Windows:

20120324 20:35:53
3
3
3
1
worker line 

Answer 1:


The problem is basically that multiprocessing is really designed to run on a POSIX system, one with the fork(2) syscall. On those operating systems, a process can split into two: the child clones the parent's state, and both resume running from the same place, with the child now having a new process ID. In that situation, multiprocessing can arrange some mechanism to ship state from parent to child as needed, with the certainty that the child already has most of the needed Python state.

Windows does not have fork().

And so multiprocessing has to pick up the slack. This basically involves launching a brand new python interpreter running a multiprocessing child script. Almost immediately, the parent will ask the child to use something that is in the parent's state, and so the child will have to recreate that state from scratch, by importing your script into the child.

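So anything that happens at import time in your script will happen twice: once in the parent, and again in the child as it recreates the Python environment needed to serve the parent. That is why the timestamp, the three 3s and the 1 are printed a second time, while the code under if __name__ == "__main__": is not: the child does not load your script under the name __main__.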
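To make the import-time behaviour concrete without needing Windows, here is a minimal sketch that executes a tiny stand-in script under two different module names. It does not use multiprocessing at all; it only imitates the one detail that matters here, namely that the child loads the script under a name other than __main__. SCRIPT, run_as and log are hypothetical names made up for this illustration, and "__mp_main__" is the name Python 3 happens to use in the child (an implementation detail; Python 2.7 uses a different one).

```python
# A stand-in for the asker's script. `log` is injected by run_as() below
# so that the "printed" lines can be collected and compared.
SCRIPT = '''
log("module level")             # import-time code: runs on every load

def main():
    log("inside main")

if __name__ == "__main__":      # true only in the process the user started
    main()
'''


def run_as(module_name):
    """Execute SCRIPT as if it were loaded under `module_name`.

    Returns the list of lines the script logged.
    """
    lines = []
    namespace = {"__name__": module_name, "log": lines.append}
    exec(SCRIPT, namespace)
    return lines


# What the parent sees: module-level code plus the guarded main().
parent_output = run_as("__main__")
# What a freshly started child sees when it re-imports the script.
child_output = run_as("__mp_main__")

print(parent_output)  # ['module level', 'inside main']
print(child_output)   # ['module level']
```

The practical consequence is that any work which must happen only once (printing a banner, opening files, starting processes) belongs inside main() or under the guard, and the module level should contain only imports and definitions.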




Answer 2:


This is what I get when I run your code on Linux using Python 2.7.3:

20120324 23:05:49
3
3
3
1
start worker
main line
Duration: 0.0045280456543
worker line

I don't know why yours runs twice, but I can tell you why it doesn't return the expected duration time or print in the "correct" order.

When you start a process using multiprocessing, the launch is asynchronous. That is, the .start() method returns immediately in the parent process, so the parent can continue to work and do other things (like launch more processes) while the child process does its own thing in the background. This is why "main line" and the duration are printed before "worker line". If you want to block the parent process from proceeding until the child process ends, use the .join() method, like so:

def main():
    print 'start worker'
    p = Process(target=worker, args=())
    p.start()
    p.join()
    print 'main line'
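For completeness, here is a self-contained sketch of the same idea that also measures how long .start() and .join() each take. It is written so that it should run on both Python 2.7 and Python 3. The delay parameter and the timing printout are additions for illustration and are not part of the original code; the exact numbers will vary by machine and platform.

```python
import sys
import time
from multiprocessing import Process


def worker(delay):
    print('worker line')
    time.sleep(delay)
    sys.exit(1)  # becomes the child's exit code, visible as p.exitcode


def main(delay=1.0):
    print('start worker')
    p = Process(target=worker, args=(delay,))
    start = time.time()
    p.start()                         # returns as soon as the child is launched
    launched = time.time() - start
    p.join()                          # blocks until the child has exited
    total = time.time() - start
    print('main line')
    print('start() took %.3fs, start() + join() took %.3fs, exit code %s'
          % (launched, total, p.exitcode))


# The guard keeps a re-imported copy of this script (as on Windows)
# from launching further processes.
if __name__ == '__main__':
    main()
```

With the join() in place, the total duration should be at least the worker's sleep time, and 'main line' is printed only after the worker has finished.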


Source: https://stackoverflow.com/questions/9857006/trying-to-understand-multiprocessing-with-main-in-python
