Question
I'm trying to understand the multiprocessing.Process class. I want to collect data asynchronously and store it somewhere, but after the data has been stored it somehow gets lost. Here is my MWE:
from __future__ import print_function
import multiprocessing as mp
def append_test(tgt):
    tgt.append(42)
    print('Appended:', tgt)
l = []
p = mp.Process(target=lambda: append_test(l))
p.run()
print('l is', l)
p.start()
p.join()
print('l is', l)
When I run that snippet, I get:
Appended: [42]
l is [42]
Appended: [42, 42]
l is [42]
As you can see, there is a difference between calling run and using start/join. It has nothing to do with the order (I've tried calling run afterwards). Can someone explain how the second 42 gets lost? It seems to be stored at one point, but at another it definitely is not.
Just in case it makes any difference: I've tried Python 2.7 and Python 3.4, both with exactly the result described above.
Update: Apparently only start spawns a new process, in which run is then invoked. My actual problem therefore becomes the following question: how do I pass l to the spawned process so that I can see the actual result?
Solution: The following example shows how to pass shared data safely to a Process:
from __future__ import print_function
import multiprocessing as mp
def append_test(tgt):
    tgt.append(42)
    print('Appended:', tgt)
m = mp.Manager()
l = m.list()
p = mp.Process(target=lambda: append_test(l))
p.start()
p.join()
print('l is', l)
Further reading: Multiprocessing Managers Documentation
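A slightly more portable variant of the solution above might look like the sketch below. It assumes Python 3, passes the list through args instead of a lambda (a lambda target cannot be pickled under the spawn start method, which is the default on Windows), and guards the entry point with if __name__ == "__main__". The main() wrapper is a name introduced here purely for illustration.

```python
import multiprocessing as mp


def append_test(tgt):
    # Runs in the child process; tgt is a proxy to the manager's list.
    tgt.append(42)


def main():
    # main() is a hypothetical wrapper, not part of the original post.
    with mp.Manager() as manager:
        shared = manager.list()
        p = mp.Process(target=append_test, args=(shared,))
        p.start()
        p.join()
        # Copy into a plain list before the manager shuts down.
        return list(shared)


if __name__ == "__main__":
    print("l is", main())  # prints: l is [42]
```

Copying the proxy into a plain list before leaving the with block matters, because the proxy stops working once the manager process has shut down.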
Answer 1:
From Python: Essential Reference by Beazley:
p.run(): The method that runs when the process starts. By default, this invokes target that was passed to the Process constructor. ...
p.start(): Starts the process. This launches the subprocess that represents the process and invokes p.run() in that subprocess.
So, they are not meant to do the same thing. It looks to me like in this case p.run() is invoked in the current process, so it appends to the parent's l, whereas p.start() calls p.run() in a new process that works on its own copy of l. That copy already contains the first 42, which is why the child prints [42, 42], but the child's append never reaches the list in the parent.
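One way to see the difference described above is to report the process ID from inside the target. This is only a minimal sketch, assuming Python 3; report_pid and compare_run_and_start are hypothetical names introduced for illustration.

```python
import multiprocessing as mp
import os


def report_pid(queue):
    # Send back the PID of whichever process executes this function.
    queue.put(os.getpid())


def compare_run_and_start():
    queue = mp.Queue()

    # run() executes the target directly in the calling process.
    mp.Process(target=report_pid, args=(queue,)).run()
    pid_from_run = queue.get()

    # start() launches a child process, which then calls run() itself.
    child = mp.Process(target=report_pid, args=(queue,))
    child.start()
    pid_from_start = queue.get()  # read before join() to avoid blocking
    child.join()

    return os.getpid(), pid_from_run, pid_from_start


if __name__ == "__main__":
    parent, via_run, via_start = compare_run_and_start()
    print("parent PID:          ", parent)
    print("PID seen via run():  ", via_run)
    print("PID seen via start():", via_start)
```

The PID seen via run() should match the parent's, while the one seen via start() should differ, which is consistent with the child modifying its own copy of l.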
Answer 2:
run() executes the callable object that you passed as target. start() arranges for the object's run() method to be called in a separate process.
From multiprocessing's documentation:
run() Method representing the process’s activity.
You may override this method in a subclass. The standard run() method invokes the callable object passed to the object’s constructor as the target argument, if any, with sequential and keyword arguments taken from the args and kwargs arguments, respectively.
start() Start the process’s activity.
This must be called at most once per process object. It arranges for the object’s run() method to be invoked in a separate process.
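The documentation's remark about overriding run() in a subclass could be illustrated roughly as follows. This is a sketch under assumptions, not the documentation's own example: it assumes Python 3, and Doubler and double_in_child are hypothetical names. The result is sent back through a Pipe because attributes set on the object inside the child are not visible in the parent.

```python
import multiprocessing as mp


class Doubler(mp.Process):
    """Hypothetical Process subclass that doubles a number in the child."""

    def __init__(self, value, conn):
        super().__init__()
        self.value = value
        self.conn = conn

    def run(self):
        # Invoked in the child process by start(); this replaces the
        # default behaviour of calling the target callable.
        self.conn.send(self.value * 2)
        self.conn.close()


def double_in_child(value):
    parent_conn, child_conn = mp.Pipe()
    worker = Doubler(value, child_conn)
    worker.start()
    result = parent_conn.recv()  # blocks until the child sends its result
    worker.join()
    return result


if __name__ == "__main__":
    print(double_in_child(21))  # prints: 42
```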
Source: https://stackoverflow.com/questions/32703083/python-multiprocessing-process-start-with-local-variable