instance methods with multiprocessing.Pool

Submitted by 浪尽此生 on 2020-06-27 06:14:09

Question


I've been experimenting with a Pool object, passing an instance method as the func argument, and the behavior of the instance state surprised me. The instance appears to be reset on every chunk. For example:

import multiprocessing as mp
import logging

class Worker(object):
    def __init__(self):
        self.consumed = set()

    def consume(self, i):
        if i not in self.consumed:
            logging.info(i)
            self.consumed.add(i)

if __name__ == '__main__':
    n = 1
    logging.basicConfig(level='INFO', format='%(process)d: %(message)s')
    worker = Worker()

    with mp.Pool(processes=2) as pool:
        pool.map(worker.consume, [1] * 100, chunksize=n)

If n is set to 1, then 1 is logged every time; if n is set to 20, it is logged 5 times, and so on. What is the reason for this, and is there any way around it? I also wanted to use the Pool initializer argument with an instance method but hit similar issues.


Answer 1:


The instance method worker.consume is passed to the worker processes on a queue, and to do that it must be pickled. Pickling a bound method also pickles the instance it is bound to, so every task (that is, every chunk) carries its own pickled copy of the instance, and a new instance is created each time that copy is unpickled. You can see the gist of what's going on here, without any multiprocessing:

In [1]: import pickle

In [2]: class Thing:
   ...:     def __init__(self):
   ...:         self.called = 0
   ...:     def whoami(self):
   ...:         self.called += 1
   ...:         print("{} called {} times".format(self, self.called))

In [3]: pickled = pickle.dumps(Thing().whoami)

In [4]: pickle.loads(pickled)()
<__main__.Thing object at 0x10a636898> called 1 times

In [5]: pickle.loads(pickled)()
<__main__.Thing object at 0x10a6c6550> called 1 times

In [6]: pickle.loads(pickled)()
<__main__.Thing object at 0x10a6bd940> called 1 times

Each unpickled Thing is a different object (note the different addresses), and each has its own called attribute, which starts again from 0.



Source: https://stackoverflow.com/questions/48332391/instance-methods-with-multiprocessing-pool
