Python: Efficient workaround for multiprocessing a function that is a data member of a class, from within that class

春和景丽 2020-12-09 06:36

I'm aware of various discussions of the limitations of the multiprocessing module when dealing with functions that are data members of a class (due to pickling problems).

3 Answers
  •  不知归路
    2020-12-09 07:02

    Steven Bethard has posted a way to allow methods to be pickled/unpickled. You could use it like this:

    import multiprocessing as mp
    import copy_reg
    import types
    
    def _pickle_method(method):
        # Author: Steven Bethard
        # http://bytes.com/topic/python/answers/552476-why-cant-you-pickle-instancemethods
        func_name = method.im_func.__name__
        obj = method.im_self
        cls = method.im_class
        cls_name = ''
        if func_name.startswith('__') and not func_name.endswith('__'):
            cls_name = cls.__name__.lstrip('_')
        if cls_name:
            func_name = '_' + cls_name + func_name
        return _unpickle_method, (func_name, obj, cls)
    
    def _unpickle_method(func_name, obj, cls):
        # Author: Steven Bethard
        # http://bytes.com/topic/python/answers/552476-why-cant-you-pickle-instancemethods
        for cls in cls.mro():
            try:
                func = cls.__dict__[func_name]
            except KeyError:
                pass
            else:
                break
        return func.__get__(obj, cls)
    
    # This call to copy_reg.pickle allows you to pass methods as the first arg to
    # mp.Pool methods. If you comment out this line, `pool.map(self.foo, ...)` results in
    # PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup
    # __builtin__.instancemethod failed
    
    copy_reg.pickle(types.MethodType, _pickle_method, _unpickle_method)
    
    class MyClass(object):
    
        def __init__(self):
            self.my_args = [1,2,3,4]
            self.output  = {}
    
        def my_single_function(self, arg):
            return arg**2
    
        def my_parallelized_function(self):
            # Use map or map_async to map my_single_function onto the
            # list of self.my_args, and append the return values into
            # self.output, using each arg in my_args as the key.
    
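            # The result should make self.output become
            # {1:1, 2:4, 3:9, 4:16}
            # Note: `pool` is the module-level mp.Pool created further down,
            # so it must exist before this method is called.
            self.output = dict(zip(self.my_args,
                                   pool.map(self.my_single_function, self.my_args)))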
    

    Then

    pool = mp.Pool()   
    foo = MyClass()
    foo.my_parallelized_function()
    

    yields

    print foo.output
    # {1: 1, 2: 4, 3: 9, 4: 16}
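
To see what the two helpers do without involving a pool at all, here is a rough sketch in Python 3 spelling. The names `reduce_method`, `rebuild_method`, `Base` and `Child` are made up for illustration, and using `type(obj)` in place of `im_class` is my assumption rather than part of the original recipe. It only covers ordinary instance methods, and like the original it presumably won't find a private method that was defined on a base class.

```python
def reduce_method(method):
    # Python 3 spelling of _pickle_method: __func__/__self__ replace
    # im_func/im_self; type(obj) stands in for im_class (an assumption).
    func_name = method.__func__.__name__
    obj = method.__self__
    cls = type(obj)
    # Private names such as __foo are stored in the class dict as _Cls__foo.
    if func_name.startswith('__') and not func_name.endswith('__'):
        func_name = '_' + cls.__name__.lstrip('_') + func_name
    return rebuild_method, (func_name, obj, cls)


def rebuild_method(func_name, obj, cls):
    # Walk the MRO so that inherited methods are found as well.
    for klass in cls.__mro__:
        if func_name in klass.__dict__:
            # Re-bind the plain function to the instance.
            return klass.__dict__[func_name].__get__(obj, klass)
    raise AttributeError(func_name)


class Base:
    def square(self, x):
        return x * x


class Child(Base):
    def __init__(self):
        self.offset = 10

    def __shifted(self, x):
        return x + self.offset

    def get_private(self):
        # Inside the class body this resolves to self._Child__shifted.
        return self.__shifted


child = Child()

rebuild, args = reduce_method(child.square)          # inherited method
print(args[0], rebuild(*args)(4))                    # square 16

rebuild, args = reduce_method(child.get_private())   # name-mangled method
print(args[0], rebuild(*args)(1))                    # _Child__shifted 11
```

The first element of each returned tuple is the callable that pickle would invoke on the receiving side, and the second is the argument tuple it would be called with, which is the contract a reducer registered through `copy_reg.pickle` is expected to follow.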
    

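The code above is Python 2 (`copy_reg`, `im_func`, the `print` statement). On a reasonably recent Python 3, bound methods of picklable instances can be pickled out of the box, so the registration step should not be needed. A minimal sketch under that assumption follows. `run_with_pool` is a hypothetical driver, and passing the pool in as an argument is my own choice, not something the answer prescribes.

```python
import multiprocessing as mp
import pickle


class MyClass:
    def __init__(self):
        self.my_args = [1, 2, 3, 4]
        self.output = {}

    def my_single_function(self, arg):
        return arg ** 2

    def my_parallelized_function(self, pool):
        # pool.map pickles the bound method (and with it a copy of self)
        # in order to ship it to the worker processes.
        results = pool.map(self.my_single_function, self.my_args)
        self.output = dict(zip(self.my_args, results))


def run_with_pool():
    # Hypothetical driver; call it under `if __name__ == "__main__":`.
    foo = MyClass()
    with mp.Pool() as pool:
        foo.my_parallelized_function(pool)
    return foo.output


# The pickling step can be checked on its own, without starting workers.
foo = MyClass()
restored = pickle.loads(pickle.dumps(foo.my_single_function))
print(restored(3))  # 9
```

Calling `run_with_pool()` from a script's `if __name__ == "__main__":` block should give back the same dictionary as in the answer. Keep in mind that each worker operates on a pickled copy of the instance, so anything a method writes to `self` inside a worker does not come back to the parent; only the return values do.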