Can't pickle Pyparsing expression with setParseAction() method. Needed for multiprocessing

Submitted by 我的梦境 on 2020-05-29 09:44:53

Question


My original issue is that I am trying to do the following:

from multiprocessing import Pool

def submit_decoder_process(decoder, input_line):
    decoder.process_line(input_line)
    return decoder

# Inside a method of the class that owns the pool:
self.pool = Pool(processes=num_of_processes)
self.pool.apply_async(submit_decoder_process, [decoder, input_line]).get()

The decoder is a bit involved to describe here, but the important point is that it is an object initialized with a PyParsing expression that calls setParseAction(). Such an object cannot be pickled, and because multiprocessing relies on pickle, the code above fails.

Now, here is the pickle/PyParsing problem that I have isolated and simplified. The following code yields an error message due to pickle failure.

import pickle
from pyparsing import *

def my_pa_func():
    pass

pickle.dumps(Word(nums).setParseAction(my_pa_func))

Error message:

pickle.PicklingError: Can't pickle <function wrapper at 0x00000000026534A8>: it's not found as pyparsing.wrapper

If you remove the call to .setParseAction(my_pa_func), it works with no problems:

pickle.dumps(Word(nums))

How can I get around this? Multiprocessing uses pickle, so I guess I can't avoid it. The pathos package, which supposedly uses dill, is not mature enough; at least, I am having problems installing it on my 64-bit Windows machine. I am really scratching my head here.


Answer 1:


OK, here is the solution, inspired by rocksportrocker in "Python multiprocessing pickling error".

The idea is to dill the object that can't be pickled while passing it back and forth between processes and then "undill" it after it has been passed:

from multiprocessing import Pool
import dill

def submit_decoder_process(decoder_dill, input_line):
    decoder = dill.loads(decoder_dill)  # undill after it was passed to a pool process
    decoder.process_line(input_line)
    return dill.dumps(decoder)  # dill before passing back to parent process

self.pool = Pool(processes=num_of_processes)

# Dill before sending to a pool process
decoder_processed = dill.loads(self.pool.apply_async(submit_decoder_process, [dill.dumps(decoder), input_line]).get())



Answer 2:


https://docs.python.org/2/library/pickle.html#what-can-be-pickled-and-unpickled

multiprocessing.Pool uses the pickle protocol to serialize the function and module names (in your example, setParseAction and pyparsing), which are delivered through a pipe to the child process.

Once the child process receives them, it imports the module and tries to call the function. The problem is that what you're passing is not a function but a method. To resolve it, the pickle protocol would have to be clever enough to build the 'Word' object with the 'user' parameter and then call the setParseAction method. Because handling these cases is too complicated, the pickle protocol prevents you from serializing functions that are not defined at the top level of a module.
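The top-level restriction can be reproduced with the standard library alone. The sketch below is only an illustration: wrap_action is a hypothetical stand-in for a library that wraps a callback in a nested function, which is what the "pyparsing.wrapper" name in the question's error message suggests is happening. It is not pyparsing's actual code.

```python
import pickle

def top_level_action(tokens):
    """Module-level function: pickle stores only its module and name."""
    return tokens

def wrap_action(func):
    """Hypothetical stand-in for a library wrapping a callback in a nested function."""
    def wrapper(*args):
        return func(*args)
    return wrapper

def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        # The exception type raised for local objects differs between Python versions.
        return False

print(can_pickle(top_level_action))               # module-level function
print(can_pickle(wrap_action(top_level_action)))  # nested wrapper around it
```

The nested wrapper cannot be looked up by module and name, so pickle refuses it even though the function it wraps is picklable on its own.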

To solve your issue, either instruct the pickle module how to serialize the setParseAction method (https://docs.python.org/2/library/pickle.html#pickle-protocol) or refactor your code so that what is passed to Pool.apply_async is serializable.
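One way to "instruct" pickle, sketched here under assumptions, is to define __getstate__ and __setstate__ on the object that owns the parser, so the unpicklable part is dropped before pickling and rebuilt afterwards. Decoder, to_int and _build are hypothetical names, and the lambda merely plays the role of the wrapped parse action; a real decoder would rebuild its pyparsing expression in _build instead.

```python
import pickle

def to_int(token):
    """Module-level conversion function, picklable by reference."""
    return int(token)

class Decoder:
    """Hypothetical decoder that holds an unpicklable wrapped callback."""

    def __init__(self, action):
        self.action = action  # should be a module-level function
        self.results = []
        self._build()

    def _build(self):
        # Stand-in for building the parser and attaching the parse action;
        # the lambda imitates a library's nested wrapper and cannot be pickled.
        action = self.action
        self._wrapped = lambda token: action(token)

    def process_line(self, line):
        self.results.extend(self._wrapped(tok) for tok in line.split())

    def __getstate__(self):
        # Leave out the unpicklable wrapper; keep everything else.
        state = self.__dict__.copy()
        del state["_wrapped"]
        return state

    def __setstate__(self, state):
        # Restore the plain attributes, then rebuild the wrapper.
        self.__dict__.update(state)
        self._build()

decoder = Decoder(to_int)
decoder.process_line("1 2 3")
clone = pickle.loads(pickle.dumps(decoder))  # survives the round trip
clone.process_line("4")
print(clone.results)  # [1, 2, 3, 4]
```

Because the accumulated results travel with the pickled state, an object written this way could be passed to a pool worker and returned again, as in the original question.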

What if you pass the Word object to the child process and let it call Word().setParseAction() there?




Answer 3:


I'd suggest pathos.multiprocessing, as you mention. Of course, I'm the pathos author, so I guess that's not a surprise. It appears that there might be a distutils bug that you are running into, as referenced here: https://github.com/uqfoundation/pathos/issues/49.

Your solution using dill is a good workaround. You might also be able to forgo installing the entire pathos package and just install the pathos fork of the multiprocessing package (which uses dill instead of pickle). You can find it here: http://dev.danse.us/packages or here: https://github.com/uqfoundation/pathos/tree/master/external.



Source: https://stackoverflow.com/questions/27883574/cant-pickle-pyparsing-expression-with-setparseaction-method-needed-for-multi
