Joblib Parallel + Cython hanging forever

只愿长相守 posted on 2021-02-10 15:44:12

Question


I have a very weird problem while creating a Python extension with Cython that uses joblib.Parallel.

The following code works as expected:

from joblib import Parallel, delayed
from math import sqrt

print(Parallel(n_jobs=4)(delayed(sqrt)(x) for x in range(4)))

The following code hangs forever:

from joblib import Parallel, delayed

def mult(x):
    return x*3

print(Parallel(n_jobs=4)(delayed(mult)(x) for x in range(4)))

I have no idea why. I use the following setup.py:

from distutils.core import setup
from Cython.Build import cythonize

setup(
    ext_modules = cythonize("file.pyx")
)

I build the extension with python setup.py build_ext --inplace and import it with import file.

Thank you!


Answer 1:


After some time I finally found the solution: there is a deadlock while pickling the program state to send it to the worker processes. I am not entirely sure of the cause, but from inspecting the source code it looks as if new threads are created to pickle the objects, and those threads are the ones that cause the deadlock.

Once the processes have been created they run normally, so creating the processes manually with the multiprocessing library fixes the problem.
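A minimal sketch of what creating the processes manually could look like, using only the standard library. The names worker and run_in_processes are hypothetical, and mult is a plain-Python stand-in for the function from the question; in the real case it would be imported from the compiled extension. This is one possible arrangement, not necessarily the code the answer's author used.

```python
import multiprocessing as mp


def mult(x):
    # Stand-in for the function exported by the compiled extension.
    return x * 3


def worker(x, queue):
    # Runs in the child process: compute one result and send it back.
    queue.put((x, mult(x)))


def run_in_processes(values, start_method=None):
    # start_method=None uses the platform's default start method.
    values = list(values)
    ctx = mp.get_context(start_method)
    queue = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(x, queue)) for x in values]
    for p in procs:
        p.start()
    # Collect results before joining so no child blocks on a full pipe.
    results = dict(queue.get(timeout=60) for _ in procs)
    for p in procs:
        p.join()
    # Return the results in input order.
    return [results[x] for x in values]


if __name__ == "__main__":
    print(run_in_processes(range(4)))
```

Each value gets its own process here, which is only sensible for a handful of tasks; for larger workloads a pool is the more usual choice.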

Alternatively, you can use multiprocessing.Pool and specify the start method explicitly:

from multiprocessing import get_context

if __name__ == '__main__':
    with get_context("spawn").Pool() as pool:
        ...

You can freely choose spawn or forkserver as start_method.
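For reference, a complete version of the Pool approach might look like the sketch below. The helper name triple_all is hypothetical. With spawn or forkserver the workers look the callable up by name, so it has to live in an importable module; here operator.mul from the standard library stands in for a function exported by the compiled extension.

```python
import operator
from functools import partial
from multiprocessing import get_all_start_methods, get_context


def triple_all(values, start_method="spawn"):
    # partial(operator.mul, 3) is pickled by reference, so each worker
    # can import it without re-running the parent's code.
    with get_context(start_method).Pool(processes=2) as pool:
        return pool.map(partial(operator.mul, 3), values)


if __name__ == "__main__":
    # forkserver is not available on every platform, so check first.
    available = get_all_start_methods()
    for method in ("spawn", "forkserver"):
        if method in available:
            print(method, triple_all(range(4), start_method=method))
```

The if __name__ == "__main__" guard matters with these start methods: the child processes import the main module again, and without the guard they would try to start pools of their own.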

Visit this page if you want more information.



Source: https://stackoverflow.com/questions/53497078/joblib-parallel-cython-hanging-forever
