scikit-learn's GridSearchCV stops working when n_jobs>1

渐次进展 2020-12-31 23:54

I previously asked a question here and came up with the following lines of code:

parameters = [{'weights': ['uniform'], 'n_neighbors': [5, 10, 20, 30, 40, 50, 60, 70


        
2 Answers
  •  遥遥无期
    2021-01-01 00:30

    libdispatch.dylib, part of Grand Central Dispatch (GCD), is used internally by Accelerate, OS X's built-in BLAS implementation, whenever you call numpy.dot. The GCD runtime does not work when a program calls the POSIX fork syscall without a subsequent exec syscall, which makes any Python program that uses the multiprocessing module prone to crash. sklearn's GridSearchCV uses the Python multiprocessing module for parallelization.

    Under Python 3.4 and later, you can force Python multiprocessing to use the forkserver start method instead of the default fork mode to work around this problem, for instance at the beginning of the main file of your program:

    if __name__ == "__main__":
        import multiprocessing as mp
        # Call once, before any worker processes are created.
        mp.set_start_method('forkserver')
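
    As a rough illustration of the same idea without changing the global start method, the sketch below uses only the standard library. The helper name pick_safe_start_method and the toy square function are made up for this example, and whether a given scikit-learn/joblib version honors the chosen start method is an assumption that should be checked against the installed versions.

```python
import multiprocessing as mp


def pick_safe_start_method(preferred=("forkserver", "spawn")):
    """Return the first start method in `preferred` supported on this platform.

    Hypothetical helper: both candidates avoid a bare fork() without exec().
    """
    available = mp.get_all_start_methods()
    for method in preferred:
        if method in available:
            return method
    raise RuntimeError(f"none of {preferred} is available; got {available}")


def square(x):
    # Toy workload standing in for real work done in a worker process.
    return x * x


if __name__ == "__main__":
    # get_context() returns an object with the multiprocessing API bound to
    # one start method, leaving the process-wide default untouched.
    ctx = mp.get_context(pick_safe_start_method())
    with ctx.Pool(processes=2) as pool:
        print(pool.map(square, range(5)))
```

    forkserver is not available on every platform, which is why the sketch falls back to spawn.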
    

    Alternatively, you can rebuild numpy from source and make it link against ATLAS or OpenBLAS instead of OSX Accelerate. The numpy developers are working on binary distributions that include either ATLAS or OpenBLAS by default.
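
    Before rebuilding, it may help to check which BLAS the installed numpy is linked against. The sketch below is a heuristic only: it captures the text printed by numpy.show_config() and searches it for the word "accelerate". The function name numpy_links_accelerate is made up for this example, and the exact layout of the printed configuration varies between numpy versions.

```python
import contextlib
import io


def numpy_links_accelerate():
    """Return True/False based on numpy's build info, or None if numpy is missing.

    Heuristic sketch: only looks for the substring "accelerate" in the
    configuration text that numpy prints.
    """
    try:
        import numpy as np
    except ImportError:
        return None
    buffer = io.StringIO()
    # show_config() prints to stdout, so capture it instead of parsing objects.
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return "accelerate" in buffer.getvalue().lower()


if __name__ == "__main__":
    print("numpy appears linked against Accelerate:", numpy_links_accelerate())
```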
