Parallelise python loop with numpy arrays and shared-memory

front-end · open · 3 answers · 2082 views
Happy的楠姐 2020-12-23 12:31

I am aware of several questions and answers on this topic, but haven't found a satisfactory answer to this particular problem:

What is the easiest way to do a simple shared-memory parallelisation of a Python loop over numpy arrays?

3 Answers
  •  别那么骄傲
    2020-12-23 13:05

    With Cython parallel support:

    # asd.pyx
    from cython.parallel cimport prange
    
    import numpy as np
    
    def foo():
        cdef int i, j, n
    
        x = np.zeros((200, 2000), float)
    
        n = x.shape[0]
        # prange farms the rows out to OpenMP threads; the GIL is
        # re-acquired to call into NumPy, but np.cos releases it
        # internally, so the rows are still processed in parallel.
        for i in prange(n, nogil=True):
            with gil:
                for j in range(100):
                    x[i,:] = np.cos(x[i,:])
    
        return x
    

    On a 2-core machine:

    $ cython asd.pyx
    $ gcc -fPIC -fopenmp -shared -o asd.so asd.c -I/usr/include/python2.7
    $ export OMP_NUM_THREADS=1
    $ time python -c 'import asd; asd.foo()'
    real    0m1.548s
    user    0m1.442s
    sys 0m0.061s
    
    $ export OMP_NUM_THREADS=2
    $ time python -c 'import asd; asd.foo()'
    real    0m0.602s
    user    0m0.826s
    sys 0m0.075s
    

    This runs fine in parallel, since np.cos (like other ufuncs) releases the GIL.

    If you want to use this interactively:

    # asd.pyxbld
    def make_ext(modname, pyxfilename):
        from distutils.extension import Extension
        return Extension(name=modname,
                         sources=[pyxfilename],
                         extra_link_args=['-fopenmp'],
                         extra_compile_args=['-fopenmp'])
    

    and (remove asd.so and asd.c first):

    >>> import pyximport
    >>> pyximport.install(reload_support=True)
    >>> import asd
    >>> q1 = asd.foo()
    # Go to an editor and change asd.pyx
    >>> reload(asd)
    >>> q2 = asd.foo()
    

    So yes, in some cases you can parallelize just by using threads. OpenMP is just a fancy wrapper for threading, so Cython is only needed here for the nicer syntax. Without Cython, you can use the threading module --- it works similarly to multiprocessing (and probably more robustly), but you don't need to do anything special to declare arrays as shared memory.

    However, not all operations release the GIL, so YMMV for the performance.
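    The threading approach above can be sketched in pure Python; a minimal illustration (the function names `iterate_rows` and `foo_threaded` are mine, not from the original answer). NumPy arrays are plainly visible to all threads, and because ufuncs such as np.cos release the GIL, the workers can genuinely overlap:

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def iterate_rows(x, rows):
    # Each thread owns a disjoint set of rows, so no locking is needed.
    for i in rows:
        for _ in range(100):
            x[i, :] = np.cos(x[i, :])

def foo_threaded(n_threads=2):
    x = np.zeros((200, 2000), float)
    # Split the row indices into one contiguous chunk per thread.
    chunks = np.array_split(np.arange(x.shape[0]), n_threads)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list(...) forces evaluation; shutdown on exit joins the threads.
        list(pool.map(lambda rows: iterate_rows(x, rows), chunks))
    return x
```

    The speedup you actually see depends on how much of the per-row work releases the GIL, per the caveat above.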

    ***
    

    And another possibly useful link, scraped from other Stack Overflow answers --- joblib, another interface to multiprocessing: http://packages.python.org/joblib/parallel.html
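    A minimal sketch of that joblib interface (assuming the third-party joblib package is installed; `iterate_cos` and `foo_joblib` are illustrative names, not from the answer). Unlike the threading version, process-based workers get copies of the data, so results are returned and reassembled rather than written into a shared array:

```python
import numpy as np
from joblib import Parallel, delayed

def iterate_cos(row, reps=100):
    # Runs in a worker process on its own copy of the row.
    for _ in range(reps):
        row = np.cos(row)
    return row

def foo_joblib(n_jobs=2):
    x = np.zeros((200, 2000), float)
    # Dispatch one task per row; joblib handles batching and workers.
    rows = Parallel(n_jobs=n_jobs)(
        delayed(iterate_cos)(x[i, :]) for i in range(x.shape[0]))
    return np.vstack(rows)
```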
