Python joblib performance

Submitted by 寵の児 on 2021-02-10 16:01:40

Question


I need to run an embarrassingly parallel for loop. After a quick search, I found the joblib package for Python and ran a simple test based on the example on the package's website. Here is the test:

from math import sqrt
from joblib import Parallel, delayed
import multiprocessing

# Serial list comprehension, timed with IPython's %timeit
%timeit [sqrt(i ** 2) for i in range(10)]
# result: 3.89 µs ± 38.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

num_cores = multiprocessing.cpu_count()
# Same computation dispatched through joblib, one job per core
%timeit Parallel(n_jobs=num_cores)(delayed(sqrt)(i ** 2) for i in range(10))
# result: 600 ms ± 40 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

If I understand the results correctly, using joblib does not increase the speed but actually makes the code slower. Did I miss something here? Thank you.


Answer 1:


Joblib creates new processes to run the functions you want to execute in parallel. However, creating processes takes time (around 500 ms), especially now that joblib uses spawn rather than fork to create them.

Because the function you want to run in parallel is very fast, the %timeit result here mostly shows the overhead of process creation. If you choose a function whose run time is not negligible compared to the time required to start new processes, you will see an improvement in performance.
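
As a back-of-the-envelope illustration of that trade-off, the sketch below compares a serial estimate with a parallel one. The helper names estimated_serial_seconds and estimated_parallel_seconds are hypothetical, the 0.5 s start-up cost is the rough figure quoted above, and n_jobs=4 is an assumed core count. The model also ignores pickling and scheduling costs, so treat it as a rough guide rather than a description of joblib's internals.

```python
import math


def estimated_serial_seconds(n_tasks, task_seconds):
    """Serial estimate: every task runs one after another."""
    return n_tasks * task_seconds


def estimated_parallel_seconds(n_tasks, task_seconds, n_jobs, startup_seconds=0.5):
    """Parallel estimate: a fixed worker start-up cost (assumed, not measured)
    plus the tasks executed in waves of n_jobs at a time."""
    waves = math.ceil(n_tasks / n_jobs)
    return startup_seconds + waves * task_seconds


# Ten sqrt calls: about 3.89 µs in total in the question, so ~0.389 µs each.
tiny_serial = estimated_serial_seconds(10, 0.389e-6)
tiny_parallel = estimated_parallel_seconds(10, 0.389e-6, n_jobs=4)

# Four tasks that each take 1 second.
slow_serial = estimated_serial_seconds(4, 1.0)
slow_parallel = estimated_parallel_seconds(4, 1.0, n_jobs=4)

print("tiny tasks: serial {:.6f}s, parallel {:.6f}s".format(tiny_serial, tiny_parallel))
print("slow tasks: serial {:.2f}s, parallel {:.2f}s".format(slow_serial, slow_parallel))
```

Under these assumptions the start-up cost dominates the tiny tasks, while for the slow tasks it is small next to the work that is saved.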

Here is a sample you can run to test this:

import time
import joblib
from joblib import Parallel, delayed


def f(x):
    time.sleep(1)
    return x


def bench_joblib(n_jobs):
    start_time = time.time()
    Parallel(n_jobs=n_jobs)(delayed(f)(x) for x in range(4))
    print('running 4 times f using n_jobs = {} : {:.2f}s'.format(
        n_jobs, time.time()-start_time))


if __name__ == "__main__":
    bench_joblib(1)
    bench_joblib(4)

With Python 3.7 and joblib 0.12.5, I got:

running 4 times f using n_jobs = 1 : 4.01s
running 4 times f using n_jobs = 4 : 1.34s


Source: https://stackoverflow.com/questions/48349980/python-joblib-performance
