Question
I'm using Joblib to cache the results of a computationally expensive function in my Python script. The function's input arguments and return values are numpy arrays. The cache works fine for a single run of the script. Now I want to spawn multiple runs of the script in parallel to sweep a parameter in an experiment. (The definition of the function remains the same across all runs.)
Is there a way to share the joblib cache among multiple Python scripts running in parallel? This would save a lot of function evaluations that are repeated across different runs but do not repeat within a single run. I couldn't find anything about this in Joblib's documentation.
Answer 1:
Specify a common, fixed cachedir and decorate the function that you want to cache:
from joblib import Memory

cachedir = "/path/to/shared_cache"   # placeholder: any directory every run can reach
mem = Memory(location=cachedir)      # older joblib versions spell this Memory(cachedir=cachedir)

@mem.cache
def f(arguments):
    """do things"""
    pass
or simply
def g(arguments):
    pass

cached_g = mem.cache(g)   # wraps an existing function without the decorator
Then, even if you are working across processes, or even across machines, as long as all instances of your program have access to cachedir, common function calls can be cached there transparently.
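As a minimal end-to-end sketch of how this plays out in a parameter sweep (the script name, cache path, and example computation below are illustrative assumptions, not part of the original answer): every parallel run points Memory at the same directory, so a call whose arguments have already been evaluated by any other run is read back from disk instead of being recomputed.

# sweep_run.py -- hypothetical sketch; launch several copies in parallel, e.g.:
#   python sweep_run.py 0.1 &  python sweep_run.py 0.2 &
import sys
import numpy as np
from joblib import Memory

# every run must point at the SAME directory for the cache to be shared
mem = Memory(location="/path/to/shared_cache", verbose=0)

@mem.cache
def expensive(x):
    """Stand-in for the costly numpy computation."""
    return np.linalg.matrix_power(x, 50)

if __name__ == "__main__":
    param = float(sys.argv[1])   # the swept parameter
    common = np.eye(100)         # identical across runs: computed once, a cache hit everywhere else
    unique = common * param      # run-specific: computed once per parameter value
    print(expensive(common).sum() + expensive(unique).sum())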
Source: https://stackoverflow.com/questions/25033631/multiple-processes-sharing-a-single-joblib-cache