Is there a simple way to track the overall progress of a joblib.Parallel execution?
I have a long-running execution composed of thousands of jobs, which I want to tr
Expanding on dano's answer for the newest version of the joblib library. There were a couple of changes to the internal implementation.
from joblib import Parallel, delayed
from collections import defaultdict
# patch joblib progress callback
class BatchCompletionCallBack(object):
completed = defaultdict(int)
def __init__(self, time, index, parallel):
self.index = index
self.parallel = parallel
def __call__(self, index):
BatchCompletionCallBack.completed[self.parallel] += 1
print("done with {}".format(BatchCompletionCallBack.completed[self.parallel]))
if self.parallel._original_iterator is not None:
self.parallel.dispatch_next()
import joblib.parallel
joblib.parallel.BatchCompletionCallBack = BatchCompletionCallBack