Multiprocessing : use tqdm to display a progress bar

Backend · Unresolved · 8 answers · 497 views

情深已故 · 2020-12-04 07:26

To make my code more "Pythonic" and faster, I use multiprocessing and a map function to send it (a) the function and (b) the range of iterations.

The implemented solution (calling tqdm directly on the iteration range) does not work with multiprocessing: the bar does not reflect the actual progress of the map call.

8 Answers
  •  南笙 (OP)
     2020-12-04 07:36

    Here is my take for when you need to get results back from your parallel executing functions. This function does a few things (there is another post of mine that explains it further), but the key point is that there is a tasks-pending queue and a tasks-completed queue. As workers finish each task from the pending queue, they add the result to the completed queue. You can wrap the check on the completed queue with the tqdm progress bar. I am not including the implementation of the do_work() function here, since it is not relevant; the message here is to monitor the tasks-completed queue and update the progress bar every time a result comes in.

    import multiprocessing as mp
    import pickle

    import psutil
    from tqdm import tqdm

    SENTINEL = 'STOP'  # unique marker telling each worker to exit

    def par_proc(job_list, num_cpus=None, verbose=False):
        # Note: do_work(tasks_pending, tasks_completed, verbose) is
        # assumed to be defined elsewhere, as explained above.

        # Get the number of physical cores
        if not num_cpus:
            num_cpus = psutil.cpu_count(logical=False)

        print('* Parallel processing')
        print('* Running on {} cores'.format(num_cpus))

        # Set up the queues for sending tasks to and receiving results
        # from the workers
        tasks_pending = mp.Queue()
        tasks_completed = mp.Queue()

        # Gather processes and results here
        processes = []
        results = []

        # Count tasks while adding them to the pending queue
        num_tasks = 0
        for job in job_list:
            for task in job['tasks']:
                num_tasks += 1
                expanded_job = {
                    'func': pickle.dumps(job['func']),
                    'task': task,
                }
                tasks_pending.put(expanded_job)

        # Set the number of workers here
        num_workers = min(num_cpus, num_tasks)

        # We need as many sentinels as there are worker processes so that
        # ALL processes exit when there is no more work left to be done.
        for c in range(num_workers):
            tasks_pending.put(SENTINEL)

        print('* Number of tasks: {}'.format(num_tasks))

        # Set up and start the workers
        for c in range(num_workers):
            p = mp.Process(target=do_work,
                           args=(tasks_pending, tasks_completed, verbose))
            p.name = 'worker' + str(c)
            processes.append(p)
            p.start()

        # Gather the results, advancing the bar by one per completed task.
        # tqdm's update() takes an increment, not an absolute count, so
        # update(1) is correct here, not update(completed_tasks_counter).
        completed_tasks_counter = 0
        with tqdm(total=num_tasks) as bar:
            while completed_tasks_counter < num_tasks:
                results.append(tasks_completed.get())
                completed_tasks_counter += 1
                bar.update(1)

        for p in processes:
            p.join()

        return results
    
