Chain a celery task's results into a distributed group

做~自己de王妃 提交于 2019-12-04 05:21:34

celery canvas provides chunks to split a task into chunks. Unfortunately, this won't work with primitives like chain, group.

You can use celery signals to prevent issues with dmap/clone.

ch = chain(
    download_sub_page.s(),
    process_sub_page.s(),
    upload_sub_page.s(),
)

@task_success.connect(sender='get_links_from_website')
def task_success_handler(sender=None, headers=None, body=None, **kwargs):
    result = kwargs['result']    
    header = [ch(i) for i in result]
    callback = mark_website_done.si()
    chord(header)(callback)

Create a chain for processing pages and hook the last task to it using a chord. This function gets executed whenever get_links_from_website runs succcessfully.

Depending on the time taken by chain, you can also save results of get_links_from_website somewhere. Then iterate over a batch of them to queue up chains and with the last batch, you can hook a callback to last task.

This is a bit hacky but we're using deepcopy to clone the callback, this fixes the bug with Signature's shallow copy

def dmap(it, callback, final=None):
    # Map a callback over an iterator and return as a group
    callback = subtask(callback)

    run_in_parallel = group(subtask(copy.deepcopy(dict(callback))).clone([arg, ]) for arg in it)

    if len(run_in_parallel.tasks) == 0:
        return []

    if final:
        return chord(run_in_parallel)(final)

    return run_in_parallel.delay()

Note that this will only work for one nesting level (i.e. callback is a chain/group/chord) but will not work for deeply nested callbacks

For deeply nested callback graphs we use this hack which is a bit slower but works flawlessly

# Hack to completely clone a signature with possibly complex subtasks (chains, chords, etc...)
run_in_parallel = group(pickle.loads(pickle.dumps(callback)).clone([arg, ]) for arg in it)

And for the size of the groups you can always split the iterator to chunks

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!