Concurrent asynchronous processes with Python, Flask and Celery

故里飘歌 2020-12-15 00:48

I am working on a small but computationally-intensive Python app. The computationally-intensive work can be broken into several pieces that can be executed concurrently. Will the Flask app be blocked while the processes are executing?

3 Answers
  • 2020-12-15 01:11

    According to the documentation for result.get(), it waits until the result is ready before returning, so normally it is in fact blocking. However, since you have timeout=1, the call to get() will raise a TimeoutError if the task takes longer than 1 second to complete.

    By default, Celery workers start with a concurrency level equal to the number of CPUs available. The concurrency level determines the number of worker processes (or threads, depending on the pool implementation) used to execute tasks. So, with a concurrency level >= 3, it seems like a single Celery worker should be able to process that many tasks concurrently (perhaps someone with greater Celery expertise can verify this?).
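
    A minimal sketch of that behaviour, borrowing the hypothetical a_long_process task from the other answer (note that get() raises celery.exceptions.TimeoutError, not the builtin TimeoutError):

    from celery.exceptions import TimeoutError  # Celery's exception, not the builtin

    # start a worker with e.g.: celery -A proj worker --concurrency=3
    result = a_long_process.delay(x, y)  # returns an AsyncResult immediately
    try:
        value = result.get(timeout=1)    # blocks for at most 1 second
    except TimeoutError:
        value = None                     # task is still running; handle as needed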

  • 2020-12-15 01:14

    You should change your code so the tasks can run in parallel: start all of them first, then fetch the results:

    from celery.exceptions import TimeoutError  # Celery's exception, not the builtin

    @myapp.route('/foo')
    def bar():
        # start all three tasks; delay() returns immediately, so they run in parallel
        task_1 = a_long_process.delay(x, y)
        task_2 = another_long_process.delay(x, y)
        task_3 = yet_another_long_process.delay(x, y)
        # fetch results; each get() blocks until that task finishes or the timeout expires
        try:
            task_1_result = task_1.get(timeout=1)
            task_2_result = task_2.get(timeout=1)
            task_3_result = task_3.get(timeout=1)
        except TimeoutError:
            # Handle this or don't specify a timeout.
            raise
        # combine results
        return task_1_result + task_2_result + task_3_result
    

    This code will block until all results are available (or the timeout is reached).

    Will the Flask app be blocked while the processes are executing?

    This code will only block one worker of your WSGI container. Whether the entire site becomes unresponsive depends on the WSGI container you are using (e.g. Apache + mod_wsgi, uWSGI, gunicorn, etc.). Most WSGI containers spawn multiple workers, so only one worker will be blocked while your code waits for the task results.

    For this kind of application I would recommend using gevent, which spawns a separate greenlet for every request and is very lightweight.
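
    As an illustration, a minimal sketch of serving the app with gevent's WSGI server (the import path for myapp is an assumption; adjust it to your project layout):

    from gevent import monkey
    monkey.patch_all()  # patch blocking stdlib I/O so greenlets can cooperate

    from gevent.pywsgi import WSGIServer
    from myapp import myapp  # assumed: the Flask application object

    # WSGIServer handles each incoming request in its own lightweight greenlet
    WSGIServer(('127.0.0.1', 5000), myapp).serve_forever()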

  • 2020-12-15 01:20

    Use the group primitive from Celery's canvas:

    The group primitive is a signature that takes a list of tasks that should be applied in parallel.

    Here is the example provided in the documentation:

    from celery import group
    from proj.tasks import add

    # build a group signature of two add tasks that will execute in parallel
    g = group(add.s(2, 2), add.s(4, 4))
    res = g()   # dispatches the group and returns a GroupResult
    res.get()   # waits for both tasks and returns their results as a list
    

    Which outputs [4, 8].
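
    Applied to the tasks in the question, a sketch could look like this (reusing the hypothetical task names from the other answer and assuming the results can simply be added together):

    from celery import group

    @myapp.route('/foo')
    def bar():
        # dispatch all three tasks in parallel as a single group
        job = group(
            a_long_process.s(x, y),
            another_long_process.s(x, y),
            yet_another_long_process.s(x, y),
        )
        result = job.apply_async()
        # get() waits for every task and returns the results as a list, in order
        task_1_result, task_2_result, task_3_result = result.get(timeout=10)
        return task_1_result + task_2_result + task_3_result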
