Django long running asynchronous tasks with threads/multiprocessing


os.fork

A fork will clone the parent process, which in this case is your Django stack. Since you merely want to run a separate Python script, this seems like an unnecessary amount of bloat.
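For illustration, a minimal sketch of what forking from a request looks like (run_in_background is a hypothetical helper, not anything Django provides); note that the child begins as a complete copy of the parent, i.e. the whole Django stack, even though it only needs to run one function:

import os

def run_in_background(func, *args):
    # os.fork() clones the entire current process -- the whole Django
    # stack -- just to run one function in the child.
    pid = os.fork()
    if pid == 0:              # child process
        try:
            func(*args)
        finally:
            os._exit(0)       # exit without running parent cleanup code
    return pid                # parent returns to handling the request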

subprocess

subprocess is designed for interactive use. In other words, while you can use it to effectively spawn off a process, you're expected to manage it and eventually wait on it when it finishes. Python might clean up for you if you leave one running, but my guess would be that an abandoned process will actually leak resources.
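A hedged sketch of that pattern (long_task.py is a hypothetical script): Popen returns immediately, but the child is still yours to reap:

import subprocess
import sys

# Spawn a separate Python script; Popen returns without waiting.
proc = subprocess.Popen([sys.executable, 'long_task.py', '--arg', 'value'])

# Later you're expected to reap it; a finished child you never wait on
# lingers (as a zombie on Unix) until the Popen object is collected.
proc.wait()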

threads

Threads are defined units of logic. They start when their start() method is called (which invokes run()), and terminate when run() returns. This makes them well suited to creating a branch of logic that runs outside the current scope. However, as you mentioned, they are subject to the Global Interpreter Lock.
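A minimal sketch of branching off a thread from a Django view (long_task and the 'q' parameter are placeholders):

import threading
from django.http import HttpResponse

def long_task(arg):
    pass  # the long-running work goes here

def my_view(request):
    # start() spawns the thread, which then executes run() -- here the
    # target function -- outside the request/response cycle.
    t = threading.Thread(target=long_task, args=(request.GET.get('q'),))
    t.daemon = True   # don't keep the process alive just for this thread
    t.start()
    return HttpResponse('started')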

multiprocessing

This is basically threads on steroids. It has the benefits of a thread, but is not subject to the Global Interpreter Lock and can take advantage of multi-core architectures. However, processes are more complicated to work with as a result.
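A minimal standalone sketch; each Process runs in its own interpreter with its own GIL:

import os
from multiprocessing import Process

def long_task(n):
    # Runs in a separate interpreter process: its own GIL, its own core.
    print('working in pid %s on %s' % (os.getpid(), n))

if __name__ == '__main__':         # required where processes are spawned, not forked
    p = Process(target=long_task, args=(42,))
    p.start()                      # returns immediately
    p.join()                       # reap the child when it finishes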

So, your choices really come down to threads or multiprocessing. If you can get by with a thread and it makes sense for your application, go with a thread. Otherwise, use multiprocessing.

I have found that using uWSGI Decorators is quite a bit simpler than using Celery if you just need to run some long task in the background. I think Celery is the best solution for a serious, heavy project, but it's overkill for doing something simple.

To start using uWSGI Decorators you just need to update your uWSGI config with:

<spooler-processes>1</spooler-processes>
<spooler>/here/the/path/to/dir</spooler>
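(That is the XML config style; if you use an ini-style uWSGI config instead, the equivalent options, to my understanding, would be:)

[uwsgi]
spooler-processes = 1
spooler = /here/the/path/to/dir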

and write code like:

import uwsgi
from uwsgidecorators import spoolraw
from django.shortcuts import render_to_response

@spoolraw
def long_task(arguments):
    try:
        # ...do something with arguments['myarg']...
        pass
    except Exception:
        # ...log or handle the failure...
        pass
    return uwsgi.SPOOL_OK

def myView(request):
    long_task.spool({'myarg': str(someVar)})  # someVar: whatever string you want to pass
    return render_to_response('done.html')

Then when you hit the view, this appears in the uWSGI log:

[spooler] written 208 bytes to file /here/the/path/to/dir/uwsgi_spoolfile_on_hostname_31139_2_0_1359694428_441414

and when the task finishes:

[spooler /here/the/path/to/dir pid: 31138] done with task uwsgi_spoolfile_on_hostname_31139_2_0_1359694428_441414 after 78 seconds

There are some restrictions that seem strange to me:

    - spool can receive only a dictionary of strings as its argument, apparently because the arguments are serialized to a file as strings (see the sketch after this list for a workaround).
    - the spooler must be set up at startup, so the "spooled" code has to live in a separate file, which is registered in the uWSGI config as <import>pyFileWithSpooledCode</import>.
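A sketch of working around the strings-only restriction by packing everything into a single json string value (note that on Python 3 the spooler may hand you bytes keys/values, so decode first if needed):

import json
import uwsgi
from uwsgidecorators import spoolraw

@spoolraw
def long_task(arguments):
    payload = json.loads(arguments['payload'])  # back to a real structure
    # ...do something with payload['user_id'], payload['items']...
    return uwsgi.SPOOL_OK

# caller side: flatten everything into one string value
long_task.spool({'payload': json.dumps({'user_id': 42, 'items': [1, 2, 3]})})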

For the question:

Will the spawned long thread block wsgi from doing something else for another request?!

the answer is no.

You still have to be careful creating background threads from a request, though, in case you create huge numbers of them and clog up the whole process. You really need a task queueing system even if you are doing the work in-process.
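A minimal in-process version of that idea: a single worker thread draining a queue.Queue, so each request only enqueues work instead of spawning a fresh thread (a sketch, not a replacement for a real queue like Celery; some_long_task is a placeholder):

import queue
import threading
from django.http import HttpResponse

task_queue = queue.Queue(maxsize=100)  # bounded, so requests can't pile up work forever

def worker():
    while True:
        func, args = task_queue.get()
        try:
            func(*args)
        finally:
            task_queue.task_done()

# one worker thread for the whole process, started once at import time
threading.Thread(target=worker, daemon=True).start()

def my_view(request):
    # enqueue and return immediately; the worker runs the task later
    task_queue.put((some_long_task, (request.GET.get('q'),)))
    return HttpResponse('queued')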

As for doing a fork or exec from the web process: especially under Apache, that is generally not a good idea, as Apache may impose odd conditions on the environment of the subprocess it creates, which could interfere with its operation.

Using a system like Celery is still probably the best solution.
