This is not working anymore; Scrapy's API has changed.
The documentation now features a way to "Run Scrapy from a script", but when I use it I get a ReactorNotRestartable error.
The Twisted reactor cannot be restarted, so once a spider finishes running and the crawler implicitly stops the reactor, that worker process can no longer run another crawl.
As posted in the answers to that other question, all you need to do is kill the worker which ran your spider and replace it with a fresh one, which prevents the reactor from being started and stopped more than once. To do this, just set:
CELERYD_MAX_TASKS_PER_CHILD = 1
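To illustrate what that setting does, here is a minimal sketch that uses only the standard library rather than Celery or Scrapy. It relies on the `maxtasksperchild` option of `multiprocessing.Pool`, which recycles worker processes in a similar way. The names `fake_crawl` and `crawl_all` are hypothetical stand-ins, not part of any library; a real task would start the crawl where `fake_crawl` just reports its process ID.

```python
import multiprocessing
import os


def fake_crawl(spider_name):
    """Hypothetical stand-in for a real crawl.

    A real task would start the Twisted reactor here, which is only
    allowed once per process.
    """
    return spider_name, os.getpid()


def crawl_all(spider_names):
    # "fork" mirrors a prefork-style worker pool and assumes a Unix-like OS.
    ctx = multiprocessing.get_context("fork")
    # maxtasksperchild=1 retires each worker process after a single task,
    # so every crawl runs in a fresh process.
    with ctx.Pool(processes=1, maxtasksperchild=1) as pool:
        # chunksize=1 makes each spider name its own task.
        return pool.map(fake_crawl, spider_names, chunksize=1)


if __name__ == "__main__":
    for name, pid in crawl_all(["spider_a", "spider_b", "spider_c"]):
        print(f"{name} ran in worker process {pid}")
```

Each spider name should be reported with a different process ID, because the pool replaces the worker after every task. That is the same trade-off as with the Celery setting: no process ever starts a reactor twice, but you pay for a new process per crawl.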
The downside is that you're not really using the Twisted reactor to its full potential, and you waste resources running multiple reactors when a single reactor can run multiple spiders at once in one process. A better approach is to run one reactor per worker (or even one reactor globally) and not let the crawler start or stop it.
I'm working on this for a very similar project, so I'll update this post if I make any progress.