Is Scrapy single-threaded or multi-threaded?

Submitted by 邮差的信 on 2019-12-03 23:22:17

Scrapy is single-threaded, except for the interactive shell and some tests (see source).

It's built on top of Twisted, which is single-threaded too, and makes use of its own asynchronous concurrency capabilities, such as twisted.internet.interfaces.IReactorThreads.callFromThread (see source).

Scrapy does most of its work synchronously. However, the handling of requests is done asynchronously.
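To make the "single thread, asynchronous requests" idea concrete, here is a minimal sketch. It is not Scrapy or Twisted code: the standard library's asyncio stands in for Twisted's reactor, and fake_request, crawl and the page names are made up for illustration. The waits overlap, yet every coroutine reports the same thread.

```python
import asyncio
import threading
import time


async def fake_request(name, delay, thread_ids):
    # Record which OS thread is running this coroutine.
    thread_ids.add(threading.get_ident())
    # Stand-in for waiting on the network; control returns to the event loop here.
    await asyncio.sleep(delay)
    return name


async def crawl():
    thread_ids = set()
    start = time.monotonic()
    # Start five hypothetical requests at once; their waits overlap.
    results = await asyncio.gather(
        *(fake_request(f"page-{i}", 0.2, thread_ids) for i in range(5))
    )
    elapsed = time.monotonic() - start
    return results, thread_ids, elapsed


results, thread_ids, elapsed = asyncio.run(crawl())
print(results)
print("threads used:", len(thread_ids))
print("elapsed seconds:", round(elapsed, 2))
```

If the requests ran one after another, the total wait would be about five times the single delay; with an event loop it should be close to one delay, still on one thread.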

I suggest this page if you haven't already seen it.

http://doc.scrapy.org/en/latest/topics/architecture.html

edit: I realize now the question was about threading and not necessarily whether it's asynchronous or not. That link would still be a good read though :)

Regarding your question about CONCURRENT_REQUESTS: this setting caps the number of requests that Scrapy's downloader will have in progress at once. Once that many requests have been started, it waits for some of them to finish before starting more.
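The "start up to N, then wait for some to finish" behaviour can be sketched with a semaphore. Again, this is only an illustration under assumptions: it uses asyncio rather than Scrapy's actual downloader, and CONCURRENT_LIMIT, fetch and run are hypothetical names, not Scrapy APIs.

```python
import asyncio

CONCURRENT_LIMIT = 3  # hypothetical stand-in for the CONCURRENT_REQUESTS setting


async def fetch(i, slots, stats):
    # Wait here until one of the limited slots is free.
    async with slots:
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        await asyncio.sleep(0.05)  # stand-in for the download itself
        stats["in_flight"] -= 1
    return i


async def run(total):
    slots = asyncio.Semaphore(CONCURRENT_LIMIT)
    stats = {"in_flight": 0, "peak": 0}
    done = await asyncio.gather(*(fetch(i, slots, stats) for i in range(total)))
    return done, stats


done, stats = asyncio.run(run(10))
print("finished:", len(done), "peak in flight:", stats["peak"])
```

All ten fake requests complete, but the number in flight at any moment never goes above the limit.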

Scrapy is a single-threaded framework, so we cannot use multiple threads within a spider at the same time. However, we can run multiple spiders and pipelines at the same time to make the process concurrent. Scrapy does not support multi-threading because it is built on Twisted, which is an asynchronous, event-driven networking framework.
