How do I stop all spiders and the engine immediately after a condition in a pipeline is met?
We have a system written with Scrapy to crawl a few websites. There are several spiders and a few cascaded pipelines that process all items passed by all crawlers. One of the pipeline components queries the Google servers to geocode addresses. Google imposes a limit of 2500 requests per day per IP address and threatens to ban an IP address that keeps querying after Google has responded with the warning message 'OVER_QUERY_LIMIT'. Hence I want to know about any mechanism that I can invoke from within the pipeline that will completely and immediately stop all further crawling.
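
For illustration, here is a minimal sketch of one possible approach, not a confirmed solution. It assumes the pipeline is built through `from_crawler` so that it can reach the running engine, and it asks the engine to close the spider via `crawler.engine.close_spider(spider, reason)`. The class name `GeocodeLimitPipeline`, the helper `lookup_address`, the item keys `address` and `location`, and the reason string are all hypothetical names made up for this example. Closing a spider this way is graceful rather than instant, so requests already in flight may still finish. The sketch therefore also sets a class-level flag so that no further geocoding call is made once the limit is hit.

```python
def lookup_address(address):
    """Hypothetical geocoder wrapper, assumed to return (status, location)."""
    raise NotImplementedError("plug the real geocoding call in here")


class GeocodeLimitPipeline:
    """Stops geocoding and closes the spider after OVER_QUERY_LIMIT."""

    # Class-level flag: shared by every instance in this process, so spiders
    # running side by side in one process all see it.
    limit_reached = False

    def __init__(self, crawler, geocode=lookup_address):
        self.crawler = crawler
        self.geocode = geocode

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this hook (when defined) to build the pipeline.
        return cls(crawler)

    def process_item(self, item, spider):
        cls = type(self)
        if not cls.limit_reached:
            status, location = self.geocode(item["address"])
            if status == "OVER_QUERY_LIMIT":
                # Remember the warning so Google is never queried again.
                cls.limit_reached = True
            else:
                item["location"] = location
        if cls.limit_reached:
            # Ask the engine to shut this spider down (graceful, not instant).
            self.crawler.engine.close_spider(spider, reason="over_query_limit")
        return item
```

If every spider runs in the same process, each spider's pipeline instance sees the shared flag and closes its own spider on the next item it receives. Spiders started as separate processes would need some external signal instead (a file or a database row, for example), which this sketch does not cover.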