Force a spider to stop in Scrapy

花落未央 · 2020-12-17 06:56

I have 20 spiders in one project; each spider has a different task and URL to crawl (but the data are similar, and I'm using a shared items.py and pipelines.py). How can I force a spider to stop when a certain condition is met?

3 Answers
  • 2020-12-17 06:57

    Why not just use this?

    import sys

    # with some condition
    sys.exit("Closing the spider")
    
  • 2020-12-17 07:03

    OK, then you can use the CloseSpider exception:

    from scrapy.exceptions import CloseSpider

    # when your stop condition is met
    raise CloseSpider("message")
    
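    For context, here is a minimal runnable sketch of this approach. The spider name, URL, and item-count threshold are illustrative assumptions, not part of the original answer:

    import scrapy
    from scrapy.exceptions import CloseSpider

    class QuotesSpider(scrapy.Spider):
        name = "quotes"  # hypothetical spider, for illustration only
        start_urls = ["https://quotes.toscrape.com/"]
        items_scraped = 0
        max_items = 100  # assumed stop condition

        def parse(self, response):
            for quote in response.css("div.quote"):
                self.items_scraped += 1
                if self.items_scraped > self.max_items:
                    # Scrapy catches CloseSpider, stops scheduling new
                    # requests, and shuts the spider down cleanly
                    raise CloseSpider("item limit reached")
                yield {"text": quote.css("span.text::text").get()}

    Note that requests already in flight when the exception is raised may still be processed before the spider actually closes.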
  • 2020-12-17 07:04

    If you want to stop a spider from a pipeline, you can call the close_spider() method of the engine:

    class MongoDBPipeline(object):

        def process_item(self, item, spider):
            # when your stop condition is met; the first argument must be
            # the spider instance, not the pipeline itself
            spider.crawler.engine.close_spider(spider, reason='finished')
            return item
    
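    As a usage sketch, such a pipeline is enabled via ITEM_PIPELINES in the project's settings.py; the module path below is a hypothetical example:

    # settings.py
    ITEM_PIPELINES = {
        "myproject.pipelines.MongoDBPipeline": 300,
    }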