Running Multiple Scrapy Spiders (the easy way) [Python]

清歌不尽 2020-12-28 09:11

Scrapy is pretty cool; however, I found the documentation to be very bare-bones, and some simple questions were tough to answer. After putting together various techniques from v…

3 Answers
  •  旧时难觅i
    2020-12-28 09:55

    Your method runs the spiders one after another (procedurally), which makes it slow and goes against Scrapy's core principle of asynchronous crawling. To keep everything asynchronous, you can use CrawlerProcess:

    from scrapy.utils.project import get_project_settings
    from scrapy.crawler import CrawlerProcess
    
    from myproject.spiders import spider1, spider2
    
    process = CrawlerProcess(get_project_settings())
    # Pass the spider classes; crawl() schedules each one on the same reactor
    process.crawl(spider1.Spider1)
    process.crawl(spider2.Spider2)
    # start() blocks until every scheduled crawl has finished
    process.start()
    
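    Note that crawl() also forwards extra positional and keyword arguments to the spider's constructor, which is handy when both spiders share a class but need different parameters. A minimal sketch (the category argument is a made-up example, not something from the code above):

    # extra keyword arguments are passed through to Spider1.__init__
    process.crawl(spider1.Spider1, category="books")  # "category" is hypothetical
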

    If you want to see the full log of the crawl, set LOG_FILE in your settings.py.

    LOG_FILE = "logs/mylog.log"
    
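    If you need more control over the Twisted reactor (for example, when embedding Scrapy inside an existing application), CrawlerRunner is the documented alternative to CrawlerProcess. A minimal sketch, reusing the hypothetical spider1/spider2 modules from above:

    from twisted.internet import reactor
    from scrapy.crawler import CrawlerRunner
    from scrapy.utils.log import configure_logging
    from scrapy.utils.project import get_project_settings
    
    configure_logging()  # unlike CrawlerProcess, CrawlerRunner does not set up logging for you
    runner = CrawlerRunner(get_project_settings())
    runner.crawl(spider1.Spider1)
    runner.crawl(spider2.Spider2)
    d = runner.join()  # Deferred that fires once all crawls have finished
    d.addBoth(lambda _: reactor.stop())
    reactor.run()  # blocks until reactor.stop() is called
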
