Question
I am using Scrapy to crawl websites and extract data to a JSON file, but I've found that for some sites the crawler takes a very long time to crawl the complete website.
My question is: How can I minimize the time taken to crawl?
Answer 1:
Try tuning CONCURRENT_ITEMS, CONCURRENT_REQUESTS, CONCURRENT_REQUESTS_PER_DOMAIN and the other settings that limit how much work Scrapy does in parallel.
For the full list of settings, see http://doc.scrapy.org/en/latest/topics/settings.html
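As a rough sketch, the three settings named above could be raised in the project's settings.py along these lines. The numbers are illustrative assumptions, not recommendations: the right values depend on your bandwidth, CPU, and how much load the target site tolerates, so measure before and after each change.

```python
# settings.py (sketch): illustrative values only, tune them for your own crawl.

# Upper bound on requests the downloader performs in parallel, across all sites.
CONCURRENT_REQUESTS = 32

# Upper bound on parallel requests to any single domain.
# Raising this helps most when the crawl targets one site.
CONCURRENT_REQUESTS_PER_DOMAIN = 16

# Upper bound on items processed in parallel in the item pipelines, per response.
CONCURRENT_ITEMS = 200
```

Raising the per-domain limit too far can overload the target server or get the crawler blocked, so it is usually safer to increase it in small steps.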
Source: https://stackoverflow.com/questions/19109871/how-to-increase-scrapy-crawling-speed