Scrapy is slow (60 pages/min)

Submitted by 时间秒杀一切 on 2019-12-11 05:57:19

Question


My crawler seems to be running really slowly, and I'm not sure why. I'll try to explain how it works.

Keep in mind that I use inline requests.

First, I have 31 different starting URLs. Each URL is a category on Amazon. Settings:

USER_AGENT = "Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:2.2) Gecko/20110201" 

ROBOTSTXT_OBEY = False

CONCURRENT_REQUESTS = 2048

DOWNLOAD_DELAY = 1

CONCURRENT_REQUESTS_PER_DOMAIN = 2048
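
As a rough cross-check on these settings, the sketch below works out the per-domain request ceiling implied by DOWNLOAD_DELAY. It assumes the delay is applied between consecutive requests to the same domain, as Scrapy's documentation describes, so the two concurrency limits cannot raise throughput above that ceiling. The function name is hypothetical, and the figure is an average, since Scrapy randomizes the delay by default.

```python
def max_requests_per_minute(download_delay: float) -> float:
    """Average upper bound on requests per minute to ONE domain.

    Assumption: consecutive requests to the same domain are spaced
    `download_delay` seconds apart, regardless of concurrency settings.
    """
    if download_delay <= 0:
        raise ValueError("a non-positive delay imposes no ceiling from this setting")
    return 60.0 / download_delay


# DOWNLOAD_DELAY = 1, as in the settings above
print(max_requests_per_minute(1))
```

Under that assumption, a delay of 1 second caps each domain at about 60 requests per minute, which is in line with the rate in the title.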

For each URL, I loop over all the items on that page (16 items).

For each item, I send a request to the BookScouter sell API and check the sell price.

After that, I send a request to the BookScouter buy API and check the buy price (it's a different link, so there are 2 separate requests: one for buy, one for sell).

After that, I yield the ISBN, buy price and sell price.

Then I check whether the next-page URL is a string; if so, I crawl the next page.

Am I doing something wrong, or is that the speed to be expected?
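
To make the request pattern described in the question concrete, here is a small counting sketch. It uses only the numbers given in the post (16 items per page, one sell and one buy lookup per item). The function names are hypothetical, and it assumes that both price lookups go to the same domain and are therefore spaced by the same delay.

```python
ITEMS_PER_PAGE = 16        # items on each category page (from the post)
API_REQUESTS_PER_ITEM = 2  # one sell lookup plus one buy lookup (from the post)


def requests_per_listing_page(items: int = ITEMS_PER_PAGE,
                              api_per_item: int = API_REQUESTS_PER_ITEM) -> dict:
    """Count the requests needed to fully process one category page."""
    return {"listing": 1, "price_api": items * api_per_item}


def seconds_for_price_lookups(delay: float,
                              items: int = ITEMS_PER_PAGE,
                              api_per_item: int = API_REQUESTS_PER_ITEM) -> float:
    """Minimum time spent on price lookups for one page.

    Assumption: every lookup hits the same domain and consecutive
    requests to that domain are `delay` seconds apart.
    """
    return items * api_per_item * delay


print(requests_per_listing_page())
print(seconds_for_price_lookups(1.0))
```

With a 1-second delay, the price lookups for a single category page alone would take roughly half a minute under these assumptions, so most of the crawl time goes to the price API rather than to the category pages themselves.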

Source: https://stackoverflow.com/questions/46568502/scrapy-is-slow-60-pages-min
