Throttling requests with multiple proxies

温柔的废话 2020-12-19 22:37

I'm currently assigning random proxies to requests via a custom middleware. I'd like to key download throttling to the specific proxy that the request is using, but as far

1 answer
  •  失恋的感觉
    2020-12-19 22:47

    As recommended on the Scrapy mailing list, there is a special request meta key, download_slot, that the AutoThrottle extension obeys. Requests that share the same download_slot value are grouped and throttled together, so the key can be set programmatically.

    In my custom proxy middleware:

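    import random

    # e.g. inside the middleware's process_request(self, request, spider)
    self.proxies = get_proxies()  # list of proxies (could be loaded once in __init__)
    proxy_address = random.choice(self.proxies)  # pick a proxy at random
    request.meta['proxy'] = proxy_address
    # same proxy -> same slot, so throttling is keyed to the proxy's bucket
    request.meta['download_slot'] = hash(proxy_address) % MAX_CONCURRENT_REQUESTS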
    

    I use the hash function as a cheap way to bucket the requests: within a run, a given proxy always maps to the same slot, and the modulo caps the number of slots at an externally defined limit on requests.
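For reference, the snippet above could be fleshed out into a complete downloader middleware along the following lines. This is only a sketch under stated assumptions: the class name, the `slot_for` helper, the `MAX_SLOTS` constant (standing in for MAX_CONCURRENT_REQUESTS) and the example proxy addresses are hypothetical, and `zlib.crc32` is swapped in for the built-in `hash()` because string hashes are randomized per Python process by default, which would change the bucket labels from one run to the next.

```python
import random
import zlib

# Hypothetical cap on the number of distinct slots; stands in for the
# answer's MAX_CONCURRENT_REQUESTS.
MAX_SLOTS = 8


def slot_for(proxy_address, max_slots=MAX_SLOTS):
    """Map a proxy address to a bucket label that is stable across runs."""
    # crc32 is deterministic, unlike the built-in hash() on str.
    bucket = zlib.crc32(proxy_address.encode("utf-8")) % max_slots
    return "proxy-%d" % bucket


class RandomProxySlotMiddleware:
    """Sketch: assign a random proxy and a matching download_slot."""

    def __init__(self, proxies):
        # Load the proxy list once instead of on every request.
        self.proxies = list(proxies)

    def process_request(self, request, spider):
        proxy_address = random.choice(self.proxies)
        request.meta["proxy"] = proxy_address
        # Requests sharing a slot are throttled together.
        request.meta["download_slot"] = slot_for(proxy_address)
        return None  # let Scrapy continue handling the request
```

In a Scrapy project a class like this would typically be enabled through the DOWNLOADER_MIDDLEWARES setting, with the proxy list supplied however the project loads it. If every proxy should get its own slot rather than sharing a bucket, the proxy address itself could be used as the download_slot value.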
