Question
I am new to Scrapy. I found out how to use an HTTP proxy, but I want to use HTTP and HTTPS proxies together, because the links I crawl include both http and https URLs. How do I use an HTTP and an HTTPS proxy at the same time?
import base64

class ProxyMiddleware(object):
    def process_request(self, request, spider):
        request.meta['proxy'] = "http://YOUR_PROXY_IP:PORT"
        # like here: request.meta['proxy'] = "https://YOUR_PROXY_IP:PORT"

        # set up basic authentication for the proxy
        proxy_user_pass = "USERNAME:PASSWORD"
        encoded_user_pass = base64.b64encode(proxy_user_pass.encode()).decode()
        request.headers['Proxy-Authorization'] = 'Basic ' + encoded_user_pass
Answer 1:
You can use the standard environment variables in combination with the HttpProxyMiddleware:
This middleware sets the HTTP proxy to use for requests, by setting the proxy meta value for Request objects.
Like the Python standard library modules urllib and urllib2, it obeys the following environment variables:
http_proxy, https_proxy, no_proxy
You can also set the meta key proxy per-request, to a value like http://some_proxy_server:port.
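For example, here is a minimal sketch that assumes a single placeholder proxy endpoint and a hypothetical spider (neither is from the original post): the environment variables cover http and https requests by default, while the proxy meta key overrides them for an individual request. Exporting the variables in the shell before running scrapy crawl works the same way.

import os
import scrapy

# Environment variables read by HttpProxyMiddleware; the proxy address is a
# placeholder. Both variables can point at the same proxy or at different ones.
os.environ.setdefault('http_proxy', 'http://YOUR_PROXY_IP:PORT')
os.environ.setdefault('https_proxy', 'http://YOUR_PROXY_IP:PORT')

class ExampleSpider(scrapy.Spider):
    name = 'example'

    def start_requests(self):
        # These requests go through the proxies taken from the environment.
        yield scrapy.Request('http://example.com', callback=self.parse)
        yield scrapy.Request('https://example.com', callback=self.parse)

        # The proxy meta key overrides the environment for this request only.
        yield scrapy.Request(
            'https://example.com/other',
            meta={'proxy': 'http://YOUR_PROXY_IP:PORT'},
            callback=self.parse,
        )

    def parse(self, response):
        self.logger.info('fetched %s', response.url)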
Source: https://stackoverflow.com/questions/31313760/how-to-use-http-and-https-proxy-together-in-scrapy