How to use http and https proxy together in scrapy?

淺唱寂寞╮ submitted on 2021-02-08 07:56:37

Question


I am new to Scrapy. I found the code below for using an HTTP proxy, but I want to use HTTP and HTTPS proxies together, because the links I crawl include both http and https URLs. How can I use both an HTTP and an HTTPS proxy?

import base64

class ProxyMiddleware(object):
    def process_request(self, request, spider):
        request.meta['proxy'] = "http://YOUR_PROXY_IP:PORT"
        # like here: request.meta['proxy'] = "https://YOUR_PROXY_IP:PORT"
        proxy_user_pass = "USERNAME:PASSWORD"
        # set up basic authentication for the proxy
        # (base64.encodestring was removed in Python 3.9; b64encode takes bytes and adds no trailing newline)
        encoded_user_pass = base64.b64encode(proxy_user_pass.encode()).decode()
        request.headers['Proxy-Authorization'] = 'Basic ' + encoded_user_pass

Answer 1:


You could use the standard environment variables in combination with HttpProxyMiddleware. From the Scrapy documentation:

This middleware sets the HTTP proxy to use for requests, by setting the proxy meta value for Request objects.

Like the Python standard library modules urllib and urllib2, it obeys the following environment variables:

http_proxy
https_proxy
no_proxy
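
As a rough sketch of what that looks like in practice, the variables can be set from Python before the crawl starts, and the standard library's getproxies() shows the mapping that would be picked up from the environment. The proxy hosts and ports here are made-up placeholders, and the assumption is that the variables are set before Scrapy builds its middlewares (exporting them in the shell before running the spider has the same effect).

```python
import os
from urllib.request import getproxies

# Hypothetical proxy addresses; replace with your own.
os.environ["http_proxy"] = "http://proxy.example.com:8080"
os.environ["https_proxy"] = "http://proxy.example.com:8443"
# Hosts that should bypass the proxy entirely.
os.environ["no_proxy"] = "localhost,127.0.0.1"

# getproxies() returns a dict keyed by URL scheme, built from the *_proxy variables.
proxies = getproxies()
print(proxies.get("http"))
print(proxies.get("https"))
```

With both http_proxy and https_proxy set, requests to http and https links should each be routed through the proxy configured for their scheme.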

You can also set the meta key proxy per-request, to a value like http://some_proxy_server:port.
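
If you would rather keep a custom middleware like the one in the question, one possible approach is to pick the proxy from the scheme of each request's URL. This is only a sketch: SchemeProxyMiddleware and PROXY_BY_SCHEME are names invented for illustration, and the proxy addresses are placeholders.

```python
from urllib.parse import urlparse

# Hypothetical mapping from URL scheme to proxy address; fill in your own values.
PROXY_BY_SCHEME = {
    "http": "http://YOUR_HTTP_PROXY_IP:PORT",
    "https": "http://YOUR_HTTPS_PROXY_IP:PORT",
}


class SchemeProxyMiddleware:
    """Downloader middleware sketch: choose a proxy per request by URL scheme."""

    def process_request(self, request, spider):
        scheme = urlparse(request.url).scheme
        proxy = PROXY_BY_SCHEME.get(scheme)
        if proxy is not None:
            # The 'proxy' meta key is what Scrapy reads for this request.
            request.meta["proxy"] = proxy
        # Returning None lets Scrapy continue processing the request normally.
        return None
```

The class would still need to be enabled in DOWNLOADER_MIDDLEWARES in your project settings, presumably with an order number that places it before the built-in HttpProxyMiddleware.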



Source: https://stackoverflow.com/questions/31313760/how-to-use-http-and-https-proxy-together-in-scrapy
