Proxy with urllib2

拜拜、爱过 提交于 2019-12-16 20:17:52

问题


I open urls with:

site = urllib2.urlopen('http://google.com')

And what I want to do is connect the same way with a proxy I got somewhere telling me:

site = urllib2.urlopen('http://google.com', proxies={'http':'127.0.0.1'})

but that didn't work either.

I know urllib2 has something like a proxy handler, but I can't recall that function.


回答1:


proxy = urllib2.ProxyHandler({'http': '127.0.0.1'})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
urllib2.urlopen('http://www.google.com')



回答2:


You have to install a ProxyHandler

urllib2.install_opener(
    urllib2.build_opener(
        urllib2.ProxyHandler({'http': '127.0.0.1'})
    )
)
urllib2.urlopen('http://www.google.com')



回答3:


You can set proxies using environment variables.

import os
os.environ['http_proxy'] = '127.0.0.1'
os.environ['https_proxy'] = '127.0.0.1'

urllib2 will add proxy handlers automatically this way. You need to set proxies for different protocols separately otherwise they will fail (in terms of not going through proxy), see below.

For example:

proxy = urllib2.ProxyHandler({'http': '127.0.0.1'})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
urllib2.urlopen('http://www.google.com')
# next line will fail (will not go through the proxy) (https)
urllib2.urlopen('https://www.google.com')

Instead

proxy = urllib2.ProxyHandler({
    'http': '127.0.0.1',
    'https': '127.0.0.1'
})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
# this way both http and https requests go through the proxy
urllib2.urlopen('http://www.google.com')
urllib2.urlopen('https://www.google.com')



回答4:


To use the default system proxies (e.g. from the http_support environment variable), the following works for the current request (without installing it into urllib2 globally):

url = 'http://www.example.com/'
proxy = urllib2.ProxyHandler()
opener = urllib2.build_opener(proxy)
in_ = opener.open(url)
in_.read()



回答5:


In Addition to the accepted answer: My scipt gave me an error

File "c:\Python23\lib\urllib2.py", line 580, in proxy_open
    if '@' in host:
TypeError: iterable argument required

Solution was to add http:// in front of the proxy string:

proxy = urllib2.ProxyHandler({'http': 'http://proxy.xy.z:8080'})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
urllib2.urlopen('http://www.google.com')



回答6:


One can also use requests if we would like to access a web page using proxies. Python 3 code:

>>> import requests
>>> url = 'http://www.google.com'
>>> proxy = '169.50.87.252:80'
>>> requests.get(url, proxies={"http":proxy})
<Response [200]>

More than one proxies can also be added.

>>> proxy1 = '169.50.87.252:80'
>>> proxy2 = '89.34.97.132:8080'
>>> requests.get(url, proxies={"http":proxy1,"http":proxy2})
<Response [200]>



回答7:


In addition set the proxy for the command line session Open a command line where you might want to run your script

netsh winhttp set proxy YourProxySERVER:yourProxyPORT

run your script in that terminal.



来源:https://stackoverflow.com/questions/1450132/proxy-with-urllib2

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!