Using urllib2 via proxy

Submitted by 冷眼眸甩不掉的悲伤 on 2019-12-10 10:29:28

Question


I am trying to use urllib2 through a proxy; however, after trying just about every way of passing my credentials with urllib2, I either get a request that hangs forever and returns nothing, or I get a 407 error. I can browse the web fine: my browser connects through a proxy PAC file and redirects accordingly. From the command line, though, nothing works: curl, wget, urllib2, etc. all fail, even when I point them directly at the proxies the PAC file redirects to. I tried setting my proxy in urllib2 to each of the proxies from the PAC file, and none of them work.

My current script looks like this:

import urllib2 as url

proxy = url.ProxyHandler({'http': 'username:password@my.proxy:8080'})
auth = url.HTTPBasicAuthHandler()
opener = url.build_opener(proxy, auth, url.HTTPHandler)
url.install_opener(opener)
url.urlopen("http://www.google.com/")

which throws "HTTP Error 407: Proxy Authentication Required". I also tried:

import urllib2 as url

handlePass = url.HTTPPasswordMgrWithDefaultRealm()
handlePass.add_password(None, "http://my.proxy:8080", "username", "password")
auth_handler = url.HTTPBasicAuthHandler(handlePass)
opener = url.build_opener(auth_handler)
url.install_opener(opener)
url.urlopen("http://www.google.com")

which hangs until it times out, just like curl and wget.

What do I need to do to diagnose the problem? How is it possible that I can connect via my browser but not from the command line on the same computer using what would appear to be the same proxy and credentials?

Might it have something to do with the router? If so, how could it distinguish browser HTTP requests from command-line HTTP requests?
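For reference, one way to see what urllib2 actually sends on the wire (and whether a Proxy-Authorization header is present at all) is to enable handler-level debugging. A minimal sketch; the proxy host, port, and credentials are placeholders:

```python
try:
    import urllib2 as url          # Python 2
except ImportError:
    import urllib.request as url   # Python 3: urllib2's successor

# Credentials embedded in the proxy URL, scheme included
# ("my.proxy:8080" and the credentials are placeholders).
proxy = url.ProxyHandler({'http': 'http://username:password@my.proxy:8080'})

# debuglevel=1 makes the handler print each request and response line,
# so you can see whether Proxy-Authorization is actually being sent.
opener = url.build_opener(proxy, url.HTTPHandler(debuglevel=1))
url.install_opener(opener)
# url.urlopen("http://www.google.com/")  # uncomment to test against your proxy
```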


Answer 1:


Frustrations like this are what drove me to use Requests. If you're doing significant amounts of work with urllib2, you really ought to check it out. For example, to do what you wish to do using Requests, you could write:

import requests
from requests.auth import HTTPProxyAuth

proxy = {'http': 'http://my.proxy:8080'}
auth = HTTPProxyAuth('username', 'password')
r = requests.get('http://www.google.com/', proxies=proxy, auth=auth)
print(r.text)

Or you can set the proxy and auth on a Session object, and every request made through the session will use them automatically (in current Requests, Session() takes no constructor arguments, so set them as attributes). Plus, the session will store and handle cookies for you:

s = requests.Session()
s.proxies = proxy
s.auth = auth
r = s.get('http://www.google.com/')
print(r.text)
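If you do stay with urllib2, one thing worth checking: special characters in the username or password must be percent-encoded before being embedded in the proxy URL, since a raw '@' or ':' inside the credentials will be misparsed and can produce exactly this kind of 407. A sketch with made-up credentials:

```python
try:
    from urllib import quote           # Python 2
except ImportError:
    from urllib.parse import quote     # Python 3

# Made-up credentials for illustration; note the '\', '@', and ':'
# that would break the URL if left unescaped.
username = "corp\\jsmith"
password = "p@ss:w0rd"

# safe="" forces every reserved character to be encoded.
proxy_url = "http://%s:%s@my.proxy:8080" % (
    quote(username, safe=""), quote(password, safe=""))
print(proxy_url)  # -> http://corp%5Cjsmith:p%40ss%3Aw0rd@my.proxy:8080
```

(Requests sidesteps this because HTTPProxyAuth takes the raw username and password as separate arguments.)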


Source: https://stackoverflow.com/questions/14928385/using-urllib2-via-proxy
