Question
What I am trying to do is read a line (an IP address) from a file, open a website using that address as a proxy, and then repeat for every address in the file. Instead, I get an error. I am new to Python, so maybe it's a simple mistake. Thanks in advance!
CODE:
>>> import urllib2
>>> f = open("proxy.txt", "r")  # file containing a list of IP addresses
>>> address = f.readline().strip()  # remove the \n at the end of the line
>>>
>>> while address:  # stop at the first empty line or at end of file
        proxy = urllib2.ProxyHandler({'http': address})
        opener = urllib2.build_opener(proxy)
        urllib2.install_opener(opener)
        urllib2.urlopen('http://www.google.com')
        address = f.readline().strip()
ERROR:
Traceback (most recent call last):
  File "<pyshell#15>", line 5, in <module>
    urllib2.urlopen('http://www.google.com')
  File "D:\Programming\Python\lib\urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "D:\Programming\Python\lib\urllib2.py", line 394, in open
    response = self._open(req, data)
  File "D:\Programming\Python\lib\urllib2.py", line 412, in _open
    '_open', req)
  File "D:\Programming\Python\lib\urllib2.py", line 372, in _call_chain
    result = func(*args)
  File "D:\Programming\Python\lib\urllib2.py", line 1199, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "D:\Programming\Python\lib\urllib2.py", line 1174, in do_open
    raise URLError(err)
URLError: <urlopen error [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>
Answer 1:
It means that the proxy is unavailable: the connection attempt to it timed out (Errno 10060 is the Windows socket timeout error).
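Since one dead proxy aborts the whole loop, a sequential version of the asker's loop can catch the error and move on to the next address. The following is only a minimal sketch under some assumptions: the names `read_addresses`, `default_opener`, `fetch_via_proxies`, and the `make_opener` parameter are hypothetical helpers introduced here, not part of urllib2. It uses a separate opener per proxy instead of `install_opener`, and it treats any `EnvironmentError` (which covers `URLError` and socket timeouts) as "proxy unavailable".

```python
try:
    from urllib2 import ProxyHandler, build_opener  # Python 2
except ImportError:
    from urllib.request import ProxyHandler, build_opener  # Python 3


def read_addresses(lines):
    """Yield stripped, non-empty proxy addresses from an iterable of lines."""
    for line in lines:
        address = line.strip()
        if address:
            yield address


def default_opener(address):
    """Build an opener that routes HTTP requests through a single proxy."""
    return build_opener(ProxyHandler({'http': address}))


def fetch_via_proxies(addresses, url, timeout=5, make_opener=default_opener):
    """Request url through each proxy in turn.

    Returns a dict mapping each address to the HTTP status code,
    or to None if the request through that proxy failed.
    """
    results = {}
    for address in addresses:
        opener = make_opener(address)
        try:
            response = opener.open(url, timeout=timeout)
        except EnvironmentError:  # URLError, socket timeout, refused connection
            results[address] = None  # skip the dead proxy and keep going
        else:
            results[address] = response.getcode()
            response.close()
    return results
```

With a real file this could be called as `fetch_via_proxies(read_addresses(open("proxy.txt")), "http://www.google.com")`. The `make_opener` parameter exists only so the loop can be exercised without network access.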
Here's a proxy checker that checks several proxies simultaneously:
#!/usr/bin/env python
import fileinput  # accept proxies from files or stdin

try:
    from gevent.pool import Pool  # $ pip install gevent
    import gevent.monkey; gevent.monkey.patch_all()  # patch stdlib
except ImportError:  # fallback on using threads
    from multiprocessing.dummy import Pool

try:
    from urllib2 import ProxyHandler, build_opener
except ImportError:  # Python 3
    from urllib.request import ProxyHandler, build_opener


def is_proxy_alive(proxy, timeout=5):
    opener = build_opener(ProxyHandler({'http': proxy}))  # test redir. and such
    try:  # send request, read response headers, close connection
        opener.open("http://example.com", timeout=timeout).close()
    except EnvironmentError:
        return None
    else:
        return proxy


candidate_proxies = (line.strip() for line in fileinput.input())
pool = Pool(20)  # use 20 concurrent connections
for proxy in pool.imap_unordered(is_proxy_alive, candidate_proxies):
    if proxy is not None:
        print(proxy)
Usage:
$ python alive-proxies.py proxy.txt
$ echo user:password@ip:port | python alive-proxies.py
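Because the checker uses `imap_unordered`, proxies are reported as their checks finish, so the output order may differ from the order in the input file. A small sketch of that behaviour, using the thread-based fallback pool and a hypothetical `fake_check` function that sleeps instead of touching the network (the addresses and delays are made up for illustration):

```python
import time
from multiprocessing.dummy import Pool  # thread-based pool from the stdlib


def fake_check(item):
    """Hypothetical stand-in for a proxy check: wait, then report the result."""
    proxy, delay, alive = item
    time.sleep(delay)  # simulates a slow or fast proxy
    return proxy if alive else None


# (address, simulated response delay in seconds, whether the proxy "works")
candidates = [
    ("10.0.0.1:8080", 0.3, True),
    ("10.0.0.2:8080", 0.1, False),
    ("10.0.0.3:8080", 0.0, True),
]

pool = Pool(3)  # one worker per candidate, so all checks run concurrently
alive = [p for p in pool.imap_unordered(fake_check, candidates) if p is not None]
pool.close()
pool.join()
print(alive)  # working proxies, in the order their checks completed
```

If the input order matters, `pool.imap` can be used instead; it yields results in input order, at the cost of waiting for slower checks before reporting later ones.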
Source: https://stackoverflow.com/questions/16746897/using-multiple-proxies-to-open-a-link-in-urllib2