Getting the value of the Location header using Python urllib2

Posted by 岁酱吖の on 2019-12-10 10:49:35

Question


When I use urllib2 and list the response headers, I cannot see the 'Location' header.

In [19]: p = urllib2.urlopen('http://www.example.com')


In [21]: p.headers.items()
Out[21]: 
[('transfer-encoding', 'chunked'),
 ('vary', 'Accept-Encoding'),
 ('server', 'Apache/2.2.3 (CentOS)'),
 ('last-modified', 'Wed, 09 Feb 2011 17:13:15 GMT'),
 ('connection', 'close'),
 ('date', 'Fri, 25 May 2012 03:00:02 GMT'),
 ('content-type', 'text/html; charset=UTF-8')]

If I use telnet and send a GET request by hand, the header is there:

telnet www.example.com 80
Trying 192.0.43.10...
Connected to www.example.com.
Escape character is '^]'.
GET / HTTP/1.0  
Host:www.example.com

HTTP/1.0 302 Found
Location: http://www.iana.org/domains/example/
Server: BigIP
Connection: close
Content-Length: 0

So, using urllib2, how do I get the value of the 'Location' header?


Answer 1:


Use the geturl method on the file-like object returned by urlopen. It reports the URL that was actually retrieved, after any redirects were followed:

>>> f = urllib2.urlopen('http://www.example.com')
>>> f.geturl()
'http://www.iana.org/domains/example/'



Answer 2:


By default urllib2 follows redirects, so the response you get back is the final one, and it carries no Location header. If you disable redirect following, you receive the 301 or 302 response itself and can read its Location header. See: How do I prevent Python's urllib(2) from following a redirect

Borrowing from that answer:

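import urllib2

class NoRedirection(urllib2.HTTPErrorProcessor):
  # Return every response unchanged so that 3xx replies are not
  # passed on to the redirect handler.
  def http_response(self, request, response):
    return response
  https_response = http_response

opener = urllib2.build_opener(NoRedirection)
# The response is the 302 itself, so its Location header is available.
location = opener.open('http://www.example.com').info().getheader('Location')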
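
For readers on Python 3, where urllib2 was merged into urllib.request, the same idea might look roughly like the sketch below. It is not from the original answers. To keep it runnable without network access it starts a small local test server; the RedirectHandler class, the /final path, and the fetch_location helper are all made up for illustration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: '/' answers 302 and points at '/final'."""

    def do_GET(self):
        if self.path == "/":
            self.send_response(302)
            self.send_header("Location", "/final")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"done"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        # Keep the example quiet; the default logs every request to stderr.
        pass


class NoRedirection(urllib.request.HTTPErrorProcessor):
    # Same trick as the urllib2 version: hand back every response
    # unchanged, so 3xx replies never reach the redirect handler.
    def http_response(self, request, response):
        return response

    https_response = http_response


def fetch_location(url):
    """Return (status, Location header) without following redirects."""
    opener = urllib.request.build_opener(NoRedirection)
    with opener.open(url) as response:
        return response.status, response.headers.get("Location")


# Start the local server on a free port chosen by the OS.
server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

# Redirects disabled: we see the 302 and its Location header.
status, location = fetch_location(base + "/")

# Default behaviour: the redirect is followed and geturl() gives the final URL.
followed_url = urllib.request.urlopen(base + "/").geturl()

server.shutdown()
server.server_close()

print(status, location)
print(followed_url)
```

Note that the Location header is returned exactly as the server sent it (here a relative path), while geturl() returns the absolute URL that was finally fetched.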


Source: https://stackoverflow.com/questions/10748079/getting-value-of-location-header-using-python-urllib2
