How can I get the final redirect URL when using urllib2.urlopen?

不羁的心 submitted on 2019-11-28 22:37:17

Question


I'm using the urllib2.urlopen method to open a URL and fetch the markup of a webpage. Some of these sites redirect me with 301/302 responses. I would like to know the final URL that I've been redirected to. How can I get this?


Answer 1:


Call the .geturl() method on the file-like object that urlopen returns. Per the urllib2 docs:

geturl() — return the URL of the resource retrieved, commonly used to determine if a redirect was followed

Example:

import urllib2
response = urllib2.urlopen('http://tinyurl.com/5b2su2')
response.geturl() # 'http://stackoverflow.com/'
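
For readers on Python 3, where urllib2.urlopen became urllib.request.urlopen, the same geturl() call can be sketched as below. This is a minimal illustration, not part of the original answer: the local redirect server, the RedirectDemoHandler class, the /start, /middle and /final paths, and the final_url helper are all hypothetical names introduced so the example runs without network access.

```python
# Python 3 sketch: urllib2.urlopen lives on as urllib.request.urlopen.
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener, install_opener, urlopen


class RedirectDemoHandler(BaseHTTPRequestHandler):
    """Hypothetical demo server: /start -> /middle -> /final."""

    HOPS = {"/start": "/middle", "/middle": "/final"}

    def do_GET(self):
        target = self.HOPS.get(self.path)
        if target is not None:
            # Answer with a 302 pointing at the next hop.
            self.send_response(302)
            self.send_header("Location", target)
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"done"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def final_url(url):
    """Return the URL urlopen ended up at after following redirects."""
    with urlopen(url) as response:
        return response.geturl()


# Assumption for the demo only: bypass any system proxy for localhost.
install_opener(build_opener(ProxyHandler({})))

# Bind to port 0 so the OS picks a free port.
server = HTTPServer(("127.0.0.1", 0), RedirectDemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

start = base + "/start"
landed = final_url(start)
was_redirected = landed != start  # comparing URLs reveals the redirect
print(landed)

server.shutdown()
server.server_close()
```

Comparing the value from geturl() with the URL you requested is a simple way to tell whether any redirect happened; with a multi-hop chain like the one above, only the last URL is reported.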



Answer 2:


The return value of urllib2.urlopen has a geturl() method, which should return the actual URL (i.e. the one reached after the last redirect).




Answer 3:


For example, pass either the URL string or a Request object:

urllib2.urlopen('ORIGINAL LINK').geturl()

urllib2.urlopen(urllib2.Request('ORIGINAL LINK')).geturl()




Answer 4:


You can use httplib2 with follow_all_redirects = True and read the content-location value from the response headers. See my answer to 'httplib is not getting all the redirect codes' for an example.



Source: https://stackoverflow.com/questions/3556266/how-can-i-get-the-final-redirect-url-when-using-urllib2-urlopen
