Handling RSS redirects with Python/urllib2

猫巷女王i 2020-12-09 05:21

Calling urllib2.urlopen on a link to an article fetched from an RSS feed raises the following error:

urllib2.HTTPError: HTTP Error 301:

2 Answers
  • 2020-12-09 05:34

    It turns out you need to enable cookies. The page sets a cookie and then redirects to itself. Because urllib2 does not handle cookies by default, the cookie is never sent back and the redirect repeats until urllib2 gives up, so you have to add cookie handling yourself.

    import urllib2
    from cookielib import CookieJar
    
    # Keep cookies between requests so the self-redirect can complete.
    cj = CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    p = opener.open("http://feeds.nytimes.com/click.phdo?i=8cd5af579b320b0bfd695ddcc344d96c")
    
    print p.read()
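
For anyone on Python 3, where urllib2 became urllib.request and cookielib became http.cookiejar, the same idea can be sketched roughly as below. This is only an illustration, not the code from the answer: the local test server, the handler name CookieGateHandler, the cookie name "seen", and the helpers fetch and demo are all made up for the example, standing in for a feed server that redirects to itself until it sees its cookie.

```python
import threading
import urllib.error
import urllib.request
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer


class CookieGateHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in server: redirects to itself until a cookie comes back."""

    def do_GET(self):
        if "seen=1" in self.headers.get("Cookie", ""):
            body = b"article body"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # Set the cookie, then send the client back to the same path.
            self.send_response(301)
            self.send_header("Set-Cookie", "seen=1")
            self.send_header("Location", self.path)
            self.send_header("Content-Length", "0")
            self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(url, use_cookies):
    """Open url with or without cookie handling (hypothetical helper)."""
    # An empty ProxyHandler makes sure the local demo server is contacted directly.
    handlers = [urllib.request.ProxyHandler({})]
    if use_cookies:
        handlers.append(urllib.request.HTTPCookieProcessor(CookieJar()))
    opener = urllib.request.build_opener(*handlers)
    with opener.open(url, timeout=5) as response:
        return response.read()


def demo():
    """Return (result without cookies, body with cookies) against the local server."""
    server = HTTPServer(("127.0.0.1", 0), CookieGateHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/article" % server.server_port
    try:
        try:
            fetch(url, use_cookies=False)
            without = "ok"
        except urllib.error.HTTPError as err:
            without = err.code  # the redirect loop surfaces as an HTTPError
        with_cookies = fetch(url, use_cookies=True)
    finally:
        server.shutdown()
        server.server_close()
    return without, with_cookies
```

Calling demo() should show the plain opener failing on the redirect loop while the cookie-aware opener gets the page body, which mirrors the behaviour described in the question.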
    
  • 2020-12-09 05:45

    There is nothing wrong with @sleeplessnerd's solution, but this is very slightly more elegant:

    import urllib2
    url = "http://stackoverflow.com/questions/9926023/handling-rss-redirects-with-python-urllib2"
    p = urllib2.build_opener(urllib2.HTTPCookieProcessor).open(url)
    
    print p.read()
    

    In fact, the docstring of the CookieJar class more or less tells you to do things this way:

    You may not need to know about this class: try urllib2.build_opener(HTTPCookieProcessor).open(url)
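
The one-liner works because build_opener accepts either a handler class or a handler instance, and instantiates classes itself. A small Python 3 sketch of the difference (urllib.request in place of urllib2; the helper cookie_processors is a made-up name for this example): passing the class gives you a private jar you cannot easily reach, while passing an instance lets you keep a reference to the jar and inspect or save its cookies later.

```python
import urllib.request
from http.cookiejar import CookieJar

# Short form: pass the class, build_opener creates the processor and its jar.
opener_from_class = urllib.request.build_opener(urllib.request.HTTPCookieProcessor)

# Long form: create the jar yourself so you can look at the cookies afterwards.
jar = CookieJar()
opener_from_instance = urllib.request.build_opener(
    urllib.request.HTTPCookieProcessor(jar)
)


def cookie_processors(opener):
    """Return the cookie processors installed on an opener (hypothetical helper)."""
    return [
        handler
        for handler in opener.handlers
        if isinstance(handler, urllib.request.HTTPCookieProcessor)
    ]
```

Either opener handles the cookie-then-redirect case. The longer form is only worth it if you need the jar itself.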
