302s and losing cookies with urllib2

旧时难觅i 2021-01-21 06:13

I am using urllib2 with CookieJar / HTTPCookieProcessor in an attempt to simulate a login to a page to automate an upload.

I've seen some questions and answers on this,

4 Answers
  •  自闭症患者
    2021-01-21 07:05

    I ran into the exact same problem recently, but in the interest of time I scrapped that approach and went with mechanize. It can be used as a drop-in replacement for urllib2 that behaves the way you would expect a browser to behave with regard to Referer headers, redirects, and cookies.

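    import mechanize

    # Cookie jar that the browser reads from and writes to
    cj = mechanize.CookieJar()
    browser = mechanize.Browser()
    browser.set_cookiejar(cj)
    # Optional: route HTTP traffic through a local proxy, e.g. to inspect requests
    browser.set_proxies({'http': '127.0.0.1:8888'})

    # Use browser's handlers to create a new opener
    opener = mechanize.build_opener(*browser.handlers)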

    The Browser object can be used as an opener itself (using the .open() method). It maintains state internally but also returns a response object on every call. So you get a lot of flexibility.

    Also, if you don't have a need to inspect the cookiejar manually or pass it along to something else, you can omit the explicit creation and assignment of that object as well.

    I am fully aware this doesn't address what is really going on and why urllib2 can't provide this solution out of the box or at least without a lot of tweaking, but if you're short on time and just want it to work, just use mechanize.
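
For comparison, here is a minimal sketch of the standard-library wiring the question describes, written for Python 3, where urllib2 became urllib.request and cookielib became http.cookiejar. The local test server, its /login and /home paths, the session cookie, and the fetch_after_redirect helper are all made up for illustration. It only shows that an opener built with HTTPCookieProcessor carries a cookie set on a 302 response into the redirected request; it does not diagnose why cookies were lost in the original setup.

```python
import threading
import urllib.request
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical server: /login sets a cookie and 302-redirects to /home."""

    def do_GET(self):
        if self.path == "/login":
            self.send_response(302)
            self.send_header("Set-Cookie", "session=abc123; Path=/")
            self.send_header("Location", "/home")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            # Echo back whatever Cookie header the client sent.
            body = self.headers.get("Cookie", "").encode()
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_after_redirect():
    """Request /login, follow the redirect, return (echoed cookies, jar names)."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        cj = CookieJar()
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({}),          # ignore proxy env variables
            urllib.request.HTTPCookieProcessor(cj),   # store and resend cookies
        )
        url = "http://127.0.0.1:%d/login" % server.server_port
        with opener.open(url) as resp:
            echoed = resp.read().decode()
        return echoed, [c.name for c in cj]
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_after_redirect())
```

If the cookie jar is doing its job, the value echoed by /home is the cookie that /login set on the redirect response. When that is not what you see against a real site, comparing the request headers in a proxy is a reasonable next step.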
