Logging into website with multiple pages using Python (urllib2 and cookielib)

Posted by 谁说胖子不能爱 on 2019-12-08 05:02:36

Question


I am writing a script to retrieve transaction information from my bank's home banking website for use in a personal mobile application.

The website is laid out like so:

https://homebanking.purduefed.com/OnlineBanking/Login.aspx

-> Enter username -> Submit form ->

https://homebanking.purduefed.com/OnlineBanking/AOP/Password.aspx

-> Enter password -> Submit form ->

https://homebanking.purduefed.com/OnlineBanking/AccountSummary.aspx

Because the login spans two separate pages that each require a POST, I first suspected that session information was being lost between them. However, I use urllib2's HTTPCookieProcessor to store cookies across my GET and POST requests, and as far as I can tell that isn't the issue.
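One way to check the claim that cookies are being kept is to look inside the jar after each request. The following is a minimal sketch, not part of the original script: it uses the Python 3 module names (`http.cookiejar` and `urllib.request`, which correspond to `cookielib` and `urllib2`), and the helper names `build_cookie_opener` and `describe_cookies` are made up for illustration.

```python
import http.cookiejar      # Python 3 name for cookielib
import urllib.request      # Python 3 name for urllib2


def build_cookie_opener():
    """Return an opener together with the jar it stores cookies in."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def describe_cookies(jar):
    """Return one 'domain path name=value' string per stored cookie."""
    return ['%s %s %s=%s' % (c.domain, c.path, c.name, c.value) for c in jar]


opener, jar = build_cookie_opener()
# Calling describe_cookies(jar) after each opener.open(...) shows whether
# a session cookie arrived and is still present before the next step.
print(describe_cookies(jar))  # an empty list before any request is made
```

If the list stays empty after the first request, or a cookie disappears between the username and password steps, the session really is being lost; if the cookies are present, the cause is more likely in the form data being posted.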

My current code is:

import urllib
import urllib2
import cookielib

loginUrl = 'https://homebanking.purduefed.com/OnlineBanking/Login.aspx'
passwordUrl = 'https://homebanking.purduefed.com/OnlineBanking/AOP/Password.aspx'
acctUrl = 'https://homebanking.purduefed.com/OnlineBanking/AccountSummary.aspx'

LoginName = 'sample_username'
Password = 'sample_password'

values = {'LoginName' : LoginName,
      'Password' : Password}

class MyHTTPRedirectHandler(urllib2.HTTPRedirectHandler):
    def http_error_302(self, req, fp, code, msg, headers):
        print "Cookie Manipulation Right Here"
        return urllib2.HTTPRedirectHandler.http_error_302(self, req, fp, code, msg, headers)

    http_error_301 = http_error_303 = http_error_307 = http_error_302

login_cred = urllib.urlencode(values)

jar = cookielib.CookieJar()
cookieprocessor = urllib2.HTTPCookieProcessor(jar)

opener = urllib2.build_opener(MyHTTPRedirectHandler, cookieprocessor)
urllib2.install_opener(opener)
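# Headers sent with every request. The opener reads `addheaders` (plural).
userAgent = ('User-agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5')

# Step 1: POST the username form to the login page.
opener.addheaders = [userAgent, ('Referer', loginUrl)]
response = opener.open(loginUrl, login_cred)

reqPage = opener.open(passwordUrl)

# Step 2: POST the password form to the password page.
opener.addheaders = [userAgent, ('Referer', passwordUrl)]
response2 = opener.open(passwordUrl, login_cred)

# Step 3: request the account summary using the cookies collected so far.
reqPage2 = opener.open(acctUrl)

content = reqPage2.read()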
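Headers placed in `opener.addheaders` apply to every request made through that opener. An alternative is to attach the Referer to each request individually with a `Request` object. This is a sketch under assumptions, written with the Python 3 names (`urllib.request`, `urllib.parse`); `build_post` is a hypothetical helper, and the field name `LoginName` is taken from the script above rather than confirmed against the site's form.

```python
import urllib.parse
import urllib.request


def build_post(url, fields, referer):
    """Build a POST request that carries its own Referer header."""
    # POST data must be bytes in Python 3.
    body = urllib.parse.urlencode(fields).encode('ascii')
    return urllib.request.Request(url, data=body, headers={'Referer': referer})


login_url = 'https://homebanking.purduefed.com/OnlineBanking/Login.aspx'
req = build_post(login_url, {'LoginName': 'sample_username'}, referer=login_url)

# Nothing is sent yet; the request would go out via opener.open(req).
print(req.get_method(), req.get_header('Referer'))
```

Building the request separately also makes it easy to inspect the method, headers, and body before anything is sent.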

Currently, the script reaches the passwordUrl page, so the username is POSTed correctly. However, when the POST is made to passwordUrl, the response is not acctUrl: I am redirected back to the Login page, which is where acctUrl redirects when it is opened without valid credentials.
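One thing that may be worth checking, and this is a guess rather than a diagnosis: the `.aspx` URLs suggest ASP.NET WebForms, where forms commonly include hidden inputs (for example `__VIEWSTATE`) that the server expects to receive back with the POST. The sketch below, in Python 3, collects whatever hidden inputs a page contains and merges them with the visible fields for that step. The class and function names and the sample HTML are hypothetical; the real field names have to be read from the page source.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'input':
            return
        attr = dict(attrs)
        if (attr.get('type') or '').lower() == 'hidden' and attr.get('name'):
            self.fields[attr['name']] = attr.get('value') or ''


def build_form_body(page_html, visible_fields):
    """Merge a page's hidden inputs with the fields typed by the user."""
    parser = HiddenFieldParser()
    parser.feed(page_html)
    fields = dict(parser.fields)
    fields.update(visible_fields)
    return urlencode(fields).encode('ascii')  # bytes, ready to use as POST data


# Made-up page content standing in for the result of opener.open(passwordUrl).read()
sample_html = '''
<form method="post" action="Password.aspx">
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="password" name="Password" />
</form>
'''
body = build_form_body(sample_html, {'Password': 'sample_password'})
print(body)
```

With this approach each step would GET its page first, build the body from that page's own hidden inputs, and send only the field that belongs to that step (the username on the first page, the password on the second) instead of posting both values to both URLs.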

Any thoughts or comments on how to move forward are greatly appreciated at this point!

Source: https://stackoverflow.com/questions/15605408/logging-into-website-with-multiple-pages-using-python-urllib2-and-cookielib
