Unable to log in to Amazon using Python

广开言路 2020-12-17 05:39

I'm using Python 3 to write a script to log in to Amazon to grab my Kindle highlights. It is based on this article: https://blog.jverkamp.com/2015/07/02/scraping-kindle-hig

2 answers
  • 2020-12-17 06:13

    Your sign-in form data is not correct. The keys should be email and password:

    signin_data[u'email'] = 'your_email'
    signin_data[u'password'] = 'your_password'
    

    You can also avoid the try block by using a CSS selector together with has_attr:

    import requests
    from bs4 import BeautifulSoup

    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.110 Safari/537.36'
    }

    with requests.Session() as s:
        # Merge into the session's default headers instead of replacing them.
        s.headers.update(headers)
        r = s.get('https://kindle.amazon.com/login')
        soup = BeautifulSoup(r.content, "html.parser")
        # Collect every named input in the sign-in form that carries a value
        # (hidden tokens and similar fields).
        form = soup.select("form[name=signIn]")[0]
        signin_data = {field["name"]: field["value"]
                       for field in form.select("input[name]")
                       if field.has_attr("value")}

        signin_data['email'] = 'your_email'
        signin_data['password'] = 'your_password'

        response = s.post('https://www.amazon.com/ap/signin', data=signin_data)
        soup = BeautifulSoup(response.text, "html.parser")
        warning = soup.find('div', {'id': 'message_warning'})
        if warning:
            print('Failed to login: {0}'.format(warning.text))
        print(response.content)
    

    In the first line of the output, you can see <title>Amazon Kindle: Home</title> at the end:

    b'<?xml version="1.0" encoding="utf-8"?>\n<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">\n  <head>\n    <title>Amazon Kindle: Home</title>\n  
    

    If it still does not work, update your version of requests and perhaps try another user agent. Once I replaced the ap_email and ap_password keys with email and password, I logged in fine.
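To check the form-scraping step without touching the network, here is a minimal sketch that performs the same "collect every named input that has a value" filtering on a static HTML string. It swaps BeautifulSoup for the standard library's html.parser so it runs with no third-party packages. The class name, the helper name, and the sample markup (including the hidden field names) are made up for illustration and are not taken from Amazon's actual page.

```python
from html.parser import HTMLParser


class SignInFormParser(HTMLParser):
    """Collect name/value pairs from <input> tags inside one named form."""

    def __init__(self, form_name):
        super().__init__()
        self.form_name = form_name
        self.in_form = False
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("name") == self.form_name:
            self.in_form = True
        elif tag == "input" and self.in_form:
            # Keep only inputs that carry both a name and a value,
            # the same idea as the has_attr("value") filter.
            if "name" in attrs and attrs.get("value") is not None:
                self.fields[attrs["name"]] = attrs["value"]

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def extract_form_fields(html, form_name="signIn"):
    """Return a dict of the named, valued inputs in the given form."""
    parser = SignInFormParser(form_name)
    parser.feed(html)
    return parser.fields


# Made-up sample markup; the real page's field names may differ.
SAMPLE_HTML = """
<form name="signIn" method="post">
  <input type="hidden" name="appActionToken" value="abc123">
  <input type="hidden" name="appAction" value="SIGNIN">
  <input type="email" name="email">
  <input type="submit" value="Sign in">
</form>
<form name="other"><input name="q" value="ignored"></form>
"""

signin_data = extract_form_fields(SAMPLE_HTML)
signin_data["email"] = "your_email"
signin_data["password"] = "your_password"
print(signin_data)
```

The hidden fields are carried over untouched and only the credentials are filled in afterwards, which is the pattern the answer relies on.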

  • 2020-12-17 06:24

    2020: this code will no longer work. Amazon has added JavaScript to its sign-in pages, and if that JavaScript is not executed, this sequence fails. Retrieved pages state that cookies are not enabled, even though they are enabled and working. Sending the username and password together returns a verification page that includes a captcha. Sending the username first and the password in a second exchange returns the reply "something went wrong" and asks for the username and password again. Amazon recognizes that the JavaScript was not executed.
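If you want a script to tell these three failure modes apart instead of just printing the response, a rough sketch could look like the following. The marker strings are assumptions: only "something went wrong" is quoted above, and the exact wording of the cookie and captcha pages would need to be checked against a real response. The function name and labels are hypothetical as well.

```python
# Hypothetical marker strings; the exact wording on Amazon's pages is an
# assumption and should be verified against a real response body.
FAILURE_MARKERS = {
    "cookies_disabled": "enable cookies",
    "captcha": "captcha",
    "generic_error": "something went wrong",
}


def classify_signin_response(html):
    """Return the label of the first failure marker found, or None."""
    text = html.lower()
    for label, marker in FAILURE_MARKERS.items():
        if marker in text:
            return label
    return None
```

Calling classify_signin_response(response.text) after the POST would return None when no marker is present, which does not by itself prove the login succeeded.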
