Logging in to Stack Overflow works with Selenium but not with Scrapy in Python. How can I log in with a headless browser?

野性不改 2021-01-22 11:11

I have been trying to automate logging in to Stack Overflow in order to learn web scraping. First I tried Scrapy, but I did not have any luck with the following code.

im         


        
2 Answers
  • 2021-01-22 11:16

    In addition to the approach described by Wim Hermans, you can also send a POST request to https://stackoverflow.com/users/login with the following parameters:

    • email: your email
    • password: your password
    • fkey: the value of the hidden fkey input on the login page

    Here's an example:

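    import getpass
    
    import requests
    from pyquery import PyQuery
    
    # Use one session so cookies set by the login page are sent with the POST
    session = requests.Session()
    
    # Fetch the fkey from the hidden input on the login page
    login_page = session.get('https://stackoverflow.com/users/login').text
    pq = PyQuery(login_page)
    fkey = pq('input[name="fkey"]').val()
    
    # Prompt for email and password
    email = input("Email: ")
    password = getpass.getpass()
    
    # Log in; the session keeps any cookies from the response for later requests
    response = session.post(
        'https://stackoverflow.com/users/login',
        data={
            'email': email,
            'password': password,
            'fkey': fkey,
        })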
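
    If pyquery is not installed, the fkey lookup above can also be sketched with the standard library alone. This is only an illustration: the FkeyParser class, the extract_fkey helper, and the sample markup are hypothetical names and data, and the real login page may be structured differently.

```python
from html.parser import HTMLParser


class FkeyParser(HTMLParser):
    """Collects the value of the first <input name="fkey"> element."""

    def __init__(self):
        super().__init__()
        self.fkey = None

    def handle_starttag(self, tag, attrs):
        # Ignore everything except the first matching input element
        if tag != "input" or self.fkey is not None:
            return
        attributes = dict(attrs)
        if attributes.get("name") == "fkey":
            self.fkey = attributes.get("value")


def extract_fkey(html):
    """Return the fkey token from login-page HTML, or None if absent."""
    parser = FkeyParser()
    parser.feed(html)
    return parser.fkey


# Hypothetical markup for illustration; the real page layout may differ.
sample_html = (
    '<form id="login-form">'
    '<input type="hidden" name="fkey" value="abc123">'
    '</form>'
)
print(extract_fkey(sample_html))  # prints abc123
```

    In the example above, extract_fkey(login_page) would take the place of the two pyquery lines.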
    
  • 2021-01-22 11:37
    • It seems the URL https://stackoverflow.com/users/login is disallowed by robots.txt, so I'm not sure Stack Overflow permits automating this.
    • You don't need Selenium to log in; Scrapy alone can do this. I based this on the example in the official Scrapy documentation. You can use FormRequest.from_response to populate most of the fields needed to log in, then supply a correct email and password. The following works for me in the Scrapy shell:
    from scrapy import FormRequest
    
    url = "https://stackoverflow.com/users/login"
    fetch(url)
    req = FormRequest.from_response(
        response,
        formid='login-form',
        formdata={'email': 'test@test.com',
                  'password': 'testpw'},
        clickdata={'id': 'submit-button'},
    )
    fetch(req)
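
    The robots.txt concern from the first point can be checked programmatically before sending any request. Below is a minimal sketch using the standard library's urllib.robotparser. The is_allowed helper and the sample rules are hypothetical and only for illustration; they are not a copy of Stack Overflow's actual robots.txt.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules for illustration only; not the site's real file.
sample_rules = """\
User-agent: *
Disallow: /users/login
"""

print(is_allowed(sample_rules, "https://stackoverflow.com/users/login"))  # prints False
print(is_allowed(sample_rules, "https://stackoverflow.com/questions"))    # prints True
```

    To check the live file instead, RobotFileParser also provides set_url() and read(). Inside a Scrapy project, the ROBOTSTXT_OBEY setting controls whether requests to disallowed URLs are filtered out.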
    