Logging in to Stack Overflow works with Selenium but not with Scrapy (Python). How can I log in with headless browsing?

野性不改 2021-01-22 11:11

I have been trying to automate logging in to Stack Overflow in order to learn web scraping. I first tried Scrapy, but had no luck with the following code.

im         


        
2 answers
  •  醉话见心
    2021-01-22 11:37

    • It seems the URL https://stackoverflow.com/users/login is disallowed by robots.txt, so I'm not sure Stack Overflow permits automating this.
    • You don't need Selenium to log in; Scrapy alone is enough. I based this on the example in the official Scrapy documentation. You can use FormRequest.from_response to populate most of the fields needed to log in, then supply a valid email and password. The following works for me in scrapy shell:
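    from scrapy import FormRequest

    # `fetch` and `response` are provided by scrapy shell.
    url = "https://stackoverflow.com/users/login"
    fetch(url)  # download the login page; the result is bound to `response`

    # Build the login POST from the page's form. Hidden fields are copied
    # from the HTML, and formdata supplies the credential fields.
    req = FormRequest.from_response(
        response,
        formid='login-form',
        formdata={'email': 'test@test.com',
                  'password': 'testpw'},
        clickdata={'id': 'submit-button'},
    )
    fetch(req)  # submit the form; `response` now holds the resulting page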
    
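
On the robots.txt point in the first bullet, one way to check a URL before automating it is the standard library's urllib.robotparser. The sketch below parses a hypothetical rule set written for illustration, not the site's actual robots.txt, and `is_allowed` is a helper name made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules for illustration; not the site's actual file.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /users/login
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if `robots_txt` permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


login_allowed = is_allowed(
    SAMPLE_ROBOTS_TXT, "*", "https://stackoverflow.com/users/login"
)
questions_allowed = is_allowed(
    SAMPLE_ROBOTS_TXT, "*", "https://stackoverflow.com/questions"
)
print(login_allowed, questions_allowed)
```

To check against the live file instead, RobotFileParser can download it with `set_url(...)` followed by `read()` before calling `can_fetch`.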
