Question:
I've written a script in Python, in combination with Selenium, to log into a site and then transfer the cookies from the driver to requests, so that I can carry out further activities using requests.
I used the line item = soup.select_one("div[class^='gravatar-wrapper-']").get("title") to check whether the script can fetch my username once everything is done.
This is my attempt so far:
import requests
from bs4 import BeautifulSoup
from selenium import webdriver

url = "https://stackoverflow.com/users/login"

driver = webdriver.Chrome()
driver.get(url)
# fill in and submit the login form
driver.find_element_by_css_selector("#email").send_keys("your_username")
driver.find_element_by_css_selector("#password").send_keys("your_password")
driver.find_element_by_css_selector("#submit-button").click()

# copy the Selenium cookies into a plain dict that requests understands
driver_cookies = driver.get_cookies()
c = {c['name']: c['value'] for c in driver_cookies}

# reuse those cookies in a plain requests call
res = requests.get(driver.current_url, cookies=c)
soup = BeautifulSoup(res.text, "lxml")
item = soup.select_one("div[class^='gravatar-wrapper-']").get("title")
print(item)
driver.quit()
When I run the script, it doesn't find the username and prints None. How can I pass cookies between Selenium and requests so that I can do the scraping with requests after logging in with Selenium?
Answer 1:
You are already on the right track. All you need to do now is make the script wait a little after submitting the form, so the login can complete and the session cookies are actually set before you copy them. This is how you can get the response:
import time
import requests
from bs4 import BeautifulSoup
from selenium import webdriver

url = "https://stackoverflow.com/users/login"

with webdriver.Chrome() as driver:
    driver.get(url)
    driver.find_element_by_css_selector("#email").send_keys("your_username")
    driver.find_element_by_css_selector("#password").send_keys("your_password")
    driver.find_element_by_css_selector("#submit-button").click()
    time.sleep(5)  # this is the fix: give the login time to finish
    driver_cookies = driver.get_cookies()
    c = {c['name']: c['value'] for c in driver_cookies}
    res = requests.get(driver.current_url, cookies=c)
    soup = BeautifulSoup(res.text, "lxml")
    item = soup.select_one("div[class^='gravatar-wrapper-']").get("title")
    print(item)
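A fixed sleep works, but it always waits the full five seconds and can still be too short on a slow connection. As a variant (my addition, not part of the original answer), an explicit wait pauses only until the post-login redirect has actually happened, assuming a successful login navigates away from the login page:

import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = "https://stackoverflow.com/users/login"

with webdriver.Chrome() as driver:
    driver.get(url)
    driver.find_element_by_css_selector("#email").send_keys("your_username")
    driver.find_element_by_css_selector("#password").send_keys("your_password")
    driver.find_element_by_css_selector("#submit-button").click()
    # wait up to 10 seconds for the URL to change instead of sleeping blindly
    WebDriverWait(driver, 10).until(EC.url_changes(url))
    c = {c['name']: c['value'] for c in driver.get_cookies()}
    res = requests.get(driver.current_url, cookies=c)
    soup = BeautifulSoup(res.text, "lxml")
    print(soup.select_one("div[class^='gravatar-wrapper-']").get("title"))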
Answer 2:
In my case this helped; let us know if it works in your case too.
import requests
from selenium import webdriver

driver = webdriver.Firefox()
url = "some_url"  # a redirect to a login page occurs
driver.get(url)

# store the cookies generated by the browser
request_cookies_browser = driver.get_cookies()

# make a persistent connection using the requests library
params = {'os_username': 'username', 'os_password': 'password'}
s = requests.Session()

# pass the cookies generated by the browser to the session
for cookie in request_cookies_browser:
    s.cookies.set(cookie['name'], cookie['value'])

resp = s.post(url, data=params)  # I get a 200 status_code

# pass the cookies of the response back to the browser
for name, value in resp.cookies.get_dict().items():
    driver.add_cookie({'name': name, 'value': value})

# the browser now contains the cookies generated by the authentication
driver.get(url)
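One caveat when copying cookies in this direction (a general Selenium constraint, not specific to this answer): driver.add_cookie only accepts cookies whose domain matches the page currently loaded in the browser, so the browser has to have visited the target site first, as it did with the initial driver.get(url) above, before the cookies from requests can be added.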
Answer 3:
Try using selenium-requests. It extends the Selenium WebDriver classes to include the request function from the Requests library, while doing all the needed cookie and request-header handling.
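A minimal sketch of how that looks, based on the library's documented usage (install it with pip install selenium-requests; the import name is seleniumrequests):

from seleniumrequests import Chrome

# a Chrome driver whose request() method shares the browser's cookies
driver = Chrome()
driver.get("https://stackoverflow.com/users/login")
# ... log in through the form as in the question ...
# request() mirrors requests.request, so this GET is sent with the logged-in cookies
response = driver.request('GET', 'https://stackoverflow.com/')
print(response.status_code)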
Source: https://stackoverflow.com/questions/54398127/unable-to-pass-cookies-between-selenium-and-requests-in-order-to-do-the-scraping