Question:
I've written a script in Python, in combination with Selenium, to log into a site and then transfer the cookies from the driver to requests, so that I can carry out further activities using requests.
I used the line item = soup.select_one("div[class^='gravatar-wrapper-']").get("title") to check whether the script can fetch my username once everything is done.
This is my attempt so far:
import requests
from bs4 import BeautifulSoup
from selenium import webdriver

url = "https://stackoverflow.com/users/login"

driver = webdriver.Chrome()
driver.get(url)
# fill in and submit the login form
driver.find_element_by_css_selector("#email").send_keys("your_username")
driver.find_element_by_css_selector("#password").send_keys("your_password")
driver.find_element_by_css_selector("#submit-button").click()

# copy the Selenium cookies into a plain dict that requests understands
driver_cookies = driver.get_cookies()
c = {c['name']: c['value'] for c in driver_cookies}

# reuse those cookies in a plain requests call
res = requests.get(driver.current_url, cookies=c)
soup = BeautifulSoup(res.text, "lxml")
item = soup.select_one("div[class^='gravatar-wrapper-']").get("title")
print(item)
driver.quit()
When I run the script, it doesn't find the username and prints None. How can I pass cookies between Selenium and requests so that I can do the scraping with requests after logging in with Selenium?
Answer 1:
You are already on the right track. All you need to do now is make the script wait a little after submitting the form, so the login can complete and the session cookies are actually set before you copy them. This is how you can get the response:
import time
import requests
from bs4 import BeautifulSoup
from selenium import webdriver

url = "https://stackoverflow.com/users/login"

with webdriver.Chrome() as driver:
    driver.get(url)
    driver.find_element_by_css_selector("#email").send_keys("your_username")
    driver.find_element_by_css_selector("#password").send_keys("your_password")
    driver.find_element_by_css_selector("#submit-button").click()
    time.sleep(5)  # this is the fix: give the login time to finish
    driver_cookies = driver.get_cookies()
    c = {c['name']: c['value'] for c in driver_cookies}
    res = requests.get(driver.current_url, cookies=c)
    soup = BeautifulSoup(res.text, "lxml")
    item = soup.select_one("div[class^='gravatar-wrapper-']").get("title")
    print(item)
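A fixed sleep works, but it always waits the full five seconds and can still be too short on a slow connection. As a variant (my addition, not part of the original answer), an explicit wait pauses only until the post-login redirect has actually happened, assuming a successful login navigates away from the login page:

import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = "https://stackoverflow.com/users/login"

with webdriver.Chrome() as driver:
    driver.get(url)
    driver.find_element_by_css_selector("#email").send_keys("your_username")
    driver.find_element_by_css_selector("#password").send_keys("your_password")
    driver.find_element_by_css_selector("#submit-button").click()
    # wait up to 10 seconds for the URL to change instead of sleeping blindly
    WebDriverWait(driver, 10).until(EC.url_changes(url))
    c = {c['name']: c['value'] for c in driver.get_cookies()}
    res = requests.get(driver.current_url, cookies=c)
    soup = BeautifulSoup(res.text, "lxml")
    print(soup.select_one("div[class^='gravatar-wrapper-']").get("title"))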
Answer 2:
In my case this helped; let us know if it works in your case too.
import requests
from selenium import webdriver

driver = webdriver.Firefox()
url = "some_url"  # a redirect to a login page occurs
driver.get(url)

# store the cookies generated by the browser
request_cookies_browser = driver.get_cookies()

# make a persistent connection using the requests library
params = {'os_username': 'username', 'os_password': 'password'}
s = requests.Session()

# pass the cookies generated by the browser to the session
for cookie in request_cookies_browser:
    s.cookies.set(cookie['name'], cookie['value'])

resp = s.post(url, data=params)  # I get a 200 status_code

# pass the cookies of the response back to the browser
for name, value in resp.cookies.get_dict().items():
    driver.add_cookie({'name': name, 'value': value})

# the browser now contains the cookies generated by the authentication
driver.get(url)
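One caveat when copying cookies in this direction (a general Selenium constraint, not specific to this answer): driver.add_cookie only accepts cookies whose domain matches the page currently loaded in the browser, so the browser has to have visited the target site first, as it did with the initial driver.get(url) above, before the cookies from requests can be added.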
Answer 3:
Try using selenium-requests. It extends the Selenium WebDriver classes to include the request function from the Requests library, while doing all the needed cookie and request-header handling.
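A minimal sketch of how that looks, based on the library's documented usage (install it with pip install selenium-requests; the import name is seleniumrequests):

from seleniumrequests import Chrome

# a Chrome driver whose request() method shares the browser's cookies
driver = Chrome()
driver.get("https://stackoverflow.com/users/login")
# ... log in through the form as in the question ...
# request() mirrors requests.request, so this GET is sent with the logged-in cookies
response = driver.request('GET', 'https://stackoverflow.com/')
print(response.status_code)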
Source: https://stackoverflow.com/questions/54398127/unable-to-pass-cookies-between-selenium-and-requests-in-order-to-do-the-scraping