Question
I am trying to scrape the race name ('The Valley R2') and the horse name ('Ronniejay') from the following page: https://www.punters.com.au/form-guide/form-finder/e2a0f7e13bf0057b4c156aea23019b18.
What is the correct soup.find() call to do this?
My code to get the race name:
from bs4 import BeautifulSoup
import requests
source = requests.get('https://www.punters.com.au/form-guide/form-finder/e2a0f7e13bf0057b4c156aea23019b18').text
soup = BeautifulSoup(source,'lxml')
race = soup.find('h3')
print(race)
Answer 1:
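The page builds its content with JavaScript, which requests does not execute, so the HTML that requests returns does not contain the elements you are looking for. Selenium drives a real browser, so it can be used to scrape the rendered page instead.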
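To illustrate why soup.find('h3') comes back empty, here is a minimal sketch. The HTML string is a hypothetical stand-in for a JavaScript-rendered page as seen by requests, not the site's actual markup, and it assumes bs4 is installed. The built-in html.parser is used so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for what requests receives: an HTML shell whose
# content is filled in later by JavaScript (NOT the site's real markup).
static_html = """
<html>
  <body>
    <div id="app"></div>
    <script src="/bundle.js"></script>
  </body>
</html>
"""

soup = BeautifulSoup(static_html, "html.parser")

# The heading does not exist until the script runs, so find() returns None.
race = soup.find("h3")
print(race)

# The same is true for any CSS selector that targets rendered content.
missing = soup.select_one(".form-result-group__event span")
print(missing)
```

If print(race) shows None for the real page as well, the selector is not the problem; the element simply is not in the downloaded HTML.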
Install Selenium with: pip install selenium.
Download the ChromeDriver version that matches your installed Chrome browser.
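from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from bs4 import BeautifulSoup
from time import sleep
URL = "https://www.punters.com.au/form-guide/form-finder/e2a0f7e13bf0057b4c156aea23019b18"
# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(r"C:\path\to\chromedriver.exe"))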
driver.get(URL)
# Wait for page to fully render
sleep(5)
soup = BeautifulSoup(driver.page_source, "lxml")
race_name = soup.select_one(".form-result-group__event span").text
# Keep only the alphabetic characters from the competitor text
horse_name = "".join(
    x for x in soup.select_one(".form-result__competitor-name").text if x.isalpha()
)
print(race_name)
print(horse_name)
driver.quit()
Output:
The Valley R2
Ronniejay
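One caveat with the isalpha() filter above: it also removes spaces, so a multi-word horse name would be run together. A minimal sketch of an alternative follows, assuming the raw element text looks something like "1. Ronniejay (3)". Both the sample strings and the clean_name helper are hypothetical, and the pattern only keeps ASCII letters, apostrophes, and spaces.

```python
import re


def clean_name(raw: str) -> str:
    """Strip digits and punctuation from a name but keep word spacing."""
    # Remove everything except ASCII letters, apostrophes, and whitespace.
    kept = re.sub(r"[^A-Za-z'\s]", "", raw)
    # Collapse runs of whitespace and trim both ends.
    return " ".join(kept.split())


# Hypothetical sample strings; the real page text may be shaped differently.
print(clean_name("1. Ronniejay (3)"))
print(clean_name("\n 7. Some Horse (12) "))
```

With the Selenium code above, this would be called as clean_name(soup.select_one(".form-result__competitor-name").text).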
Source: https://stackoverflow.com/questions/64139449/what-is-the-correct-soup-find-command