Question
I would like to scrape the match result table from the website https://www.whoscored.com/Regions/247/Tournaments/36/Seasons/5967/Stages/15737/Fixtures/International-FIFA-World-Cup-2018
I'm using the rvest package with the following code:
library(rvest)

url.tournament <- "https://www.whoscored.com/Regions/247/Tournaments/36/Seasons/5967/Stages/15737/Fixtures/International-FIFA-World-Cup-2018"

# Select the fixture wrapper, then any tables inside it, and parse them
df.tournament <- read_html(url.tournament) %>%
  html_nodes(xpath = '//*[@id="tournament-fixture-wrapper"]') %>%
  html_nodes("table") %>%
  html_table()
However, no element is extracted.
Answer 1:
Looking at the website’s source code, you can see that the table doesn’t actually exist in the HTML source; it’s dynamically generated using JavaScript. That’s why your XPath query returns an empty <div>.
You consequently can’t rely on {rvest} in this case; you need to use a dynamic scraper such as {RSelenium}, which drives a real browser and can therefore execute the page’s JavaScript.
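A minimal sketch of that approach, not a tested solution for this particular site: it assumes Firefox and a matching driver are available locally, that port 4545 is free, and that a fixed five-second pause is long enough for the table to render. The browser choice, port, and delay are illustrative assumptions, and the site may also block automated browsers. Once the page has rendered, the HTML is handed back to {rvest} and parsed with the same selectors as in the question.

```r
library(RSelenium)
library(rvest)

url.tournament <- "https://www.whoscored.com/Regions/247/Tournaments/36/Seasons/5967/Stages/15737/Fixtures/International-FIFA-World-Cup-2018"

# Assumption: Firefox is installed; the port number is arbitrary
driver <- rsDriver(browser = "firefox", port = 4545L, verbose = FALSE)
remDr <- driver$client

# Load the page in a real browser so its JavaScript runs
remDr$navigate(url.tournament)
Sys.sleep(5)  # crude fixed wait for the table to be generated (assumed delay)

# Pass the rendered HTML to rvest and reuse the selectors from the question
page <- read_html(remDr$getPageSource()[[1]])
df.tournament <- page %>%
  html_nodes(xpath = '//*[@id="tournament-fixture-wrapper"]') %>%
  html_nodes("table") %>%
  html_table()

# Close the browser session and stop the Selenium server
remDr$close()
driver$server$stop()
```

If df.tournament is still an empty list, the table may not have finished loading; a longer wait, or polling until the table node exists, would be the next thing to try.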
Source: https://stackoverflow.com/questions/51025719/webscraping-soccer-data-returns-nothing