Scraping tables on multiple web pages with rvest in R

十年热恋 提交于 2019-11-30 16:58:24

One way would be to make vector of all the urls you are interested in and then use sapply:

library(rvest)

years <- 1970:2016
urls <- paste0("http://www.baseball-reference.com/teams/MIL/", years, ".shtml")
# head(urls)

get_table <- function(url) {
  url %>%
    read_html() %>%
    html_nodes(xpath = '//*[@id="div_team_batting"]/table[1]') %>% 
    html_table()
}

results <- sapply(urls, get_table)

results should be a list of 47 data.frame objects; each should be named with the url (i.e., year) they represent. That is, results[1] corresponds to 1970, and results[47] corresponds to 2016.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!