Scrape multiple linked HTML tables in R with rvest

花落未央 2021-02-03 11:35

This article, http://www.ajnr.org/content/30/7/1402.full, contains four links to HTML tables that I would like to scrape with rvest.

With the help of the CSS selector:

2 answers
  •  刺人心 (original poster)
    2021-02-03 12:04

    Here's one approach:

    library(rvest)
    
    url <- "http://www.ajnr.org/content/30/7/1402.full"
    page <- read_html(url)
    
    # First find all the urls
    table_urls <- page %>% 
      html_nodes(".table-inline li:nth-child(1) a") %>%
      html_attr("href") %>%
      xml2::url_absolute(url)
    
    # Then loop over the urls, downloading & extracting the table
    lapply(table_urls, . %>% read_html() %>% html_table())
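
The two steps above can be tried offline before hitting the real site. The following is only a sketch under stated assumptions: the inline HTML, the hrefs, and the helper name `safe_tables` are made up for illustration and are not taken from the article page. It shows that `xml2::url_absolute()` resolves relative links against the page URL, that `html_table()` on a whole page returns a list of data frames (one per `<table>`), and one possible way to keep the loop going if a single page fails to load.

```r
library(rvest)

# Hypothetical stand-in for the article page: the class name matches the
# selector used above, but the markup and hrefs are invented for illustration.
index_page <- read_html('
<div class="table-inline">
  <ul>
    <li><a href="/example/T1.html">In this window</a></li>
    <li><a href="/example/T1.html?popup=1">In a new window</a></li>
  </ul>
</div>
<div class="table-inline">
  <ul>
    <li><a href="T2.html">In this window</a></li>
  </ul>
</div>')

base_url <- "http://www.ajnr.org/content/30/7/1402.full"

# Step 1: keep only the first link of each block, then make it absolute
hrefs <- index_page %>%
  html_nodes(".table-inline li:nth-child(1) a") %>%
  html_attr("href")
table_urls <- xml2::url_absolute(hrefs, base_url)

# Step 2: html_table() on a parsed page returns a list, one entry per <table>
table_page <- read_html('
<table>
  <tr><th>group</th><th>n</th></tr>
  <tr><td>A</td><td>10</td></tr>
  <tr><td>B</td><td>12</td></tr>
</table>')
tables <- html_table(table_page)
first_table <- tables[[1]]

# Hypothetical helper: return NULL instead of stopping when a page cannot be read
safe_tables <- function(u) {
  tryCatch(html_table(read_html(u)), error = function(e) NULL)
}

# With network access, the real loop could then be written as:
# all_tables <- lapply(table_urls, safe_tables)
```

Because each page yields a list, the result of the `lapply()` call is a list of lists; if every linked page holds exactly one table, something like `lapply(all_tables, function(x) x[[1]])` would flatten it.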
    
