Extract Links from Webpage using R

前端未结

关注

 3  536

庸人自扰 2020-12-23 02:07

The two posts below are great examples of different approaches of extracting data from websites and parsing it into R.

Scraping html tables into R data frames usin

3条回答

南方客 (楼主)

2020-12-23 03:10

Even easier with rvest:

library(xml2)
library(rvest)

URL <- "http://stackoverflow.com/questions/3746256/extract-links-from-webpage-using-r"

pg <- read_html(URL)

head(html_attr(html_nodes(pg, "a"), "href"))

## [1] "//stackoverflow.com"                                                                                                                                          
## [2] "http://chat.stackoverflow.com"                                                                                                                                
## [3] "//stackoverflow.com"                                                                                                                                          
## [4] "http://meta.stackoverflow.com"                                                                                                                                
## [5] "//careers.stackoverflow.com?utm_source=stackoverflow.com&utm_medium=site-ui&utm_campaign=multicollider"                                                       
## [6] "https://stackoverflow.com/users/signup?ssrc=site_switcher&returnurl=http%3a%2f%2fstackoverflow.com%2fquestions%2f3746256%2fextract-links-from-webpage-using-r"

0 讨论(0)

查看其它3个回答