Extract links from HTML table

情书的邮戳 2020-12-16 21:42

I'm trying to extract the links from the following webpage http://ipt.humboldt.org.co/ that are of type "Specimen". I can get the table from the webpage using the followi

2 answers

Happy的楠姐 (original poster)

2020-12-16 22:32

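library(XML)  # provides xpathSApply, xmlAttrs, xmlValue, readHTMLTable

# Cell handler: if the cell contains a link, keep its attributes alongside the cell text
xmlFun <- function(x) {
   y <- xpathSApply(x, './a', xmlAttrs)
   if (length(y) > 0) {
      list(href = y, orig = xmlValue(x))
   } else {
      xmlValue(x)
   }
}
# tableNodes is the set of table nodes obtained in the question's code
ans <- readHTMLTable(tableNodes[[1]], elFun = xmlFun, stringsAsFactors = FALSE)
# Each Name cell is stored as the text of an R expression; evaluate it and flatten the result
ans$Name <- lapply(ans$Name, function(x) { unlist(eval(parse(text = x))) })
# Keep only the entries whose Subtype is "Specimen"
ans$Name[ans$Subtype == 'Specimen']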
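
If only the URLs are needed, a single XPath query may be enough and avoids the eval(parse()) step. The following is a minimal sketch, not the approach above: it runs on a small made-up table (the example.org links and the two-column layout are assumptions, not the real page's markup) and selects the href of every link in a row that has a cell reading "Specimen".

```r
library(XML)

# Hypothetical stand-in for the page's table; the real markup may differ.
html <- '<table>
  <tr><th>Name</th><th>Subtype</th></tr>
  <tr><td><a href="http://example.org/a">A</a></td><td>Specimen</td></tr>
  <tr><td><a href="http://example.org/b">B</a></td><td>Observation</td></tr>
</table>'
doc <- htmlParse(html, asText = TRUE)

# Rows with a cell whose text is exactly "Specimen", then the href of each link in them
links <- unname(xpathSApply(doc, "//tr[td[normalize-space(.)='Specimen']]//a/@href"))
print(links)
```

On the real page the same query would be run against the parsed document from the question, adjusting the XPath if the Subtype text sits in a nested element.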
