Extract links from HTML table

情书的邮戳 2020-12-16 21:42

I'm trying to extract the links from the following webpage http://ipt.humboldt.org.co/ that are of type "Specimen". I can get the table from the webpage using the following...

2 answers
  •  Happy的楠姐
    2020-12-16 22:32

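    library(XML)

    # Cell handler: if a cell contains a link, keep its attributes (href) with the cell text.
    xmlFun <- function(x) {
      y <- xpathSApply(x, './a', xmlAttrs)
      if (length(y) > 0) {
        list(href = y, orig = xmlValue(x))
      } else {
        xmlValue(x)
      }
    }

    # tableNodes is the set of table nodes obtained in the question's code.
    ans <- readHTMLTable(tableNodes[[1]], elFun = xmlFun, stringsAsFactors = FALSE)
    # Linked cells come back as strings holding a list expression; parse and evaluate them.
    ans$Name <- lapply(ans$Name, function(x) unlist(eval(parse(text = x))))
    # Keep only the entries whose Subtype is "Specimen".
    ans$Name[ans$Subtype == 'Specimen']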
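
If the `eval(parse(...))` step feels fragile, the same filtering can probably be done with a single XPath query. This is only a minimal sketch, not the method from the answer above: the inline HTML is a hypothetical stand-in for the real page, and it assumes the link sits in the first column and the subtype in the second, which may not match the actual table layout.

```r
library(XML)

# Hypothetical stand-in for the real page (assumption: Name is column 1,
# Subtype is column 2; the hrefs are made up for illustration).
html <- '
<table>
  <tr><th>Name</th><th>Subtype</th></tr>
  <tr><td><a href="resource?r=a">Collection A</a></td><td>Specimen</td></tr>
  <tr><td><a href="resource?r=b">Survey B</a></td><td>Observation</td></tr>
  <tr><td><a href="resource?r=c">Collection C</a></td><td>Specimen</td></tr>
</table>'

doc <- htmlParse(html, asText = TRUE)

# Select rows whose second cell reads "Specimen", then read the href
# attribute of the link in the first cell of each of those rows.
specimen_links <- xpathSApply(
  doc,
  "//table//tr[normalize-space(td[2]) = 'Specimen']/td[1]/a",
  xmlGetAttr, "href"
)

print(specimen_links)
```

For the live page, `htmlParse()` would be given the page URL or downloaded content instead of the inline string, and the column indices in the XPath would need to be adjusted to the real table.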
    
