Scraping with rvest - complete with NAs when tag is not present

清酒与你 2020-11-30 12:20

I want to parse this HTML and get these elements from it:

a) p tag with class "normal_encontrado".
b) div with c

4 answers
  •  野趣味 (original poster)
    2020-11-30 12:25

    Go one level up from your target and lapply over each parent element:

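    library(xml2)
    library(rvest)

    # Sample markup, reduced to the elements the selectors below rely on.
    # The second product has no p.normal_encontrado.
    pg <- read_html('
    <div class="product_price">
      <p class="normal_encontrado">
        S/. 2,799.00
      </p>
      <div class="price">
        S/. 2,299.00
      </div>
    </div>
    <div class="product_price">
      <div class="price">
        S/. 4,999.00
      </div>
    </div>
    ')

    # One parent node per product
    prod <- html_nodes(pg, "div.product_price")

    # Look up each child inside its own parent, so a missing tag yields NA
    # for that product instead of shifting the remaining values.
    # html_node() stands in for the deprecated xml_node(); tryCatch() is kept
    # as a guard in case the lookup raises an error.
    do.call(rbind, lapply(prod, function(x) {
      norm <- tryCatch(xml_text(html_node(x, "p.normal_encontrado")),
                       error=function(err) {NA})
      price <- tryCatch(xml_text(html_node(x, "div.price")),
                        error=function(err) {NA})
      data.frame(norm, price, stringsAsFactors=FALSE)
    }))

    ##                       norm                    price
    ## 1 \n    S/. 2,799.00\n   \n    S/. 2,299.00\n  
    ## 2                     <NA> \n    S/. 4,999.00\n  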
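
    As a possible shortcut, and assuming a reasonably recent rvest: html_node() applied to a set of parent nodes is documented to return exactly one result per parent, with a missing node where nothing matches, and html_text() turns a missing node into NA. Under that assumption the same table can probably be built without lapply or tryCatch. The markup below is a hypothetical minimal sample, not the original page.

```r
library(rvest)

# Hypothetical sample: the second product has no p.normal_encontrado
pg <- read_html('
<div class="product_price">
  <p class="normal_encontrado">S/. 2,799.00</p>
  <div class="price">S/. 2,299.00</div>
</div>
<div class="product_price">
  <div class="price">S/. 4,999.00</div>
</div>')

prod <- html_nodes(pg, "div.product_price")

# html_node() on a node set gives one (possibly missing) node per parent,
# so both columns always have one entry per product.
res <- data.frame(
  norm  = html_text(html_node(prod, "p.normal_encontrado")),
  price = html_text(html_node(prod, "div.price")),
  stringsAsFactors = FALSE
)
print(res)
```

    If your installed version behaves differently, the lapply/tryCatch version stays the safer choice.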

    I have no idea if you wanted the strings trimmed or anything else done, but those machinations are pretty easy.
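
    For the trimming mentioned above, here is a minimal base R sketch. It assumes the prices always look like "S/. 2,799.00" (currency prefix, comma as thousands separator); parse_price is a hypothetical helper name, and the data frame is hard-coded in the shape the scraping step returns so that the snippet runs on its own.

```r
# Data frame in the shape returned by the scraping step (values hard-coded
# here so the example does not depend on rvest).
prices <- data.frame(
  norm  = c("\n    S/. 2,799.00\n  ", NA),
  price = c("\n    S/. 2,299.00\n  ", "\n    S/. 4,999.00\n  "),
  stringsAsFactors = FALSE
)

# Hypothetical helper: trim whitespace, drop the "S/." prefix and the
# thousands separators, then convert to numeric. NA inputs stay NA.
parse_price <- function(x) {
  x <- trimws(x)
  x <- sub("^S/\\.[[:space:]]*", "", x)
  as.numeric(gsub(",", "", x, fixed = TRUE))
}

# Apply to every column while keeping the data frame structure
prices[] <- lapply(prices, parse_price)
print(prices)
```

    If you only want clean strings rather than numbers, trimws() on each column is enough.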
