I want to parse this HTML and get these elements from it:
a) p tag with class "normal_encontrado".
b) div with c
If the tag is not found, html_text() on the empty rvest node set returns character(0). So, assuming there is at most one current price and one regular price in each div.product_price, you can use this:
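pacman::p_load("rvest", "dplyr")

# Extract the regular and current price from one div.product_price node
get_prices <- function(node) {
  r.precio.antes  <- html_nodes(node, "p.normal_encontrado") %>% html_text()
  r.precio.actual <- html_nodes(node, "div.price") %>% html_text()
  data.frame(
    # A missing tag gives character(0), so fall back to NA
    precio.antes  = if (length(r.precio.antes) == 0) NA_character_ else r.precio.antes[1],
    precio.actual = if (length(r.precio.actual) == 0) NA_character_ else r.precio.actual[1],
    stringsAsFactors = FALSE
  )
}

doc <- read_html("test.html") %>% html_nodes("div.product_price")
# bind_rows() replaces rbind_all(), which has been removed from dplyr
lapply(doc, get_prices) %>%
  bind_rows()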
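To see why the length check matters, here is a minimal sketch of what happens when a product has no regular price. The inline HTML is made up for illustration (it is not the asker's page), and only the class names are taken from the question.

```r
library(rvest)

# Hypothetical inline HTML: the second product has no regular price
html <- '<div class="product_price"><p class="normal_encontrado">10</p><div class="price">8</div></div>
<div class="product_price"><div class="price">5</div></div>'

products <- read_html(html) %>% html_nodes("div.product_price")

# First product: the p tag exists, so its text comes back
found_price <- products[[1]] %>% html_nodes("p.normal_encontrado") %>% html_text()

# Second product: no match, so html_text() yields a zero-length character vector
missing_price <- products[[2]] %>% html_nodes("p.normal_encontrado") %>% html_text()
length(missing_price)  # 0
```

Putting a zero-length vector straight into data.frame() next to a length-one column would fail, which is why it is swapped for NA first.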
Edited: I misunderstood the input data, so I changed the script to work with just a single HTML page.