Scraping with rvest - complete with NAs when tag is not present

清酒与你 2020-11-30 12:20
I want to parse this HTML and get these elements from it:
a) p tag with class "normal_encontrado".
b) div with c

      
      
        
4 answers

野趣味 2020-11-30 12:25

Go one level up from your target and lapply over each parent element:

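library(xml2)
library(rvest)

# The original markup was lost in extraction. The HTML below is reconstructed
# from the selectors and the printed output, so treat the exact tags as an
# assumption.
pg <- read_html('<html>
<body>
<div class="product_price">
  <p class="normal_encontrado">
    S/. 2,799.00
  </p>
  <div class="price">
    S/. 2,299.00
  </div>
</div>
<div class="product_price">
  <div class="price">
    S/. 4,999.00
  </div>
</div>
</body>
</html>')

# One node per product; each parent is queried on its own so that a missing
# child becomes NA instead of shifting the columns out of alignment.
prod <- html_nodes(pg, "div.product_price")
do.call(rbind, lapply(prod, function(x) {
  # tryCatch covers package versions in which a failed match raises an error
  norm <- tryCatch(html_text(html_node(x, "p.normal_encontrado")),
                   error=function(err) {NA_character_})
  price <- tryCatch(html_text(html_node(x, "div.price")),
                    error=function(err) {NA_character_})
  data.frame(norm, price, stringsAsFactors=FALSE)
}))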

##                     norm                  price
## 1 \n    S/. 2,799.00\n   \n    S/. 2,299.00\n  
## 2                   <NA> \n    S/. 4,999.00\n  
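
For comparison, here is a minimal sketch of the same idea without lapply or tryCatch. It assumes rvest 1.0.0 or later, where html_element() returns one result per parent node and html_text2() turns a missing match into NA. The markup is an assumption reconstructed from the selectors used above, and `res` is just an illustrative name.

```r
library(rvest)

# Assumed markup: two products, the second without a "normal" price
pg <- read_html('<html><body>
<div class="product_price">
  <p class="normal_encontrado">S/. 2,799.00</p>
  <div class="price">S/. 2,299.00</div>
</div>
<div class="product_price">
  <div class="price">S/. 4,999.00</div>
</div>
</body></html>')

# One parent node per product
prod <- html_elements(pg, "div.product_price")

# html_element() keeps the output aligned with `prod`, one entry per parent,
# so a parent without the child contributes NA rather than being dropped
res <- data.frame(
  norm  = html_text2(html_element(prod, "p.normal_encontrado")),
  price = html_text2(html_element(prod, "div.price")),
  stringsAsFactors = FALSE
)
print(res)
```

The key point is the same as in the answer: select the parents first, then look up each child relative to its parent, so the two columns always have the same length.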


I have no idea if you wanted the strings trimmed or anything else done, but those machinations are pretty easy.
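
As a rough illustration of that cleanup, the sketch below trims the whitespace and converts the prices to numbers using base R only. The data frame is typed in by hand to mirror the output above, and `to_number` is a hypothetical helper, not part of rvest or xml2.

```r
# Hand-built stand-in for the scraped result (an assumption for illustration)
df <- data.frame(
  norm  = c("\n    S/. 2,799.00\n  ", NA),
  price = c("\n    S/. 2,299.00\n  ", "\n    S/. 4,999.00\n  "),
  stringsAsFactors = FALSE
)

# Strip leading and trailing whitespace in every column; NA stays NA
df[] <- lapply(df, trimws)

# Hypothetical helper: drop the "S/." currency prefix, remove everything
# except digits and the decimal point, then convert to numeric
to_number <- function(x) {
  as.numeric(gsub("[^0-9.]", "", sub("^S/\\.", "", x)))
}

df$norm_value  <- to_number(df$norm)
df$price_value <- to_number(df$price)
print(df)
```

The currency prefix is removed first because its trailing dot would otherwise survive the digit filter and break the numeric conversion.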
    
             
                                                        
            
            
              
                