Using rvest package when HTML table has two headers


礼貌的吻别 2021-01-07 06:17

I am using the following code to scrape an HTML table on AFL player data:

library(rvest)

website <- read_html("https://afltables.com/afl/stats/teams/adel


      
      
        
2 Answers

长发绾君心 (original poster) 2021-01-07 07:06

Firstly, and unrelated to your question: don't use table as a name for your objects, because table() is already a base R function. Reusing the name is considered bad practice, and it can make your code confusing to read and debug somewhere down the line.

Moving on to the question: you are struggling with the type of data that html_table() gives you. Because it is applied to a set of nodes, it returns a list, and that list contains a regular data.frame. ncol() and nrow() return NULL for the list you printed because a list has no dimensions; it simply holds one element, the data.frame. Selecting that first (and only) element gives you the data frame you're actually interested in, which has 27 columns and 34 rows.

library(rvest)

website <- read_html("https://afltables.com/afl/stats/teams/adelaide/2017_gbg.html")
scraped <- website %>%
  html_nodes("table") %>%   # all <table> nodes on the page
  .[1] %>%                  # keep only the first table (still a node set)
  html_table() %>%          # returns a list of data frames
  `[[`(1)                   # select the first element of the list, like scraped[[1]]
ncol(scraped)
# 27
nrow(scraped)
# 34
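
If you want to see the list-versus-data-frame difference without hitting the website, here is a minimal sketch in base R. The names scraped_list and scraped_df and the toy player/goals data are made up for illustration; they only stand in for what html_table() hands back on a node set and are not the real AFL table.

```r
# A hand-built stand-in for what html_table() returns on a node set:
# a list with one data frame inside (toy data, not the AFL table).
scraped_list <- list(
  data.frame(player = c("A", "B", "C"), goals = c(2, 0, 1))
)

class(scraped_list)          # "list"
is.null(ncol(scraped_list))  # TRUE: a list has no dim attribute
length(scraped_list)         # 1: the single data frame

# Extract the data frame with [[1]]; now ncol() and nrow() work.
scraped_df <- scraped_list[[1]]
ncol(scraped_df)  # 2
nrow(scraped_df)  # 3
```

The same [[1]] step is what the `[[`(1) call at the end of the pipe does.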

    
             
                                                        
            
            
              
                