R: data.table count !NA per row

予麋鹿 2020-12-16 14:24

I am trying to count the number of columns that do not contain NA for each row, and place that value into a new column for that row.

Example data:

li


      
      
        
2 answers

清歌不尽 (original poster) 2020-12-16 14:41

The two options that quickly come to mind are:

d[, num_obs := sum(!is.na(.SD)), by = 1:nrow(d)][]
d[, num_obs := rowSums(!is.na(d))][]


The first works by creating a "group" of just one row per group (1:nrow(d)), so sum() is evaluated once per row. Without that grouping, it would count the non-NA values across the entire table and assign that single total to every row.

The second makes use of an already very efficient base R function, rowSums.
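
As a quick check, here is a minimal sketch of both options on a tiny table. The table is a hypothetical stand-in, not the asker's data, and the names d1 and d2 are only for illustration. Each option runs on its own copy, because running the second line on a table that already has num_obs would count that column as well.

```r
library(data.table)

# Hypothetical toy table (not the asker's data): 3 rows, 3 columns, some NAs.
d <- data.table(a = c(1, NA, 3), b = c(NA, NA, 6), c = c(7, 8, 9))

# Option 1: one group per row, count the non-NA cells in that row's .SD.
d1 <- copy(d)[, num_obs := sum(!is.na(.SD)), by = 1:nrow(d)]

# Option 2: vectorised count over all rows at once with rowSums.
d2 <- copy(d)[, num_obs := rowSums(!is.na(d))]

# Both give the same counts per row: 2, 1, 3.
d1$num_obs
d2$num_obs
```

Note that option 1 stores the counts as integers and option 2 as doubles, so compare them with == or all.equal rather than identical().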

Here is a benchmark on larger data:

library(data.table)
set.seed(1)
nrow = 10000
ncol = 15
d <- as.data.table(matrix(sample(c(NA, -5:10), nrow*ncol, TRUE), nrow = nrow, ncol = ncol))

fun1 <- function(indt) indt[, num_obs := rowSums(!is.na(indt))][]
fun2 <- function(indt) indt[, num_obs := sum(!is.na(.SD)), by = 1:nrow(indt)][]

library(microbenchmark)
microbenchmark(fun1(copy(d)), fun2(copy(d)))
# Unit: milliseconds
#           expr        min         lq       mean     median         uq      max neval
#  fun1(copy(d))   3.727958   3.906458   5.507632   4.159704   4.475201 106.5708   100
#  fun2(copy(d)) 584.499120 655.634889 684.889614 681.054752 712.428684 861.1650   100




By the way, the trailing empty [] is only there to print the resulting data.table. Both := and the set* functions in "data.table" modify the table by reference and do not print their result, so appending [] is the usual way to see the output right away.
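
A small sketch of that behaviour, again on a hypothetical toy table used only for illustration:

```r
library(data.table)

# Hypothetical toy table, used only for illustration.
d <- data.table(a = c(1, NA), b = c(NA, NA))

# := adds the column in place; at the console this line prints nothing.
d[, num_obs := rowSums(!is.na(d))]

# The column is part of d even though nothing was assigned back.
"num_obs" %in% names(d)

# An empty [] prints the updated table.
d[]
```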
    
             
                                                        
            
            
              
                