R: subset a data frame based on conditions from another data frame


-上瘾入骨i 2021-01-14 18:33

Here is a problem I am trying to solve. Say, I have two data frames like the following:

observations <- data.frame(id = rep(rep(c(1,2,3,4), each=5), 5),


      
      
        
2 answers

青春惊慌失措 (original poster) 2021-01-14 19:04

Not efficient, but it does the job:

 subset(merge(observations,sampletimes), time > time1 & time < time2)
        id time measurement location time1 time2
    11   1    3    3.180321        a     2     4
    47   1    8    6.040612        e     7     9
    83   1   13   -5.999317        i    12    14
    99   1   18    2.689414        m    17    19
    125  1   23   12.514722        q    22    24
    137  2    8    4.420679        f     7     9
    141  2    3   11.492446        b     2     4
    218  2   13    6.672506        j    12    14
    234  2   18   12.290339        n    17    19
    250  2   23   12.610828        r    22    24
    251  3    3    8.570984        c     2     4
    267  3    8   -7.112291        g     7     9
    283  3   13    6.287598        k    12    14
    360  3   23   11.941846        s    22    24
    364  3   18   -4.199001        o    17    19
    376  4    3    7.133370        d     2     4
    402  4    8   13.477790        h     7     9
    418  4   13    3.967293        l    12    14
    454  4   18   12.845535        p    17    19
    490  4   23   -1.016839        t    22    24


EDIT

Since you have more than 5 million rows, you should give a data.table solution a try:

library(data.table)
OBS <- data.table(observations)
SAM <- data.table(sampletimes)
merge(OBS,SAM,allow.cartesian=TRUE,by='id')[time > time1 & time < time2]
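The merge above first builds every observation/sample-window pair within each id and only then filters, which can get large with millions of rows. As a possible alternative, not part of the original answer, data.table's non-equi join can apply the time window during the join itself. The sketch below is self-contained and uses small made-up tables: the column names follow the output shown earlier, but the values are purely illustrative and are not the asker's data.

```r
library(data.table)

# Hypothetical toy data (values invented for illustration only).
observations <- data.table(
  id          = rep(1:2, each = 5),
  time        = rep(1:5, times = 2),
  measurement = c(0.5, 1.2, 3.1, 2.2, 0.9, 4.0, 3.3, 2.8, 1.1, 0.7)
)
sampletimes <- data.table(
  id       = 1:2,
  location = c("a", "b"),
  time1    = c(2L, 1L),
  time2    = c(4L, 3L)
)

# Non-equi join: match on id and keep only rows with time1 < time < time2.
# x.time is the observation's own time; i.time1 / i.time2 are the window bounds.
# nomatch = NULL drops sample windows that contain no observation.
result <- observations[
  sampletimes,
  on = .(id, time > time1, time < time2),
  .(id, time = x.time, measurement, location,
    time1 = i.time1, time2 = i.time2),
  nomatch = NULL
]

print(result)
```

On this toy input the result should contain the same rows as the merge-then-filter version; whether it is actually faster on the real data would need to be measured.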

    
             
                                                        
            
            
              
                