How to replicate a SAS merge


一向 2020-12-22 14:30

I have two tables, t1 and t2:

t1
  person | visit | code1 | type1
       1       1      50      50 
       1       1      50      50 
       1       2      7


      
      
        
2 answers

        
                    
            
            
                         
                
              
              
                
                   -上瘾入骨i
                                             
                
                
                (original poster)
            
              
              
                2020-12-22 15:14
              

            
            
                        
You can replicate a SAS merge by adding a row_number() to each table:

select t1.*, t2.*
from (select t1.*,
             row_number() over (partition by person, visit order by ??) as seqnum
      from t1
     ) t1 full outer join
     (select t2.*,
             row_number() over (partition by person, visit order by ??) as seqnum
      from t2
     ) t2
     on t1.person = t2.person and t1.visit = t2.visit and
        t1.seqnum = t2.seqnum;


Notes:


Replace ?? with the column(s) that define the row order.  SAS datasets have an intrinsic order; SQL tables do not, so the ordering has to be specified explicitly.
List the columns explicitly in the outer query instead of using t1.*, t2.*.  I think SAS includes person and visit only once in the resulting dataset.
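
To make the idea concrete outside SQL, here is a minimal Python sketch of the same technique: number the rows within each (person, visit) group, then do a full outer match on (person, visit, seqnum). The sample rows and the "pos" ordering column are hypothetical stand-ins for whatever you would put in place of ??.

```python
from collections import defaultdict

def add_seqnum(rows, order_key):
    """Mimic row_number() over (partition by person, visit order by <order_key>)."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["person"], row["visit"])].append(row)
    numbered = {}
    for key, members in groups.items():
        for seqnum, row in enumerate(sorted(members, key=order_key), start=1):
            numbered[key + (seqnum,)] = row
    return numbered

def full_outer_join(t1, t2, order_key1, order_key2):
    """Pair rows on (person, visit, seqnum); the missing side becomes None."""
    left = add_seqnum(t1, order_key1)
    right = add_seqnum(t2, order_key2)
    return [(left.get(k), right.get(k)) for k in sorted(left.keys() | right.keys())]

# Hypothetical data; "pos" is an assumed column that records the row order.
t1 = [
    {"person": 1, "visit": 1, "code1": 50, "pos": 1},
    {"person": 1, "visit": 1, "code1": 51, "pos": 2},
    {"person": 1, "visit": 2, "code1": 7, "pos": 3},
]
t2 = [
    {"person": 1, "visit": 1, "code2": 90, "pos": 1},
    {"person": 2, "visit": 1, "code2": 91, "pos": 2},
]

pairs = full_outer_join(t1, t2, lambda r: r["pos"], lambda r: r["pos"])
for left_row, right_row in pairs:
    print(left_row, right_row)
```

Only the first row of the (1, 1) group finds a partner here; the remaining rows come out with None on the other side, which is what the full outer join does.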


EDIT:

Note:  the above produces separate values for the key columns.  This is easy enough to fix:

select coalesce(t1.person, t2.person) as person,
       coalesce(t1.visit, t2.visit) as visit,
       t1.code1, t1.type1, t2.code2, t2.type2
from (select t1.*,
             row_number() over (partition by person, visit order by ??) as seqnum
      from t1
     ) t1 full outer join
     (select t2.*,
             row_number() over (partition by person, visit order by ??) as seqnum
      from t2
     ) t2
     on t1.person = t2.person and t1.visit = t2.visit and
        t1.seqnum = t2.seqnum;
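
As a rough illustration of what the coalesce() calls do, the following Python sketch builds one output row from a matched pair, taking each key from whichever side is present. The helper names and the sample values are made up for the example.

```python
def coalesce(*values):
    """Return the first value that is not None, like SQL coalesce()."""
    return next((v for v in values if v is not None), None)

def merge_pair(left, right):
    """Build one output row; either side may be None after a full outer join."""
    left = left or {}
    right = right or {}
    return {
        "person": coalesce(left.get("person"), right.get("person")),
        "visit": coalesce(left.get("visit"), right.get("visit")),
        "code1": left.get("code1"),
        "type1": left.get("type1"),
        "code2": right.get("code2"),
        "type2": right.get("type2"),
    }

# Hypothetical rows: one pair matched on both sides, one present only in t2.
both = merge_pair(
    {"person": 1, "visit": 1, "code1": 50, "type1": 50},
    {"person": 1, "visit": 1, "code2": 90, "type2": 8},
)
only_right = merge_pair(None, {"person": 2, "visit": 1, "code2": 91, "type2": 9})
print(both)
print(only_right)
```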


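That fixes the duplicated key columns.  The copying issue (when one table has fewer rows than the other for a given person and visit, its last row should be repeated against the extra rows) can be fixed by using first_value()/last_value() or by using a more complicated join condition: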

select coalesce(t1.person, t2.person) as person,
       coalesce(t1.visit, t2.visit) as visit,
       t1.code1, t1.type1, t2.code2, t2.type2
from (select t1.*,
             count(*) over (partition by person, visit) as cnt,
             row_number() over (partition by person, visit order by ??) as seqnum
      from t1
     ) t1 full outer join
     (select t2.*,
             count(*) over (partition by person, visit) as cnt,
             row_number() over (partition by person, visit order by ??) as seqnum
      from t2
     ) t2
     on t1.person = t2.person and t1.visit = t2.visit and
        (t1.seqnum = t2.seqnum or
         (t1.cnt > t2.cnt and t1.seqnum > t2.seqnum and t2.seqnum = t2.cnt) or
         (t2.cnt > t1.cnt and t2.seqnum > t1.seqnum and t1.seqnum = t1.cnt)
        );


This implements the "keep the last row" logic in a single join.  For performance, you would probably want to split this into separate left joins built on the original logic.
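
The "keep the last row" join condition can be easier to follow in procedural form. This is a sketch in Python under the same assumptions as the query: rows are grouped by (person, visit), ordered by some column (here a hypothetical "pos"), and when one side runs out of rows its last row is reused. It is an illustration of the join logic above, not a verified reproduction of every SAS merge rule.

```python
from collections import defaultdict

def group_rows(rows, order_key):
    """Group rows by (person, visit) and sort each group by order_key."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["person"], row["visit"])].append(row)
    return {key: sorted(members, key=order_key) for key, members in groups.items()}

def merge_keep_last(t1, t2, order_key1, order_key2):
    """Pair rows by position in each group, repeating the shorter side's last row."""
    left = group_rows(t1, order_key1)
    right = group_rows(t2, order_key2)
    merged = []
    for key in sorted(left.keys() | right.keys()):
        l_rows, r_rows = left.get(key, []), right.get(key, [])
        for i in range(max(len(l_rows), len(r_rows))):
            # Clamp the index so the last available row is reused.
            l = l_rows[min(i, len(l_rows) - 1)] if l_rows else None
            r = r_rows[min(i, len(r_rows) - 1)] if r_rows else None
            merged.append((key, l, r))
    return merged

# Hypothetical data; "pos" is an assumed ordering column.
t1 = [
    {"person": 1, "visit": 1, "code1": 50, "pos": 1},
    {"person": 1, "visit": 1, "code1": 51, "pos": 2},
]
t2 = [
    {"person": 1, "visit": 1, "code2": 90, "pos": 1},
    {"person": 2, "visit": 1, "code2": 91, "pos": 2},
]

result = merge_keep_last(t1, t2, lambda r: r["pos"], lambda r: r["pos"])
for key, l, r in result:
    print(key, l, r)
```

In this sample the second t1 row for (1, 1) is paired with the only t2 row of that group, while the (2, 1) group has no t1 row at all and keeps None on that side.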
    
             
                                                        
            
            
              
                