Merging two columns into one in R


I have the following data frame and am trying to merge the two columns into one, replacing the NAs with the numeric values.

ID    A     B
1     3     NA
2     NA    2
3     NA    4
4     1     NA
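
The answers below refer to a data frame named df or df1 that is never constructed on this page. A minimal sketch for reproducing it, assuming the values match the output printed in the answers (IDs 1 to 4, with exactly one of A and B missing in each row):

```r
# Assumed reconstruction of the question's data; the values are taken from
# the output shown in the answers, not from the question itself.
df <- data.frame(
  ID = 1:4,
  A  = c(3, NA, NA, 1),
  B  = c(NA, 2, 4, NA)
)
df1 <- df  # some answers use the name df1 for the same data

# Sanity check: count the non-NA values among A and B in every row
non_missing <- rowSums(!is.na(df[c("A", "B")]))
non_missing
```

If every count is 1, all of the approaches in the answers should agree on the merged column.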


                      
7 Answers

小鲜肉, 2020-11-29 02:23

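You could try pmax, which takes the element-wise maximum across the columns and, with na.rm=TRUE, ignores the NAs:

New <- do.call(pmax, c(df1[-1], na.rm=TRUE))


Or pick, for each row, the column that holds the non-NA value:

# max.col() gives the column index of the TRUE (non-NA) entry in each row
New <- df1[-1][cbind(seq_len(nrow(df1)), max.col(!is.na(df1[-1])))]
d1 <- data.frame(ID=df1$ID, New)
d1
#  ID New
#1  1   3
#2  2   2
#3  3   4
#4  4   1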
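
The two variants only agree while each row has a single non-NA value. A small sketch of how they differ otherwise, using a hypothetical data frame df2 in which row 3 has values in both columns; ties.method = "first" is added here so that the tie is resolved deterministically rather than at random:

```r
# Hypothetical data: row 3 has a value in both A and B
df2 <- data.frame(ID = 1:3, A = c(3, NA, 1), B = c(NA, 2, 5))
m <- as.matrix(df2[-1])

# pmax keeps the larger of the two values in row 3
by_max <- do.call(pmax, c(df2[-1], na.rm = TRUE))

# max.col with ties.method = "first" keeps the first non-NA column instead
first_idx <- max.col(!is.na(m), ties.method = "first")
by_first <- m[cbind(seq_len(nrow(m)), first_idx)]

by_max    # 3 2 5
by_first  # 3 2 1
```

Which behaviour is wanted depends on the data; the question's data never hits this case.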

                                                                        
                                                        
            
            
              
                
孤独总比滥情好, 2020-11-29 02:25

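Another very simple solution in this case is to use the rowSums function. Because each row has only one non-NA value, summing with na.rm = TRUE returns exactly that value.

# NAs are skipped, so each row sum is the single non-NA value
df$New <- rowSums(df[, c("A", "B")], na.rm = TRUE)
# keep only ID and the merged column
df <- df[, c("ID", "New")]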


Update:
Thanks @Artem Klevtsov for mentioning that this method only works with numeric data.
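
Besides requiring numeric data, rowSums has two edge cases worth knowing: a row where both columns are NA sums to 0 rather than NA, and a row with two values returns their sum. A minimal sketch with a hypothetical data frame df_edge, including one possible way to restore NA for empty rows:

```r
# Hypothetical rows: row 2 has no value at all, row 3 has two values
df_edge <- data.frame(ID = 1:3, A = c(3, NA, 1), B = c(NA, NA, 5))

sums <- rowSums(df_edge[, c("A", "B")], na.rm = TRUE)
# row 2 yields 0 rather than NA; row 3 yields 1 + 5 = 6

# Restore NA where nothing was observed in either column
all_missing <- rowSums(!is.na(df_edge[, c("A", "B")])) == 0
df_edge$New <- ifelse(all_missing, NA, sums)
```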
                                                                        
                                                        
            
            
              
                
星月不相逢, 2020-11-29 02:40

This probably didn't exist when the other answers were written, but since I came here with the same question and found a better solution, here it is for future googlers:

What you want is the coalesce() function from dplyr, which returns the first non-NA value at each position:

library(dplyr)

y <- c(1, 2, NA, NA, 5)
z <- c(NA, NA, 3, 4, 5)
coalesce(y, z)
# [1] 1 2 3 4 5
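
Applied to the question's data frame, this could look like the sketch below. The data frame values are assumed from the output printed in the other answers. A base R equivalent with ifelse is shown first so that the example runs without dplyr; the coalesce call is only compared when dplyr is installed.

```r
df <- data.frame(ID = 1:4, A = c(3, NA, NA, 1), B = c(NA, 2, 4, NA))

# Base R: take B where A is missing, otherwise A
df$New <- ifelse(is.na(df$A), df$B, df$A)

# Same result with dplyr::coalesce, checked only if dplyr is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  stopifnot(isTRUE(all.equal(dplyr::coalesce(df$A, df$B), df$New)))
}

df[, c("ID", "New")]
```

Unlike rowSums, both variants keep the first non-NA value instead of adding values together when a row has more than one.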

                                                                        
                                                        
            
            
              
                
陌清茗, 2020-11-29 02:41

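You can use unite from tidyr. Note that the NAs are first replaced with empty strings, which turns A and B into character columns, so the united column is character as well:

library(tidyr)

# replace NAs with empty strings so they vanish when pasted together
df[is.na(df)] <- ''
# paste A and B into a single column called new
unite(df, new, A:B, sep='')
#  ID new
#1  1   3
#2  2   2
#3  3   4
#4  4   1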
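
Pasting values together produces text, so the merged column has to be converted back before doing arithmetic on it. A rough base R sketch of the same idea with that conversion added; the data frame values are assumed from the output above, and the helper name to_text is hypothetical:

```r
df <- data.frame(ID = 1:4, A = c(3, NA, NA, 1), B = c(NA, 2, 4, NA))

# Hypothetical helper: turn a column into text, with NA as an empty string
to_text <- function(col) ifelse(is.na(col), "", as.character(col))

# Concatenate A and B row by row, mimicking a paste with an empty separator
pasted <- paste0(to_text(df$A), to_text(df$B))

# paste0() returns character, so convert back to numeric
df$new <- as.numeric(pasted)
```

This only gives sensible numbers while each row has a single value; two values in one row would be glued together as digits.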

                                                                        
                                                        
            
            
              
                
星月不相逢, 2020-11-29 02:41

Assuming exactly one of A and B is NA in each row, this works just fine:

# create the initial data (a data.table in this case)
library(data.table)
x <- as.data.table(list(ID = c(1,2,3,4), A = c(3, NA, NA, 1), B = c(NA, 2, 4, NA)))
x
#   ID  A  B
#1:  1  3 NA
#2:  2 NA  2
#3:  3 NA  4
#4:  4  1 NA


# solution: within each ID, keep the single non-NA value, then drop A and B
x[, New := na.omit(c(A,B)), by = ID][, c("A","B") := NULL]
x
#   ID New
#1:  1   3
#2:  2   2
#3:  3   4
#4:  4   1

                                                                        
                                                        
            
            
              
                
孤城傲影, 2020-11-29 02:43

This question's been around for a while, but just to add another possible approach that does not depend on any packages:

# t() transposes the value columns so the non-NA entries are read row by row
df$new <- t(df[-1])[!is.na(t(df[-1]))]
df
#   ID  A  B new
# 1  1  3 NA   3
# 2  2 NA  2   2
# 3  3 NA  4   4
# 4  4  1 NA   1
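
The transpose matters because R stores matrices column by column. A short sketch of what happens with and without it, assuming the same data as in the output above; the final check reflects the assumption that every row holds exactly one non-NA value, without which the result would not line up with the rows:

```r
df <- data.frame(ID = 1:4, A = c(3, NA, NA, 1), B = c(NA, 2, 4, NA))
m <- as.matrix(df[-1])          # 4 x 2 matrix with columns A and B

# Without t(): values are read down column A, then down column B
by_col <- m[!is.na(m)]          # 3 1 2 4, row order is lost

# With t(): each original row becomes a column, so values come out row by row
by_row <- t(m)[!is.na(t(m))]    # 3 2 4 1

# The approach relies on exactly one non-NA value per row
stopifnot(all(rowSums(!is.na(m)) == 1))
```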

                                                                        
                                                        
            
            
              
                