Pivoting rows into columns

后端未结

关注

 4  1628

我寻月下人不归 2021-02-05 15:57

Suppose (to simplify) I have a table containing some control vs. treatment data:

Which, Color, Response, Count
Control, Red, 2, 10
Control, Blue, 3, 20
Treatment


      
      
        
          4条回答        

        
                    
            
            
                         
                
              
              
                
                   無奈伤痛
                                             
                
                
                (楼主)
            
              
              
                2021-02-05 16:15
              

            
            
                        
To add to the options (many years later)....

The typical approach in base R would involve the reshape function (which is generally unpopular because of the multitude of arguments that take time to master). It's a pretty efficient function for smaller datasets, but doesn't always scale well.

reshape(mydf, direction = "wide", idvar = "Color", timevar = "Which")
#   Color Response.Control Count.Control Response.Treatment Count.Treatment
# 1   Red                2            10                  1              14
# 2  Blue                3            20                  4              21


Already covered are cast/dcast from the "reshape" and "reshape2" (and now, dcast.data.table from "data.table", especially useful when you have large datasets). But also from the Hadleyverse, there's "tidyr", which works nicely with the "dplyr" package:

library(tidyr)
library(dplyr)
mydf %>%
  gather(var, val, Response:Count) %>%  ## make a long dataframe
  unite(RN, var, Which) %>%             ## combine the var and Which columns
  spread(RN, val)                       ## make the results wide
#   Color Count_Control Count_Treatment Response_Control Response_Treatment
# 1  Blue            20              21                3                  4
# 2   Red            10              14                2                  1




Also to note would be that in a forthcoming version of "data.table", the dcast.data.table function should be able to handle this without having to first melt your data.

The data.table implementation of dcast allows you to convert multiple columns to a wide format without melting it first, as follows:

library(data.table)
dcast(as.data.table(mydf), Color ~ Which, value.var = c("Response", "Count"))
#    Color Response_Control Response_Treatment Count_Control Count_Treatment
# 1:  Blue                3                  4            20              21
# 2:   Red                2                  1            10              14

    
             
                                                        
            
            
              
                
                0
              
                   
                
               讨论(0)
              
                                                  
              
              
                          
             
       
          
              
                                       
     查看其它4个回答


            
                         
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
                              			
        
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复