How to create multiple dataframes using multiple functions

前端未结

I quite often write a function to return different dataframes based on the parameters I enter. Here's an example dataframe:

np.random.seed(1111)
df = pd.Dat


                      
2 Answers

暖寄归人, 2021-01-02 17:27

A dictionary would be my first choice:

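# Each entry pairs a column name with the data for that variation.
variations = [('Units Sold', list_one), ('Dollars_Sold', list_two),
              ..., ('Title', some_list)]

df_variations = {}

# Key each resulting DataFrame by its position in `variations`.
for i, (name, data) in enumerate(variations):
    df_variations[i] = some_fun(df, name, data)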


You might also key the dictionary by a unique, descriptive title for each variation rather than by position. A column name such as 'Units Sold' won't work as a key on its own, because it isn't unique in your case.
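As a rough sketch of the descriptive-key idea, the snippet below builds a dictionary of DataFrames keyed by label. The sample data, the `summarize` helper, and the label names are all made up for illustration; they stand in for the question's `df` and function, whose full definitions are not shown here.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's dataframe.
df = pd.DataFrame({
    "Category": ["A", "A", "B", "B"],
    "Product": ["P1", "P2", "P1", "P2"],
    "Units_Sold": [1, 2, 3, 4],
    "Dollars_Sold": [10.0, 20.0, 30.0, 40.0],
})

def summarize(frame, value_col, group_cols):
    """Hypothetical helper: sum value_col within each combination of group_cols."""
    return frame.groupby(group_cols)[value_col].sum().reset_index()

# Each label is unique even though 'Units_Sold' appears twice as a value column.
variations = {
    "units_by_category": ("Units_Sold", ["Category"]),
    "units_by_category_product": ("Units_Sold", ["Category", "Product"]),
    "dollars_by_product": ("Dollars_Sold", ["Product"]),
}

df_variations = {
    label: summarize(df, value_col, group_cols)
    for label, (value_col, group_cols) in variations.items()
}
```

Each result can then be looked up by name, for example `df_variations["units_by_category"]`, which is easier to read back later than an integer position.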
                                                                        
                                                        
            
            
              
                
感动是毒, 2021-01-02 17:37

IIUC,

As Thomas suggested, we can use a dictionary here. With some minor modifications to your function, the dictionary can hold all the required data, which is then passed through to the function.

The idea is to store two kinds of entries: the arguments to your pd.Grouper call, keyed by the column to aggregate, and the lists of grouping columns.

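data_dict = {
    # pd.Grouper arguments, keyed by the column to aggregate.
    # "A" is the year-end frequency; pandas 2.2+ deprecates it in favor of "YE".
    "Units_Sold": {"key": "Date", "freq": "A"},
    "Dollars_Sold": {"key": "Date", "freq": "A"},
    # Lists of grouping columns.
    "col_list_1": ["Category", "Product"],
    "col_list_2": ["Category", "Sub-Category", "Sub-Category-2"],
    "col_list_3": ["Sub-Category", "Product"],
}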
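To illustrate how a stored argument dictionary can feed `pd.Grouper`, here is a minimal sketch on made-up data. It unpacks the dictionary with `**` instead of reading `key` and `freq` one by one, and it uses the year-start alias `"YS"` rather than `"A"` purely so the snippet behaves the same on older and newer pandas versions; both choices are assumptions of this sketch, not part of the answer.

```python
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2016-03-01", "2016-09-15", "2017-01-20", "2017-06-30"]),
    "Product": ["P1", "P2", "P1", "P1"],
    "Units_Sold": [5, 7, 11, 13],
})

# Same shape as one inner dictionary of data_dict, with "YS" swapped in for "A".
grouper_args = {"key": "Date", "freq": "YS"}

# ** unpacks the dictionary into keyword arguments: pd.Grouper(key="Date", freq="YS").
by_year_and_product = (
    df.groupby([pd.Grouper(**grouper_args), "Product"])["Units_Sold"].sum()
)
print(by_year_and_product)
```

Keeping the grouper arguments in data means a different date column or frequency only requires a change to the dictionary, not to the function.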




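def some_fun(dataframe, agg_col, dictionary, column_list, *args):
    """Sum agg_col per period and grouping column, adding "[Total]" subtotal rows."""
    key = dictionary[agg_col]["key"]
    frequency = dictionary[agg_col]["freq"]

    # Group on the date period first, then on each listed column.
    group_keys = [pd.Grouper(key=key, freq=frequency), *dictionary[column_list]]

    y = (
        pd.concat(
            [
                # Overwrite the trailing columns with "[Total]" so that grouping
                # collapses them into one subtotal row per remaining level.
                dataframe.assign(**{x: "[Total]" for x in group_keys[i:]})
                .groupby(group_keys)
                .agg(sumz=(agg_col, "sum"))
                for i in range(1, len(group_keys) + 1)
            ]
        )
        .sort_index()
        .unstack(0)  # move the dates into the columns
    )
    return y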




Test.

df1 = some_fun(df,'Units_Sold',data_dict,'col_list_3')
print(df1)
                                 sumz                      
Date                   2016-12-31 2017-12-31 2018-12-31
Sub-Category Product                                   
X            Product 1      18308      17839      18776
             Product 2      18067      19309      18077
             Product 3      17943      19121      17675
             [Total]        54318      56269      54528
Y            Product 1      20699      18593      18103
             Product 2      18642      19712      17122
             Product 3      17701      19263      20123
             [Total]        57042      57568      55348
Z            Product 1      19077      17401      19138
             Product 2      17207      21434      18817
             Product 3      18405      17300      17462
             [Total]        54689      56135      55417
[Total]      [Total]       166049     169972     165293
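The "[Total]" rows in that output come from the assign-then-groupby step. A stripped-down sketch of the same trick, without the date grouper and on made-up data, may make it easier to follow; the frame and variable names below are illustrative only.

```python
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame({
    "Category": ["X", "X", "Y", "Y"],
    "Product": ["P1", "P2", "P1", "P2"],
    "Units_Sold": [1, 2, 3, 4],
})
keys = ["Category", "Product"]

pieces = []
for i in range(len(keys) + 1):
    # Overwrite keys[i:] with the label "[Total]", then group on all keys.
    # i=0 gives the grand total, i=1 one subtotal per Category, i=2 the detail rows.
    overwritten = {col: "[Total]" for col in keys[i:]}
    pieces.append(df.assign(**overwritten).groupby(keys)["Units_Sold"].sum())

# "[" sorts after uppercase ASCII letters, so the totals land below the detail rows.
with_totals = pd.concat(pieces).sort_index()
print(with_totals)
```

Note that the ordering relies on the labels starting with uppercase letters or digits; lowercase names would sort after "[Total]".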


Since you want to automate writing the 10 worksheets, we can again use a dictionary, this time mapping each aggregation column to the column lists it should be run with:

matches = {'Units_Sold': ['col_list_1','col_list_3'],
          'Dollars_Sold' : ['col_list_2']}


Then a simple for loop writes every DataFrame to its own sheet in a single Excel workbook; change this to match your required behavior.

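# The context manager saves the workbook on exit
# (writer.save() was removed in pandas 2.0).
with pd.ExcelWriter('finished_excel_file.xlsx') as writer:
    for key, value in matches.items():
        for items in value:
            # Pass `key` here; the original used `k`, which is undefined.
            dataframe = some_fun(df, key, data_dict, items)
            dataframe.to_excel(writer, sheet_name=f'{key}_{items}')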
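Before writing, it can help to expand the mapping into the full list of jobs and check the generated sheet names, since Excel limits sheet names to 31 characters and rejects a few characters. The sketch below does only that; `sheet_jobs` is a hypothetical helper introduced for illustration, and `matches` is copied from above.

```python
# Mapping copied from the answer above.
matches = {
    "Units_Sold": ["col_list_1", "col_list_3"],
    "Dollars_Sold": ["col_list_2"],
}

EXCEL_MAX_SHEET_NAME = 31          # Excel's limit on sheet-name length
EXCEL_FORBIDDEN = set("[]:*?/\\")  # characters Excel rejects in sheet names

def sheet_jobs(mapping):
    """Hypothetical helper: yield (agg_col, col_list, sheet_name) per combination."""
    for agg_col, col_lists in mapping.items():
        for col_list in col_lists:
            yield agg_col, col_list, f"{agg_col}_{col_list}"

jobs = list(sheet_jobs(matches))

# Fail early, before any file is opened, if a name would be rejected by Excel.
for _, _, name in jobs:
    if len(name) > EXCEL_MAX_SHEET_NAME or EXCEL_FORBIDDEN & set(name):
        raise ValueError(f"invalid sheet name: {name!r}")
```

The writing loop could then iterate over `jobs` directly, calling the function with `agg_col` and `col_list` and passing `name` as the sheet name.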

                                                                        
                                                        
            
            
              
                