How to use pandas to group pivot table results by week?

佛祖请我去吃肉 2021-01-05 08:49

Below is a snippet of my pivot table output, in .csv format, after using the pandas pivot_table function:

Sub-Product     11/1/12 11/2/12 11/3/12 11/4/12 11/5/12 1         


        
1 answer
  • 2021-01-05 08:59

    The tool you need is resample, which implicitly uses groupby over a time period/frequency and applies a function like mean or sum.
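
To make the "implicit groupby" concrete, here is a minimal sketch on a small daily Series. The values are made up for illustration; only the dates match the question. It assumes a pandas version where resample returns an object you call .sum() on.

```python
import pandas as pd

# Six daily observations starting Thursday 2012-11-01 (illustrative values).
days = pd.date_range("2012-11-01", periods=6, freq="D")
s = pd.Series([1, 2, 3, 4, 5, 6], index=days)

# 'W' means weekly bins ending on Sunday; each bin is labelled by its end date.
weekly = s.resample("W").sum()

# The same grouping written as an explicit groupby over a weekly frequency.
grouped = s.groupby(pd.Grouper(freq="W")).sum()

print(weekly)
print(grouped)
```

Both calls put 2012-11-01 through 2012-11-04 in the bin labelled 2012-11-04 and the remaining two days in the bin labelled 2012-11-11.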

    Read data.

    In [2]: df
    Out[2]: 
          Sub-Product  11/1/12  11/2/12  11/3/12  11/4/12  11/5/12  11/6/12
    GP   Acquisitions      164      168       54       72      203      167
    GP   Applications      190      207       65       91      227      200
    GPF  Acquisitions     1124     1142      992     1053     1467     1198
    GPF  Applications     1391     1430     1269     1357     1855     1510
    

    Set up a MultiIndex.

    In [4]: df = df.reset_index().set_index(['index', 'Sub-Product'])
    
    In [5]: df
    Out[5]: 
                        11/1/12  11/2/12  11/3/12  11/4/12  11/5/12  11/6/12
    index Sub-Product                                                       
    GP    Acquisitions      164      168       54       72      203      167
          Applications      190      207       65       91      227      200
    GPF   Acquisitions     1124     1142      992     1053     1467     1198
          Applications     1391     1430     1269     1357     1855     1510
    

    Parse the columns as proper datetimes. (They come in as strings.)

    In [6]: df.columns = pd.to_datetime(df.columns)
    
    In [7]: df
    Out[7]: 
                        2012-11-01  2012-11-02  2012-11-03  2012-11-04  \
    index Sub-Product                                                    
    GP    Acquisitions         164         168          54          72   
          Applications         190         207          65          91   
    GPF   Acquisitions        1124        1142         992        1053   
          Applications        1391        1430        1269        1357   
    
                        2012-11-05  2012-11-06  
    index Sub-Product                           
    GP    Acquisitions         203         167  
          Applications         227         200  
    GPF   Acquisitions        1467        1198  
          Applications        1855        1510  
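
If you would rather not rely on pandas guessing the date layout, you can pass an explicit format. This is a small sketch assuming the labels are month/day/two-digit-year, as the parsed output above suggests.

```python
import pandas as pd

# Column labels as they appear in the CSV (strings, month/day/two-digit year).
labels = ["11/1/12", "11/2/12", "11/3/12"]

# An explicit format removes any month/day ambiguity in the parse.
parsed = pd.to_datetime(labels, format="%m/%d/%y")

print(parsed)
```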
    

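    Resample weekly ('W'), summing by week. resample works along the datetime axis, so transpose to put the dates on the rows, resample, and transpose back. (Older pandas versions also accepted df.resample('w', how='sum', axis=1); the how= argument has since been removed and axis=1 is deprecated for resample.)

    In [10]: df.T.resample('W').sum().T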
    Out[10]: 
                        2012-11-04  2012-11-11
    index Sub-Product                         
    GP    Acquisitions         458         370
          Applications         553         427
    GPF   Acquisitions        4311        2665
          Applications        5447        3365
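
For reference, the steps above can be put together as one self-contained script. This is a sketch, not the original poster's exact file: the CSV text is an assumed layout rebuilt from the table shown in the answer, and the aggregation uses the transpose form df.T.resample('W').sum().T because recent pandas releases no longer accept how= in resample.

```python
import io

import pandas as pd

# Assumed CSV layout, reconstructed from the table printed in the answer.
csv_text = """,Sub-Product,11/1/12,11/2/12,11/3/12,11/4/12,11/5/12,11/6/12
GP,Acquisitions,164,168,54,72,203,167
GP,Applications,190,207,65,91,227,200
GPF,Acquisitions,1124,1142,992,1053,1467,1198
GPF,Applications,1391,1430,1269,1357,1855,1510
"""

# The first two CSV columns become the row MultiIndex.
df = pd.read_csv(io.StringIO(csv_text), index_col=[0, 1])
df.index.names = ["index", "Sub-Product"]

# Turn the string column labels into real timestamps.
df.columns = pd.to_datetime(df.columns, format="%m/%d/%y")

# Put the dates on the rows, sum per week (weeks end on Sunday), flip back.
weekly = df.T.resample("W").sum().T

print(weekly)
```

The weekly totals match the output shown above, for example 458 and 370 for GP / Acquisitions.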
    