python pandas pivot_table count frequency in one column

前端 未结 4 1008
暗喜
暗喜 2020-12-08 00:59

I am still new to Python pandas\' pivot_table and would like to ask a way to count frequencies of values in one column, which is also linked to another column of ID. The Dat

相关标签:
4条回答
  • 2020-12-08 01:25

    You can use count df.pivot_table(index='Account_number', columns='Product', aggfunc='count')

    0 讨论(0)
  • 2020-12-08 01:26

    In new version of Pandas, slight modification is required. I had to spend some time figuring out so just wanted to add that here so that someone can directly use this.

    df.pivot_table(index='Account_number', columns='Product', aggfunc=len,
                   fill_value=0)
    
    0 讨论(0)
  • 2020-12-08 01:29

    Solution: Use aggfunc='size'

    Using aggfunc=len or aggfunc='count' like all the other answers on this page will not work for DataFrames with more than three columns. By default, pandas will apply this aggfunc to all the columns not found in index or columns parameters.

    For instance, if we had two more columns in our original DataFrame defined like this:

    df = pd.DataFrame({'Account_number':[1, 1, 2 ,2 ,2 ,3 ,3], 
                       'Product':['A', 'A', 'A', 'B', 'B','A', 'B'], 
                       'Price': [10] * 7,
                       'Quantity': [100] * 7})
    

    Output:

       Account_number Product  Price  Quantity
    0               1       A     10       100
    1               1       A     10       100
    2               2       A     10       100
    3               2       B     10       100
    4               2       B     10       100
    5               3       A     10       100
    6               3       B     10       100
    

    If you apply the current solutions to this DataFrame, you would get the following:

    df.pivot_table(index='Account_number',
                   columns='Product',
                   aggfunc=len,
                   fill_value=0)
    

    Output:

                      Price    Quantity   
    Product            A  B        A  B
    Account_number                     
    1                  2  0        2  0
    2                  1  2        1  2
    3                  1  1        1  1
    

    Solution

    Instead, use aggfunc='size'. Since size always returns the same number for each column, pandas does not call it on every single column and just does it once.

    df.pivot_table(index='Account_number', 
                   columns='Product',
                   aggfunc='size',
                   fill_value=0)
    

    Output:

    Product         A  B
    Account_number      
    1               2  0
    2               1  2
    3               1  1
    
    0 讨论(0)
  • 2020-12-08 01:33

    You need to specify the aggfunc as len:

    In [11]: df.pivot_table(index='Account_number', columns='Product', 
                            aggfunc=len, fill_value=0)
    Out[11]:
    Product         A  B
    Account_number
    1               2  0
    2               1  2
    3               1  1
    

    It looks like count, is counting the instances of each column (Account_number and Product), it's not clear to me whether this is a bug...

    0 讨论(0)
提交回复
热议问题