python pandas pivot_table count frequency in one column

前端未结


I am still new to Python pandas' pivot_table and would like to ask how to count the frequencies of values in one column, which is also linked to another column of ID. The Dat

4 Answers

逝去的感伤

2020-12-08 01:25

You can use count: df.pivot_table(index='Account_number', columns='Product', aggfunc='count')

礼貌的吻别

2020-12-08 01:26
In newer versions of pandas, a slight modification is required. It took me some time to figure this out, so I am adding it here for others to use directly.
```
df.pivot_table(index='Account_number', columns='Product', aggfunc=len,
               fill_value=0)
```

终归单人心

2020-12-08 01:29

Solution: Use aggfunc='size'

Using aggfunc=len or aggfunc='count', as the other answers on this page do, will not produce a single table of counts for DataFrames with more than three columns. By default, pandas applies the aggfunc to every column not named in the index or columns parameters.

For instance, if we had two more columns in our original DataFrame defined like this:

import pandas as pd

df = pd.DataFrame({'Account_number': [1, 1, 2, 2, 2, 3, 3],
                   'Product': ['A', 'A', 'A', 'B', 'B', 'A', 'B'],
                   'Price': [10] * 7,
                   'Quantity': [100] * 7})

Output:

   Account_number Product  Price  Quantity
0               1       A     10       100
1               1       A     10       100
2               2       A     10       100
3               2       B     10       100
4               2       B     10       100
5               3       A     10       100
6               3       B     10       100

If you apply the other answers' solutions to this DataFrame, you get the following:

df.pivot_table(index='Account_number',
               columns='Product',
               aggfunc=len,
               fill_value=0)

Output:

                  Price    Quantity   
Product            A  B        A  B
Account_number                     
1                  2  0        2  0
2                  1  2        1  2
3                  1  1        1  1
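
The duplication can be confirmed by inspecting the column index of the result. The following is a minimal sketch that rebuilds the example DataFrame from this answer; restricting the aggregation with values='Price' is an assumption of the sketch (the choice of column is arbitrary) and is not part of the original answers.

```python
import pandas as pd

df = pd.DataFrame({'Account_number': [1, 1, 2, 2, 2, 3, 3],
                   'Product': ['A', 'A', 'A', 'B', 'B', 'A', 'B'],
                   'Price': [10] * 7,
                   'Quantity': [100] * 7})

# Without `values`, len is applied to every remaining column,
# so the result carries one block of counts per value column.
wide = df.pivot_table(index='Account_number', columns='Product',
                      aggfunc=len, fill_value=0)
print(wide.columns.tolist())

# Restricting the aggregation to a single column yields one block.
# 'Price' is an arbitrary choice made for this sketch.
narrow = df.pivot_table(index='Account_number', columns='Product',
                        values='Price', aggfunc=len, fill_value=0)
print(narrow)
```

Passing values works, but it ties the count to a column that has nothing to do with the question being asked.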

Solution

Instead, use aggfunc='size'. Because size returns the same number for every column, pandas computes it once instead of once per column.

df.pivot_table(index='Account_number', 
               columns='Product',
               aggfunc='size',
               fill_value=0)

Output:

Product         A  B
Account_number      
1               2  0
2               1  2
3               1  1
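
As a cross-check, the same table can probably be reached without pivot_table at all. The sketch below compares aggfunc='size' with two alternatives that are not mentioned in the answers on this page, groupby(...).size().unstack() and pd.crosstab; the variable names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Account_number': [1, 1, 2, 2, 2, 3, 3],
                   'Product': ['A', 'A', 'A', 'B', 'B', 'A', 'B'],
                   'Price': [10] * 7,
                   'Quantity': [100] * 7})

# The approach from this answer: count rows per (account, product) pair.
by_pivot = df.pivot_table(index='Account_number', columns='Product',
                          aggfunc='size', fill_value=0)

# Alternative 1: group, count rows, then spread Product into columns.
by_groupby = (df.groupby(['Account_number', 'Product'])
                .size()
                .unstack(fill_value=0))

# Alternative 2: crosstab computes a frequency table by default.
by_crosstab = pd.crosstab(df['Account_number'], df['Product'])

print(by_crosstab)
```

All three ignore the Price and Quantity columns, so adding more columns to the DataFrame should not change the shape of the result.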


予麋鹿

2020-12-08 01:33
You need to specify the aggfunc as len:
```
In [11]: df.pivot_table(index='Account_number', columns='Product', 
                        aggfunc=len, fill_value=0)
Out[11]:
Product         A  B
Account_number
1               2  0
2               1  2
3               1  1
```
It looks like count is counting the instances of each column (Account_number and Product); it's not clear to me whether this is a bug...