Pandas Counting Unique Rows

Submitted by 谁说我不能喝 on 2019-11-30 05:12:43

Question


I have a pandas data frame similar to:

ColA ColB
1    1
1    1
1    1
1    2
1    2
2    1
3    2

I want output that works like Counter: I need to know how many times each row appears (with all of the columns being the same).

In this case the proper output would be:

ColA ColB Count
1    1    3
1    2    2
2    1    1
3    2    1

I have tried something of the sort:

df.groupby(['ColA','ColB']).ColA.count()

but this gives me ugly output that I am having trouble formatting.


Answer 1:


You can use size with reset_index:

print(df.groupby(['ColA','ColB']).size().reset_index(name='Count'))
   ColA  ColB  Count
0     1     1      3
1     1     2      2
2     2     1      1
3     3     2      1
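
For completeness, here is a minimal self-contained sketch of the same approach. It rebuilds the example frame from the question (the variable name counts is only illustrative) and assumes a reasonably recent pandas running under Python 3:

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "ColA": [1, 1, 1, 1, 1, 2, 3],
    "ColB": [1, 1, 1, 2, 2, 1, 2],
})

# size() counts the rows in each (ColA, ColB) group and returns a Series
# indexed by the group keys; reset_index(name=...) turns the keys back into
# ordinary columns and names the count column.
counts = df.groupby(["ColA", "ColB"]).size().reset_index(name="Count")
print(counts)
```

Unlike count(), size() counts every row in the group, including rows whose other columns hold NaN.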



Answer 2:


I only needed to count the unique rows and have used this alternative:

len(df[['ColA','ColB']].drop_duplicates())

For this task, on my data, it was about twice as fast as len(df.groupby(['ColA','ColB'])), which follows the more general groupby approach above.
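
As a rough sketch, the two ways of counting distinct rows can be compared on the question's example data. The variable names n_unique and n_groups are illustrative, and no timing is claimed here, since relative speed will depend on the data:

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "ColA": [1, 1, 1, 1, 1, 2, 3],
    "ColB": [1, 1, 1, 2, 2, 1, 2],
})

# Number of distinct (ColA, ColB) rows via drop_duplicates.
n_unique = len(df[["ColA", "ColB"]].drop_duplicates())

# The same number via groupby: one group per distinct key combination.
n_groups = len(df.groupby(["ColA", "ColB"]))

print(n_unique, n_groups)
```

Both give the number of distinct rows only; for per-row counts, the groupby/size approach is still needed.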



Source: https://stackoverflow.com/questions/36018851/pandas-counting-unique-rows