Pandas: count elements in a column and show the counts in a duplicated way

Submitted by 萝らか妹 on 2019-12-13 02:52:03

Question


I have a column like this:

A
1
1
2
3
3
4
4
4
4

I want to turn it into this:

A   B
1   2
1   2
2   1
3   2
3   2
4   4
4   4
4   4
4   4

As you can see, the keys are still duplicated and stay in the same order as in the original.

I know how to do this in R with data.table, but in pandas I only know how to use groupby to get one count per unique key.

Does anyone have any ideas?

Thank you!


Answer 1:


You can use this:

import pandas as pd

df = pd.DataFrame({
    'A' : [1, 1, 2, 3, 3, 4, 4, 4, 4]
})
df['B'] = df.groupby(['A'])['A'].transform('count')

print(df)

Output:

   A  B
0  1  2
1  1  2
2  2  1
3  3  2
4  3  2
5  4  4
6  4  4
7  4  4
8  4  4
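
One caveat that may matter on real data: transform('count') counts only non-null values, while transform('size') counts every row in the group. In the question's data the two give the same result. The sketch below uses a hypothetical value column V (not part of the original question) with one missing entry to show where they differ.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: key column 'A' plus a value column 'V' with one missing entry.
df = pd.DataFrame({
    'A': [1, 1, 2, 3, 3],
    'V': [10.0, np.nan, 30.0, 40.0, 50.0],
})

grouped = df.groupby('A')['V']
df['n_non_null'] = grouped.transform('count')  # non-null values of V per group
df['n_rows'] = grouped.transform('size')       # all rows per group, NaN included

print(df)
```

If the goal is "how many rows share this key", 'size' is probably the safer choice whenever the counted column can contain missing values.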



Answer 2:


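You could use a groupby and merge:

import pandas as pd

df = pd.DataFrame({'A' : [1, 1, 2, 3, 3, 4, 4, 4, 4]})

# Count rows per key, name the count column 'B', then merge it back onto the original rows
df = df.merge(df.groupby('A').size().reset_index(name='B'), on='A')

Which will give you:

   A  B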
0  1  2
1  1  2
2  2  1
3  3  2
4  3  2
5  4  4
6  4  4
7  4  4
8  4  4



Answer 3:


A fast way, using pd.factorize and np.bincount:

import numpy as np

f = df.A.factorize()[0]
df.assign(B=np.bincount(f)[f])

   A  B
0  1  2
1  1  2
2  2  1
3  3  2
4  3  2
5  4  4
6  4  4
7  4  4
8  4  4

Explanation

factorize returns a pair: an array of integer codes and the unique values. The [0] selects the codes, where each distinct integer stands for one unique value of the column. The codes start from zero and are assigned in order of first appearance.

f

array([0, 0, 1, 2, 2, 3, 3, 3, 3])

np.bincount counts how many times each non-negative integer occurs in an array. If we think of these integers as bins, then we are counting how many times each bin is referenced.

np.bincount(f)

array([2, 1, 2, 4])

Finally, we use f to index into these counts, which gives us back each bin's count repeated once for every row that referenced that bin.

np.bincount(f)[f]

array([2, 2, 1, 2, 2, 4, 4, 4, 4])
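
The steps above can be put together in one self-contained sketch. It uses hypothetical string keys rather than the question's integers, to show why the factorize step is there: np.bincount only accepts non-negative integers, and factorize turns arbitrary keys into such codes.

```python
import numpy as np
import pandas as pd

# Hypothetical data with string keys; np.bincount could not consume these directly.
df = pd.DataFrame({'A': ['x', 'x', 'y', 'z', 'z']})

# codes: one integer label per row; uniques: distinct keys in order of first appearance
codes, uniques = pd.factorize(df['A'])

# counts[i] is the number of rows whose label is i
counts = np.bincount(codes)

# Indexing counts with the label array repeats each count once per matching row
df['B'] = counts[codes]

print(df)
```

With default settings, factorize labels missing values as -1, which np.bincount rejects, so this approach assumes the key column has no NaN.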



Answer 4:


Using map with groupby size:

df['B'] = df.A.map(df.groupby('A').size())
print(df)

   A  B
0  1  2
1  1  2
2  2  1
3  3  2
4  3  2
5  4  4
6  4  4
7  4  4
8  4  4
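
This works because Series.map accepts another Series as a lookup table: groupby('A').size() is indexed by the unique keys, so each value of A is replaced by its group's size. As a rough sanity check (a sketch, not a benchmark), the four approaches from the answers can be run side by side on the question's data. The variable names and the name='B' argument in the merge variant are choices made here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 2, 3, 3, 4, 4, 4, 4]})

# Series indexed by the unique keys of A, holding the row count per key
sizes = df.groupby('A').size()

via_transform = df.groupby('A')['A'].transform('count')
via_merge = df.merge(sizes.reset_index(name='B'), on='A')['B']
codes = pd.factorize(df['A'])[0]
via_bincount = pd.Series(np.bincount(codes)[codes], index=df.index)
via_map = df['A'].map(sizes)

expected = [2, 2, 1, 2, 2, 4, 4, 4, 4]
results = {
    'transform': via_transform,
    'merge': via_merge,
    'bincount': via_bincount,
    'map': via_map,
}
for name, result in results.items():
    # Every approach should give one count per original row, in the original order
    assert result.tolist() == expected, name

print('all four approaches agree')
```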


Source: https://stackoverflow.com/questions/49889312/pandas-count-elements-in-a-columns-and-show-in-duplicated-way
