Number of unique pairs within one column - pandas

Submitted by 試著忘記壹切 on 2019-12-07 18:11:26

Question


I am having trouble producing statistics for my DataFrame in pandas. My DataFrame looks like this (index omitted):

id    type  
1      A
2      B
3      A
1      B
3      B
2      C
4      B
4      C

Importantly, each id has exactly two type values assigned, as the example above shows. I want to count the occurrences of every type combination (that is, the number of unique ids with a given combination of types), so the result should be a DataFrame like this:

type    count
A, B      2
A, C      0
B, C      2

I tried using groupby in many ways, but in vain. I can produce this count with a for loop and several lines of code, but I believe there must be a more elegant, idiomatic Python solution to this problem.

Thanks in advance for any hints.


Answer 1:


Using GroupBy + apply with value_counts:

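from itertools import combinations

import pandas as pd

def combs(types):
    # All unordered pairs of the (sorted) types belonging to one id
    return pd.Series(list(combinations(sorted(types), 2)))

# One row per pair per id, then count how often each pair occurs
res = df.groupby('id')['type'].apply(combs).value_counts()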

print(res)

(A, B)    2
(B, C)    2
Name: type, dtype: int64
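
The output above omits the (A, C) row with a count of 0 that the question asks for, because value_counts only reports pairs that actually occur. A minimal sketch of one way to include the missing pairs, assuming the sample data from the question and using collections.Counter in place of value_counts (the names counts, all_pairs and res are illustrative, not from the original answers):

```python
from collections import Counter
from itertools import combinations

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id":   [1, 2, 3, 1, 3, 2, 4, 4],
    "type": ["A", "B", "A", "B", "B", "C", "B", "C"],
})

# Count each unordered pair of types once per id
counts = Counter()
for _, types in df.groupby("id")["type"]:
    counts.update(combinations(sorted(set(types)), 2))

# Every pair that could occur, including those never observed
all_pairs = list(combinations(sorted(df["type"].unique()), 2))

# A Counter returns 0 for missing keys, which fills in the unobserved pairs
res = pd.DataFrame({
    "type": [", ".join(pair) for pair in all_pairs],
    "count": [counts[pair] for pair in all_pairs],
})
print(res)
```

Building the labels as plain strings sidesteps any question of how pandas treats an index made of tuples, at the cost of losing the tuple structure.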



Answer 2:


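Series.value_counts and itertools.combinations

from itertools import combinations

import pandas as pd

# pd.value_counts is deprecated since pandas 2.1, so build a Series first.
# Sorting each group's types makes the pairs independent of row order.
pd.Series(
    [(x, y) for _, d in df.groupby('id') for x, y in combinations(sorted(d['type']), 2)]
).value_counts()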

(A, B)    2
(B, C)    2
dtype: int64



Answer 3:


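Using Counter, groupby and the default DataFrame constructor

from collections import Counter

import pandas as pd

# Each id's types become one tuple, so this relies on a consistent row order within each id
pd.DataFrame(Counter([tuple(v.type.values) for _, v in df.groupby('id')]), index=['Count']).T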

        Count
A   B   2
B   C   2



Answer 4:


Alternatively, use unique. Note that this only works when each id has exactly two unique type values.

df.groupby('id').type.unique().apply(tuple).value_counts()
Out[202]: 
(A, B)    2
(B, C)    2
Name: type, dtype: int64
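
To illustrate the caveat, here is a small sketch on hypothetical data (df3 is not from the question): an id with three types yields a 3-tuple rather than pairs, and because unique keeps the order of appearance, the same pair can show up in two different orders unless it is sorted first.

```python
import pandas as pd

# Hypothetical data: id 1 has three types; ids 2 and 3 hold the same pair in different row order
df3 = pd.DataFrame({
    "id":   [1, 1, 1, 2, 2, 3, 3],
    "type": ["A", "B", "C", "B", "A", "A", "B"],
})

# unique() preserves order of appearance, so tuples can differ for the same pair
raw = df3.groupby("id")["type"].unique().apply(tuple)

# Sorting inside each group gives one canonical tuple per set of types
normalized = df3.groupby("id")["type"].unique().apply(lambda a: tuple(sorted(a)))

print(raw)
print(normalized)
```

When ids may carry more than two types, the combinations-based answers are the safer choice, since they split each group into pairs.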


Source: https://stackoverflow.com/questions/53159144/number-of-unique-pairs-within-one-column-pandas
