Printing the top 2 most frequent values of the target column

一个人想着一个人 submitted on 2019-12-10 22:15:38

Question


I have three columns, as shown below, and I am trying to return the most frequent and second most frequent values of the third column (rating). I want the output to look like the expected output shown further down.

DATA :

print (df)

   AGE GENDER rating
0   10      M     PG
1   10      M      R
2   10      M      R
3    4      F   PG13
4    4      F   PG13

CODE :

s = (df.groupby(['AGE', 'GENDER'])['rating']
       .apply(lambda x: x.value_counts().head(2))
       .rename_axis(('a','b', 'c'))
       .reset_index(level=2)['c'])

output :

print (s)

a   b
4   F    PG13
10  M       R
    M      PG
Name: c, dtype: object

EXPECTED OUTPUT :

print (s['F'])
('PG13')

print (s['M'])

('R', 'PG')

Answer 1:


I think you need:

s = (df.groupby(['AGE', 'GENDER'])['rating']
       .apply(lambda x: x.value_counts().head(2))
       .rename_axis(('a','b', 'c'))
       .reset_index()
       .groupby('b')['c']
       .apply(list)
       .to_dict()
       )
print (s)
{'M': ['R', 'PG'], 'F': ['PG13']}
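
For readers who want to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample DataFrame from the question and collects the top two ratings per gender. Note the assumptions: `top_ratings` is a hypothetical helper name, not part of pandas, and it groups by GENDER only rather than by AGE and GENDER first as the code above does. The two approaches agree on this sample because each gender appears with a single age; on other data they can differ.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "AGE": [10, 10, 10, 4, 4],
    "GENDER": ["M", "M", "M", "F", "F"],
    "rating": ["PG", "R", "R", "PG13", "PG13"],
})


def top_ratings(frame, n=2):
    """Hypothetical helper: map each gender to its n most frequent ratings."""
    return {
        # value_counts() sorts by count in descending order, so the first
        # n index labels are the n most frequent ratings in the group.
        gender: group["rating"].value_counts().head(n).index.tolist()
        for gender, group in frame.groupby("GENDER")
    }


result = top_ratings(df)
print(result)
# {'F': ['PG13'], 'M': ['R', 'PG']}
```

Individual groups can then be looked up with `result['F']` or `result['M']`, which is close to what the expected output in the question asks for.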


Source: https://stackoverflow.com/questions/48768632/printing-the-top-2-of-frequently-occurred-values-of-the-target-column
