Unique values within Pandas group of groups

Submitted by 北战南征 on 2019-12-12 10:53:41

Question


I have a dataframe that I need to group, then subgroup. From the subgroups I need to return what the subgroup is as well as the unique values for a column.

import pandas

df = pandas.DataFrame({'country': pandas.Series(['US', 'Canada', 'US', 'US']),
                       'gender': pandas.Series(['male', 'female', 'male', 'female']),
                       'industry': pandas.Series(['real estate', 'shipping', 'telecom', 'real estate']),
                       'income': pandas.Series([1, 2, 3, 4])})

def subgroup(g):
    return g.groupby(['gender'])

s = df.groupby(['country']).apply(subgroup)

From s, how can I compute the uniques of "industry" as well as which "gender" it's grouped for?

--------------------------------------------
| US     | male   | [real estate, telecom] |
|        |----------------------------------
|        | female | [real estate]          |
--------------------------------------------
| Canada | female | [shipping]             |
--------------------------------------------

Answer 1:


You don't need to define that function. Grouping by both columns at once with groupby() and then calling unique() on the "industry" column is enough.

Try:

df.groupby(['country','gender'])['industry'].unique()

Output:

country  gender
Canada   female                [shipping]
US       female             [real estate]
         male      [real estate, telecom]
Name: industry, dtype: object
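
The result above is a Series indexed by (country, gender), so the "which gender it's grouped for" part of the question is carried by the index itself. A minimal, self-contained sketch of how that result might be consumed follows; the variable names uniques, us_male and flat are illustrative only and do not come from the original post, and the data is written as plain lists rather than pandas.Series for brevity.

```python
import pandas

df = pandas.DataFrame({
    'country': ['US', 'Canada', 'US', 'US'],
    'gender': ['male', 'female', 'male', 'female'],
    'industry': ['real estate', 'shipping', 'telecom', 'real estate'],
    'income': [1, 2, 3, 4],
})

# One entry per (country, gender) pair; each value holds the unique industries.
uniques = df.groupby(['country', 'gender'])['industry'].unique()

# Look up a single subgroup by its (country, gender) key.
us_male = list(uniques.loc[('US', 'male')])

# Turn the two index levels into ordinary columns if a flat table is easier to use.
flat = uniques.reset_index()

# Walk every subgroup; the index key supplies the country and the gender.
for (country, gender), industries in uniques.items():
    print(country, gender, list(industries))
```

Grouping by both keys in one call avoids nesting one groupby inside another through apply(), which is what the subgroup() function in the question attempts.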

Hope it helps!



Source: https://stackoverflow.com/questions/41880388/unique-values-within-pandas-group-of-groups
