Question
I have a dataset like this:
>>> import pandas as pd
>>> df = pd.DataFrame({'id_sin': ['s123', 's123', 's124', 's124'],
...                    'raison': ['first problem', 'second problem', 'album', 'dog']})
>>> df
  id_sin          raison
0   s123   first problem
1   s123  second problem
2   s124           album
3   s124             dog
This is the expected output:
  id_sin                         raison
0   s123  first problem, second problem
1   s124                     album, dog
What I tried:
df['raison'] = df.groupby('id_sin')['raison'].apply(lambda x: ', '.join(x))
But it doesn't work. What am I missing? Thanks for the help!
Answer 1:
Try using agg:
df.groupby('id_sin')['raison'].agg(', '.join).reset_index()
Output:
  id_sin                         raison
0   s123  first problem, second problem
1   s124                     album, dog
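
If this is needed in more than one place, the same agg call can be wrapped in a small function. The following is only a sketch: the name join_by_group and its parameters are hypothetical, and the astype(str) conversion is an added assumption so that non-string values do not break the join.

```python
import pandas as pd


def join_by_group(frame, key, column, sep=', '):
    """Hypothetical helper: one row per `key`, with the `column` values joined by `sep`."""
    # astype(str) is an assumption added here; the original answer joins the strings directly.
    joined = frame.groupby(key)[column].agg(lambda s: sep.join(s.astype(str)))
    # reset_index turns the group key back into a regular column.
    return joined.reset_index()


df = pd.DataFrame({
    'id_sin': ['s123', 's123', 's124', 's124'],
    'raison': ['first problem', 'second problem', 'album', 'dog'],
})

result = join_by_group(df, 'id_sin', 'raison')
print(result)
```

The result is a new DataFrame with one row per id_sin, so it should be stored in its own variable rather than assigned back into a column of df.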
Answer 2:
Try changing the groups to lists:
df.groupby(['id_sin']).raison.apply(lambda x: ', '.join(list(x)))
After testing your code, it turns out that you should not assign the result back with df['raison'] = ..., because df.groupby('id_sin')['raison'].apply(lambda x: ', '.join(x)) has length 2 and is indexed by id_sin, whereas df has length 4 and a different index.
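
To make the mismatch concrete, here is a small sketch using the data from the question. The variable names joined, broken and fixed are only illustrative. Column assignment aligns on index labels, and none of the labels in the grouped result match the row labels of df, so the assigned column comes out empty.

```python
import pandas as pd

df = pd.DataFrame({
    'id_sin': ['s123', 's123', 's124', 's124'],
    'raison': ['first problem', 'second problem', 'album', 'dog'],
})

joined = df.groupby('id_sin')['raison'].apply(lambda x: ', '.join(x))

# The grouped result has one entry per group, labelled by id_sin.
print(joined.index.tolist())   # ['s123', 's124']
print(df.index.tolist())       # [0, 1, 2, 3]

# Assigning it to a column aligns on labels; nothing matches, so every row is NaN.
broken = df.copy()
broken['raison'] = joined
print(broken['raison'].isna().all())   # True

# Keeping the grouped result as its own DataFrame avoids the alignment problem.
fixed = joined.reset_index()
print(fixed)
```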
Source: https://stackoverflow.com/questions/55697144/join-groupby-column-with-a-comma-in-a-pandas-dataframe