Question
Given this dataframe:
import pandas as pd
a=pd.DataFrame({'number':[2,2,3],'A':['abc','def','ghi']})
a
     A  number
0  abc       2
1  def       2
2  ghi       3
I need to concatenate values, in order of index, from rows with the same number value, separated by '; '.
Desired result:
          A number
0  abc; def   2; 2
2       ghi      3
So far, I thought I could isolate the dataframes and then somehow try to join them together like this:
a['rank']=a.groupby('number').rank()
a1=a.loc[a['rank']==1]
a2=a.loc[a['rank']==2]
b=a1.merge(a2,on='number',how='left')
b=b.fillna('')
b
   A_x  number  rank_x  A_y rank_y
0  abc       2     1.0  def      2
1  ghi       3     1.0
...and then it's just a matter of something like this per column:
b['A'] = b['A_x']+'; '+b['A_y']
...but is there a more concise way to do this (perhaps for all columns at once)?
Thanks in advance!
Answer 1:
Use groupby + agg, casting to str first so that join also works on the numeric column:
a.astype(str).groupby(a.number, as_index=False).agg('; '.join)
          A number
0  abc; def   2; 2
1       ghi      3
Thanks to MaxU for the tune-up!
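For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same groupby + agg idea. It is a variation, not the exact one-liner above: the grouping key is renamed to `group_key` (a hypothetical name, chosen only so it cannot collide with the `number` column) and the group index is dropped with `reset_index(drop=True)` instead of relying on `as_index=False`.

```python
import pandas as pd

a = pd.DataFrame({'number': [2, 2, 3], 'A': ['abc', 'def', 'ghi']})

# Group on the original integer column; 'group_key' is a hypothetical name
# chosen so the grouping key cannot collide with the 'number' column.
keys = a['number'].rename('group_key')

result = (
    a.astype(str)                 # cast every column so str.join accepts the values
     .groupby(keys, sort=False)   # sort=False keeps groups in order of first appearance
     .agg('; '.join)              # rows inside a group keep their original order
     .reset_index(drop=True)      # discard the group key, leaving a 0..n-1 index
)
print(result)
```

Because every column is cast to str before aggregating, the `number` column in the result holds strings such as `'2; 2'`, not integers.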
Answer 2:
You need a helper column to group on (so that number itself still gets aggregated), then groupby + agg + join:
a.assign(number2=a.number).groupby('number2').agg(lambda x : ';'.join(x.astype(str)))
Out[238]:
               A number
number2
2        abc;def    2;2
3            ghi      3
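Note that the snippet above joins with ';' while the question asks for '; '. A minimal runnable sketch of the same helper-column approach with the question's separator might look like this (the helper column name `number2` is taken from the answer; the variable name `out` is just for illustration):

```python
import pandas as pd

a = pd.DataFrame({'number': [2, 2, 3], 'A': ['abc', 'def', 'ghi']})

out = (
    a.assign(number2=a['number'])   # helper copy of the key, used only for grouping
     .groupby('number2')
     .agg(lambda col: '; '.join(col.astype(str)))  # cast per column, then join
)
print(out)
```

Here the helper column becomes the index of the result, so `out.index` holds the original integer keys while both `A` and `number` contain the joined strings.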
Source: https://stackoverflow.com/questions/48159340/pandas-concatenate-values-from-different-rows