Question
I have two pandas DataFrames that look like this.
Data Frame 1:
Section Chainage Frame
R125R002 10.133 1
R125R002 10.138 2
R125R002 10.143 3
R125R002 10.148 4
R125R002 10.153 5
Data Frame 2:
Section Chainage 1 2 3 4 5 6 7 8
R125R002 10.133 0 0 1 0 0 0 0 0
R125R002 10.134 0 0 1 0 0 0 0 0
R125R002 10.135 0 0 1 0 0 0 0 0
R125R002 10.136 0 0 1 0 0 0 0 0
R125R002 10.137 0 0 1 0 0 0 0 0
R125R002 10.138 0 0 1 0 0 0 0 0
R125R002 10.139 0 0 1 0 0 0 0 0
R125R002 10.14 0 0 1 0 0 0 0 0
R125R002 10.141 0 0 1 0 0 0 0 0
R125R002 10.142 0 0 1 0 0 0 0 0
R125R002 10.143 0 0 1 0 0 0 0 0
R125R002 10.144 0 0 1 0 0 0 0 0
R125R002 10.145 0 0 1 0 0 0 0 0
R125R002 10.146 0 0 1 0 0 0 0 0
R125R002 10.147 0 0 1 0 0 0 0 0
R125R002 10.148 0 0 1 0 0 0 0 0
R125R002 10.149 0 0 1 0 0 0 0 0
R125R002 10.15 0 0 1 0 0 0 0 0
R125R002 10.151 0 0 1 0 0 0 0 0
R125R002 10.152 0 0 1 0 0 0 0 0
R125R002 10.153 0 0 1 0 0 0 0 0
Required output DataFrame:
Section Chainage Frame 1 2 3 4 5 6 7 8
R125R002 10.133 1 0 0 1 0 0 0 0 0
R125R002 10.138 2 0 0 1 0 0 0 0 0
R125R002 10.143 3 0 0 1 0 0 0 0 0
R125R002 10.148 4 0 0 1 0 0 0 0 0
R125R002 10.153 5 0 0 1 0 0 0 0 0
Data Frame 2 has rows at 1 m intervals, while Data Frame 1 has rows at 5 m intervals. I would like to merge Data Frame 2 into Data Frame 1 and apply a group-by. The aggregation for column 1 is sum, for column 2 max, and for columns 3 to 8 the average.
In SQL, I would join the two frames on Section, apply a BETWEEN condition on the chainage, and then add a GROUP BY.
Is there any way to achieve this in pandas?
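Answer 1:
You can first aggregate every 5 rows, defining the aggregation functions in a dictionary (only columns 1, 2 and 8 are shown here; the remaining columns follow the same pattern):
import numpy as np
d = {'Section':'first','Chainage':'first','1':'sum','2':'max', '8':'mean'}
df22 = df2.groupby([np.arange(len(df2.index)) // 5], as_index=False).agg(d)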
print (df22)
Section Chainage 1 2 8
0 R125R002 10.133 0 0 0
1 R125R002 10.138 0 0 0
2 R125R002 10.143 0 0 0
3 R125R002 10.148 0 0 0
4 R125R002 10.153 0 0 0
Detail:
print (np.arange(len(df2.index)) // 5)
[0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4]
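To cover every column the question mentions, the dictionary can be built programmatically. The following is only a sketch: the toy `df2` is rebuilt from the sample data in the question, the column labels are assumed to be the strings '1' to '8', and `agg_spec` and `block` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Toy stand-in for Data Frame 2: 21 rows at 1 m spacing (values taken from the question).
chainage = np.round(10.133 + 0.001 * np.arange(21), 3)
df2 = pd.DataFrame({"Section": "R125R002", "Chainage": chainage})
for col in "12345678":
    df2[col] = 1 if col == "3" else 0

# Column '1' -> sum, '2' -> max, '3'..'8' -> mean, as the question asks.
agg_spec = {"Section": "first", "Chainage": "first", "1": "sum", "2": "max"}
agg_spec.update({str(i): "mean" for i in range(3, 9)})

# Every block of 5 consecutive rows shares one group label.
block = np.arange(len(df2)) // 5
df22 = df2.groupby(block).agg(agg_spec).reset_index(drop=True)
print(df22)
```

Note that this positional grouping assumes the rows are sorted and that each 5 m step contains exactly five 1 m rows; the last group here holds only the single row at 10.153.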
Then merge the aggregated result with the first DataFrame:
df = df1.merge(df22, on=['Section','Chainage'])
print (df)
Section Chainage Frame 1 2 8
0 R125R002 10.133 1 0 0 0
1 R125R002 10.138 2 0 0 0
2 R125R002 10.143 3 0 0 0
3 R125R002 10.148 4 0 0 0
4 R125R002 10.153 5 0 0 0
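If the rows do not fall into exact blocks of five, or if matching float chainages exactly is a concern, something closer to the SQL BETWEEN join from the question may be possible with `pd.merge_asof`. This is a different technique from the answer above, shown only as a sketch: it assumes Chainage is in kilometres (so 0.001 equals 1 m), and the helper column `m` and the names `tagged`, `agg_spec` and `out` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Toy data rebuilt from the question.
df1 = pd.DataFrame({
    "Section": "R125R002",
    "Chainage": [10.133, 10.138, 10.143, 10.148, 10.153],
    "Frame": [1, 2, 3, 4, 5],
})
chainage = np.round(10.133 + 0.001 * np.arange(21), 3)
df2 = pd.DataFrame({"Section": "R125R002", "Chainage": chainage})
for col in "12345678":
    df2[col] = 1 if col == "3" else 0

# Integer metre keys avoid comparing floats for equality (assumes Chainage is in km).
df1["m"] = (df1["Chainage"] * 1000).round().astype("int64")
df2["m"] = (df2["Chainage"] * 1000).round().astype("int64")

# Tag each 1 m row with the latest frame that starts at or before it.
tagged = pd.merge_asof(
    df2.sort_values("m").drop(columns="Chainage"),
    df1.sort_values("m"),
    on="m", by="Section", direction="backward",
)

# Aggregate per frame: '1' -> sum, '2' -> max, '3'..'8' -> mean.
agg_spec = {"1": "sum", "2": "max", **{str(i): "mean" for i in range(3, 9)}}
out = (tagged.groupby(["Section", "Chainage", "Frame"], as_index=False)
             .agg(agg_spec))
print(out)
```

Rows of `df2` that precede the first frame of their section would get no match and drop out of the group-by, which may or may not be the desired behaviour.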
Source: https://stackoverflow.com/questions/50580408/pandas-merge-and-grouby