Sorting Pandas dataframe data within Groupby groups

半腔热情 提交于 2021-01-28 15:31:28

问题


I have a large pandas dataframe that can be represented structurally as:

      id          date    status
0     12    2015-05-01         0
1     12    2015-05-22         1
2     12    2015-05-14         1
3     12    2015-05-06         0
4     45    2015-05-03         1
5     45    2015-05-12         1
6     45    2015-05-02         0
7     51    2015-05-05         1
8     51    2015-05-01         0
9     51    2015-05-23         1
10    51    2015-05-17         1
11    51    2015-05-03         0
12    51    2015-05-05         0
13    76    2015-05-04         1
14    76    2015-05-22         1
15    76    2015-05-08         0

And can be created in Python 3.4 using:

tempDF = pd.DataFrame({ 'id': [12,12,12,12,45,45,45,51,51,51,51,51,51,76,76,76],
                        'date': ['2015-05-01','2015-05-22','2015-05-14','2015-05-06','2015-05-03','2015-05-12','2015-05-02','2015-05-05','2015-05-01','2015-05-23','2015-05-17','2015-05-03','2015-05-05','2015-05-04','2015-05-22','2015-05-08'],
                        'status': [0,1,1,0,1,1,0,1,0,1,1,0,0,1,1,0]})
tempDF['date'] = pd.to_datetime(tempDF['date'])

I would like to divide the dataframe into groups based on variable 'id', sort within groups based on 'date' and then get the last 'status' value within each group.

So far, I have:

tempGrouped = tempDF.groupby('id')
tempGrouped['status'].last()

which produces:

id
12    0
45    0
51    0
76    0

However, the status should be 1 in each case (the value associated with the latest date). I can't work out how to sort the groups by date before selecting the last value. It's likely I'm a little snow-blind after trying to work this out for a while, so I apologise in advance if the solution is obvious.


回答1:


you can sort and group like this :

tempDF.sort(['id','date']).groupby('id')['status'].last()



回答2:


you can sort and group like this: (fix the sort method above)

tempDF.sort_values(['id','date']).groupby('id')['status'].last()


来源:https://stackoverflow.com/questions/31121344/sorting-pandas-dataframe-data-within-groupby-groups

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!