Writing a function to get the sums of columns C/D the last time columns A/B are a specific value?

Submitted by 戏子无情 on 2020-05-17 08:45:28

Question


I have a datasheet with sports results. The rows look something like this, where column A is the home team, column B is the away team, column C is the home score, column D is the away score, and column E is the final result. There is also a date column, which I've left out here for simplicity.

PIT   PHI   4   5   Away
PIT   BOS   3   5   Away
BOS   SJS   3   2   Home
SJS   PHI   1   1   Draw
PIT   SJS   3   2   Home
PHI   BOS   4   3   Home

What I would like to do is add two columns to this dataframe. The first should have the sum of goals scored for the home team in their last 3 games (all games, not just home games) - but not including the results from the current row. The second should have the sum of goals scored for the away team in their last 3 games (all games, not just away games) - but not including the results from the current row.

So let's say the next row in this sheet has BOS as the home team and PIT as the away team. In their 3 most recent games PRIOR to this one, BOS have scored 11 goals (5 + 3 + 3). In their 3 most recent games PRIOR to this one, PIT have scored 10 goals (4 + 3 + 3). So assuming the game finishes 5-5 (or whatever the result happens to be), the row should look like this with the two added columns.

BOS   PIT   5   5   Draw   11   10

There are a couple things making this difficult for me.

In finding the last 3 times a value (let's say "BOS") appears in the dataframe, I don't know how to make it clear that it can be in either column A or B. And I also don't know how to specify that the value should be added from column C when BOS is in column A, and from column D when BOS is in column B.
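To make the "either column" part concrete, here is a minimal sketch, not a definitive solution. It assumes the data is in a pandas DataFrame whose rows are already in chronological order, and the column names (`home`, `away`, `home_score`, `away_score`, `result`) and the helper `goals_in_last_games` are made up for illustration. The filter matches the team in either column, and `Series.where` picks the home score when the team was at home and the away score otherwise.

```python
import pandas as pd

# Sample rows from the question; column names are assumptions for this sketch.
df = pd.DataFrame(
    [
        ("PIT", "PHI", 4, 5, "Away"),
        ("PIT", "BOS", 3, 5, "Away"),
        ("BOS", "SJS", 3, 2, "Home"),
        ("SJS", "PHI", 1, 1, "Draw"),
        ("PIT", "SJS", 3, 2, "Home"),
        ("PHI", "BOS", 4, 3, "Home"),
    ],
    columns=["home", "away", "home_score", "away_score", "result"],
)


def goals_in_last_games(frame, team, n=3):
    """Sum the goals `team` scored in its last `n` rows of `frame`, home or away."""
    # Keep rows where the team appears in either team column, then the last n of them.
    played = frame[(frame["home"] == team) | (frame["away"] == team)].tail(n)
    # Use home_score where the team was at home, away_score where it was away.
    goals = played["home_score"].where(played["home"] == team, played["away_score"])
    return int(goals.sum())


print(goals_in_last_games(df, "BOS"))  # 11
print(goals_in_last_games(df, "PIT"))  # 10
```

Calling this once per row on only the rows above it would give the desired columns, but that is slow on a large sheet, so it is mainly useful for checking results by hand.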

I want to do this WITHOUT transposing the dataset so that every team has their own line. I.e., I DO NOT want:

BOS   5   Draw   11
PIT   5   Draw   10

The original dataset needs its formatting kept.

Finally, I'm also not clear on how I'd get this added to rows while NOT including the current row in the sum. Is that just using shift() somehow?
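One possible way to use `shift()` here is sketched below, under some assumptions: the rows are in chronological order, the column names are invented for the example, and a team with fewer than 3 earlier games gets the sum of whatever earlier games exist (`min_periods=1`). A one-row-per-team table is built only as a temporary helper; the original DataFrame keeps its layout and just gains two columns. Within each team, `shift(1)` pushes every score down one game so the current row is excluded, and `rolling(3).sum()` then adds up the three games before it.

```python
import pandas as pd

# Sample rows from the question plus the hypothetical BOS v PIT game.
df = pd.DataFrame(
    [
        ("PIT", "PHI", 4, 5, "Away"),
        ("PIT", "BOS", 3, 5, "Away"),
        ("BOS", "SJS", 3, 2, "Home"),
        ("SJS", "PHI", 1, 1, "Draw"),
        ("PIT", "SJS", 3, 2, "Home"),
        ("PHI", "BOS", 4, 3, "Home"),
        ("BOS", "PIT", 5, 5, "Draw"),
    ],
    columns=["home", "away", "home_score", "away_score", "result"],
)

# Temporary helper table: one row per (game, team), remembering the game index.
home = df[["home", "home_score"]].rename(
    columns={"home": "team", "home_score": "goals"}
).assign(side="home")
away = df[["away", "away_score"]].rename(
    columns={"away": "team", "away_score": "goals"}
).assign(side="away")
long_df = pd.concat([home, away]).rename_axis("game").reset_index()
long_df = long_df.sort_values("game", kind="stable")

# Per team: shift(1) drops the current game, rolling(3) sums the 3 games before it.
long_df["last3"] = long_df.groupby("team")["goals"].transform(
    lambda s: s.shift(1).rolling(3, min_periods=1).sum()
)

# Bring the result back to one row per game and attach it to the original frame.
wide = long_df.pivot(index="game", columns="side", values="last3")
df["home_last3"] = wide["home"]
df["away_last3"] = wide["away"]

print(df)
```

For the last row this gives 11 for BOS and 10 for PIT, matching the example above. Rows where a team has no earlier game come out as NaN; using `min_periods=3` instead would leave NaN until a team has three earlier games.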

Thanks a lot in advance.

Source: https://stackoverflow.com/questions/61416935/writing-a-function-to-get-the-sums-of-columns-c-d-the-last-time-columns-a-b-are
