Reference values in the previous row with map or apply


别那么骄傲 2021-01-23 09:12

Given a dataframe df, I would like to generate a new variable/column for each row based on the values in the previous row. df is sorted so that the ord

2 answers

谎友^ (original poster)

2021-01-23 09:49
You can use the DataFrame 'apply' method and wrap the row function in a decorator that stores the previous row. The wrapper accepts 'kwargs' but never uses them; the previous row is kept in the decorator's closure.
```
import pandas as pd

df = pd.DataFrame({'a':[0,1,2], 'b':[0,10,20]})

new_col = 'c'

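def apply_func_decorator(func):
    # Shared across calls: holds the previous row's values plus its computed result.
    prev_row = {}
    def wrapper(curr_row, **kwargs):
        val = func(curr_row, prev_row)
        # Remember this row (and its new value) for the next call.
        prev_row.update(curr_row)
        prev_row[new_col] = val
        return val
    return wrapper

@apply_func_decorator
def running_total(curr_row, prev_row):
    # prev_row is empty on the first row, so fall back to 0.
    return curr_row['a'] + curr_row['b'] + prev_row.get(new_col, 0)

df[new_col] = df.apply(running_total, axis=1)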

print(df)
# Output will be:
#    a   b   c
# 0  0   0   0
# 1  1  10  11
# 2  2  20  33
```
This example uses a decorator to store the previous row in a dictionary and then pass it to the function when Pandas calls it on the next row.
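To see the closure mechanism on its own, here is a minimal sketch without pandas. The names `remember_previous` and `diff_from_previous` are made up for illustration and are not part of the answer's code.

```python
def remember_previous(func):
    """Wrap func so it also receives the value seen on the previous call."""
    state = {}  # lives as long as the wrapper does

    def wrapper(current):
        # state.get returns None on the very first call
        result = func(current, state.get("prev"))
        state["prev"] = current
        return result

    return wrapper


@remember_previous
def diff_from_previous(current, previous):
    # No previous value yet on the first call, so report 0
    return 0 if previous is None else current - previous


diffs = [diff_from_previous(x) for x in [3, 5, 10]]
print(diffs)  # [0, 2, 5]
```

One consequence of this design is that the stored state is never reset. If the decorated function is reused for a second pass, it starts from the last value of the first pass, so it is probably safest to decorate a fresh function for each pass.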

Disclaimer 1: The 'prev_row' variable starts off empty for the first row so when using it in the apply function I had to supply a default value to avoid a 'KeyError'.

Disclaimer 2: I am fairly certain this will be slower than a plain apply operation, but I did not run any tests to measure by how much.
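If speed matters, a vectorized alternative may be worth trying. This is a different technique from the decorator approach, sketched here on the same sample data: `cumsum` reproduces the running total, and `shift` exposes the previous row's value directly. The column name `prev_b` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 1, 2], 'b': [0, 10, 20]})

# Running total of a + b, equivalent to the 'c' column computed with apply
df['c'] = (df['a'] + df['b']).cumsum()

# shift(1) moves every value down one row, so each row sees the previous
# row's 'b'; fill_value supplies the default for the first row
df['prev_b'] = df['b'].shift(1, fill_value=0)

print(df)
```

This only works when the new value can be expressed with column-wise operations. When each row depends on the previously computed result in a way that `cumsum` and similar methods cannot express, a row-by-row approach like the one above is still needed.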