How to parallelize this for loop (or make it faster) using pandas or dask

Submitted by 别等时光非礼了梦想 on 2020-03-23 23:56:18

Question


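I want to make this loop significantly faster. For each feature, it calculates the "move in a row" (IAR): a running total of consecutive moves in the same direction, which restarts when the direction flips and resets to 0 on NaN. The function below is applied to a single column; later, I loop over every feature (df.columns) and apply it to each one.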

import numpy as np
import pandas as pd

def move_iar(df, feature):

    lst = []
    prev_move_iar = 0  # running total carried over from the previous row

    for move in df[feature]:
        if np.isnan(move):
            # Missing value: reset the running total
            move_iar = 0
        elif move == 0:
            # No move: carry the previous total forward unchanged
            move_iar = prev_move_iar
        elif (move >= 0 and prev_move_iar >= 0) or (move <= 0 and prev_move_iar <= 0):
            # Same direction as the current run: extend it
            move_iar = move + prev_move_iar
        else:
            # Direction flipped: start a new run from this move
            move_iar = move
        lst.append(move_iar)
        prev_move_iar = move_iar

    return pd.DataFrame(lst, index=df.index, columns=[feature]).rename(columns={feature : feature + 'IAR'})
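One possible way to avoid the Python-level loop is to label each run of same-direction moves and take a cumulative sum inside each run. The sketch below is not from the original post: the names `move_iar_vectorized` and `move_iar_all` are hypothetical, and it assumes the columns are numeric and that the intended behavior is exactly what the loop above does (NaN resets to 0, a zero move carries the previous total forward).

```python
import numpy as np
import pandas as pd


def move_iar_vectorized(s):
    """Loop-free sketch of move_iar for one column (hypothetical helper)."""
    s = s.astype(float)
    sign = np.sign(s)  # -1.0, 0.0, 1.0, or NaN

    # Direction of the run each row belongs to:
    #   nonzero move -> its own sign
    #   zero move    -> inherits the last direction (filled forward)
    #   NaN          -> 0, which acts as a reset marker
    direction = sign.mask(sign == 0)             # zeros become NaN so ffill can fill them
    direction = direction.mask(s.isna(), 0.0)    # original NaNs become the reset marker
    direction = direction.ffill().fillna(0.0)    # leading zeros have nothing to inherit

    # A new run starts whenever the direction changes from the previous row.
    run_id = (direction != direction.shift()).cumsum()

    # Running total inside each run; NaN contributes 0.
    return s.fillna(0.0).groupby(run_id.to_numpy()).cumsum()


def move_iar_all(df):
    """Apply the sketch to every column and append 'IAR' to the column names."""
    return df.apply(move_iar_vectorized).add_suffix("IAR")


# Small illustrative frame (made-up values).
example = pd.DataFrame({"a": [np.nan, 1, 2, 0, -1, -3, 0, np.nan, 0, 4, -2, 2, 2]})
result = move_iar_all(example)
```

Because every column is processed independently, `move_iar_all` could also be split across columns with Dask or another parallel tool, though the vectorized form may make that unnecessary. It is worth comparing its output with the original loop on real data before relying on it, since floating-point sums can differ in the last digits.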

Source: https://stackoverflow.com/questions/58192507/how-to-parallelize-this-for-loop-or-make-it-faster-using-pandas-or-dask
