how to shift columns in pandas DataFrame dynamically and independently?

狂风中的少年 提交于 2021-02-11 07:06:54

问题


What I wish to achieve is to be able to normalise my values so that they all start on a arbitrary day-zero. Imagine the DataFrame below:

df = pd.DataFrame(np.array([[1, 0, 0], [4, 5, 0], [7, 8, 9], [7, 8, 9], [4, 5, 0], [7, 8, 9]]), 
              columns=['a', 'b', 'c'], 
              index = ['1/1/2000', '1/1/2001', '1/1/2002', '1/1/2003', '1/1/2004', '1/1/2005'])

Which results in the following:

            a   b   c
1/1/2000    1   0   0
1/1/2001    4   5   0
1/1/2002    7   8   9
1/1/2003    7   8   9
1/1/2004    4   5   0
1/1/2005    7   8   9    

What I am trying to achieve is the find the first non-zero value in each column, and shift the values in that column to the index zero. so the final dataFrame looks similar to this:

            a   b   c
1/1/2000    1   5   9
1/1/2001    4   8   9
1/1/2002    7   8   0
1/1/2003    7   5   9
1/1/2004    4   8   0
1/1/2005    7   0   0 

I tried to use a shift, and I can use something like this:

df.b = df.b.shift(periods = -1, fill_value = 0)

but currently I do not know any pandas-friendly way to do this without using for loops, etc.

Appreciate if some can help on the best to find the first non-zero value and then shifting the values correctly to index zero.


回答1:


First get number of shifts by compare for not equal by DataFrame.ne, then add cumulative sum by DataFrame.cumsum, compare 0 and last count Trues by sum. Then use DataFrame.apply with lambda function and for shifts select values of Series:

s = df.ne(0).cumsum().eq(0).sum()
df = df.apply(lambda x: x.shift(periods = -s[x.name], fill_value = 0))
print (df)
          a  b  c
1/1/2000  1  5  9
1/1/2001  4  8  9
1/1/2002  7  8  0
1/1/2003  7  5  9
1/1/2004  4  8  0
1/1/2005  7  0  0


来源:https://stackoverflow.com/questions/60558581/how-to-shift-columns-in-pandas-dataframe-dynamically-and-independently

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!