Computing np.diff in Pandas after using groupby leads to unexpected result

半阙折子戏 2020-11-30 03:32

I've got a dataframe, and I'm trying to append a column of sequential differences to it. I have found a method that I like a lot (and that generalizes well for my use case).

2 answers

攒了一身酷 (original poster)

2020-11-30 03:37

You can see that the Series .diff() method behaves differently from np.diff():

In [11]: data.value.diff()  # Note the NaN
Out[11]: 
0         NaN
1   -0.410069
2    0.523736
3   -0.114340
4   -0.014955
5   -0.090033
6   -0.125686
7    0.414622
8   -0.319616
Name: value, dtype: float64

In [12]: np.diff(data.value.values)  # the values array of the column
Out[12]: 
array([-0.41006867,  0.52373625, -0.11434009, -0.01495459, -0.09003298,
       -0.12568619,  0.41462233, -0.31961629])

In [13]: np.diff(data.value) # on the column (Series)
Out[13]: 
0   NaN
1     0
2     0
3     0
4     0
5     0
6     0
7     0
8   NaN
Name: value, dtype: float64

In [14]: np.diff(data.value.index)  # er... on the index
Out[14]: Int64Index([8], dtype=int64)

In [15]: np.diff(data.value.index.values)
Out[15]: array([1, 1, 1, 1, 1, 1, 1, 1])
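The zeros and NaNs in Out[13] are consistent with label alignment: np.diff computes a[1:] - a[:-1], and when both operands are Series, pandas matches them by index label rather than by position, so each label is subtracted from itself and the two unmatched end labels become NaN. Here is a minimal sketch of that effect and of two ways around it. The frame, the "group" column and the numbers are made up for illustration, since the original data is not shown in the thread, and newer pandas/NumPy versions may no longer reproduce Out[13] from np.diff(Series) itself.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column names and values are assumptions.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "value": [1.0, 3.0, 6.0, 10.0, 15.0, 21.0],
})
s = df["value"]

# Subtracting two shifted slices of a Series aligns on the index labels,
# so every shared label yields 0 and the unmatched ends yield NaN.
aligned = s.iloc[1:] - s.iloc[:-1]

# Working on the underlying array is purely positional.
positional = np.diff(s.to_numpy())

# Per-group sequential differences, with NaN at the start of each group.
df["diff"] = df.groupby("group")["value"].diff()

print(aligned)
print(positional)
print(df)
```

For appending a difference column after a groupby, the built-in groupby(...).diff() keeps the original index, so the result can be assigned straight back to the frame.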
