Can I create a new column based on when the value changes in another column?

Submitted by 孤者浪人 on 2019-12-01 08:26:29

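You can use the shift-cumsum pattern: compare each value of A with the value in the previous row, then take the cumulative sum of the resulting booleans, so the counter goes up by one every time A changes.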

df['C'] = (df.A != df.A.shift()).cumsum()

>>> df
              DATE_TIME  A  B  C
0  10/08/2016  12:04:56  1  5  1
1  10/08/2016  12:04:58  1  6  1
2  10/08/2016  12:04:59  2  3  2
3  10/08/2016  12:05:00  2  2  2
4  10/08/2016  12:05:01  3  4  3
5  10/08/2016  12:05:02  3  6  3
6  10/08/2016  12:05:03  1  3  4
7  10/08/2016  12:05:04  1  2  4
8  10/08/2016  12:05:05  2  4  5
9  10/08/2016  12:05:06  2  6  5
10 10/08/2016  12:05:07  3  4  6
11 10/08/2016  12:05:08  3  2  6
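To see why the one-liner works, it may help to split it into its three steps. The sketch below rebuilds only the A and B columns from the output above (DATE_TIME is left out for brevity); the intermediate names prev and changed are purely illustrative.

```python
import pandas as pd

# Sample data matching columns A and B of the output above.
df = pd.DataFrame({
    "A": [1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3],
    "B": [5, 6, 3, 2, 4, 6, 3, 2, 4, 6, 4, 2],
})

# Step 1: shift A down by one row. Row 0 has no predecessor, so it becomes NaN.
prev = df["A"].shift()

# Step 2: True wherever A differs from the previous row.
# Row 0 is compared against NaN, so it is always True and opens the first run.
changed = df["A"] != prev

# Step 3: the cumulative sum treats True as 1, so the counter
# increases by one at every change and stays flat inside a run.
df["C"] = changed.cumsum()

print(df)
```

One caveat: NaN never compares equal to NaN, so if A itself contains consecutive missing values, each of them starts a new group under this pattern.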

As a side note, this is a popular pattern for grouping consecutive runs of equal values. For example, to get the average B value of each such run:

df.groupby((df.A != df.A.shift()).cumsum()).B.mean()
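A self-contained version of that groupby, as a rough sketch on the same sample data. The names run_id, means and summary, and the key name "run", are illustrative choices rather than anything from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3],
    "B": [5, 6, 3, 2, 4, 6, 3, 2, 4, 6, 4, 2],
})

# Build the run identifier once and give it a readable name.
run_id = (df["A"] != df["A"].shift()).cumsum().rename("run")

# Mean of B within each consecutive run of equal A values.
means = df.groupby(run_id)["B"].mean()

# Optionally keep the A value of each run next to its mean
# (named aggregation, available in pandas 0.25 and later).
summary = df.groupby(run_id).agg(value=("A", "first"), mean_B=("B", "mean"))

print(means)
print(summary)
```

With this data the six runs have mean B values of 5.5, 2.5, 5.0, 2.5, 5.0 and 3.0. Grouping by A alone would instead merge the two separate runs of each value into a single group, which is why the run identifier is needed.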