Pandas - Find and index rows that match row sequence pattern

半阙折子戏 2020-11-27 21:26

I would like to find a pattern in a categorical variable of a dataframe, going down the rows. I can see how to use Series.shift() to look up / down and use boolean logic to fi

5 answers
  •  挽巷 (original poster)
    2020-11-27 22:19

    You can do this by defining a custom aggregate function, using it in a groupby call, and then merging the result back into the original dataframe. Something like this:

    Aggregate function:

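    def pattern_detect(column):
        # define any other pattern to detect here
        p0, p1, p2, p3 = 1, 2, 2, 0
        # True at each row where the pattern starts; shift(-k) looks k rows ahead
        matched = (
            column.eq(p0)
            & column.shift(-1).eq(p1)
            & column.shift(-2).eq(p2)
            & column.shift(-3).eq(p3)
        )
        # True if the pattern occurs anywhere in this group
        return matched.any()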
    

    Next, apply it per group with groupby:

    grp = df.groupby('group_var')['row_pat'].agg([pattern_detect])
    

    Now merge it back to the original dataframe:

    df = df.merge(grp, left_on='group_var', right_index=True, how='left')
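
    To try the three steps end to end, here is a minimal sketch on made-up data. The sample values are hypothetical; the column names group_var and row_pat and the pattern 1, 2, 2, 0 are the ones used above. Selecting the column before agg and passing the function in a list is one way to get a result column named pattern_detect, assuming that name is acceptable. The last part is a variation that is not in the steps above: it uses a per-group shift to flag the row where each match starts, in case the row positions are what is needed rather than a per-group flag.

```python
import pandas as pd

# Hypothetical sample data: group "a" contains 1, 2, 2, 0; group "b" does not
df = pd.DataFrame({
    "group_var": ["a", "a", "a", "a", "a", "b", "b", "b", "b"],
    "row_pat":   [3, 1, 2, 2, 0, 1, 2, 0, 2],
})

def pattern_detect(column):
    # Pattern to look for, read top to bottom
    p0, p1, p2, p3 = 1, 2, 2, 0
    matched = (
        column.eq(p0)
        & column.shift(-1).eq(p1)
        & column.shift(-2).eq(p2)
        & column.shift(-3).eq(p3)
    )
    return matched.any()

# One boolean per group, in a column named after the function
grp = df.groupby("group_var")["row_pat"].agg([pattern_detect])

# Broadcast the per-group flag back onto every row
out = df.merge(grp, left_on="group_var", right_index=True, how="left")

# Variation: flag the starting row of each match, shifting within groups
# so that a match cannot straddle two groups
g = df.groupby("group_var")["row_pat"]
starts = (
    df["row_pat"].eq(1)
    & g.shift(-1).eq(2)
    & g.shift(-2).eq(2)
    & g.shift(-3).eq(0)
)
start_index = df.index[starts]
```

    Here out["pattern_detect"] is True for every row of group "a" and False for group "b", and start_index holds the label of the row where the match begins.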
    
