Hierarchical data: efficiently build a list of every descendant for each node

闹比i 2020-12-24 04:26

I have a two-column data set depicting multiple child-parent relationships that form a large tree. I would like to use this to build an updated list of every descendant for

4 answers
  •  离开以前
    2020-12-24 05:15

    Here is one way, using isin() and map():

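    import pandas as pd

    # df has columns 'child' and 'parent'. Re-append every row whose parent is itself a child.
    has_grandparent = df['parent'].isin(df['child'])
    df_new = pd.concat([df, df[has_grandparent]]).reset_index(drop=True)

    # The re-appended rows are exact duplicates; replace their parent with the grandparent.
    dup = df_new.duplicated()
    df_new.loc[dup, 'parent'] = df_new.loc[dup, 'parent'].map(df.set_index('child')['parent'])

    df_new = df_new.sort_values('parent').reset_index(drop=True)
    df_new.columns = ['descendant', 'ancestor']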
    

    You get

        descendant  ancestor
    0         2010      1000
    1         2100      1000
    2         2110      1000
    3         3000      1000
    4         3011      1000
    5         3033      1000
    6         3102      1000
    7         3111      1000
    8         3011      2010
    9         3102      2010
    10        3033      2100
    11        3000      2110
    12        3111      2110
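
The isin()/map approach above climbs only one extra level (from child to grandparent), so it covers trees that are at most two levels deep below the root. For a deeper tree, one possible sketch is a plain-Python traversal that collects every descendant of each node. The function name `all_descendants`, the `edges` list, and the extra node 4001 are made up for illustration; the sample pairs are reconstructed from the output shown above, not taken from the original question. It assumes the pairs contain no cycles.

```python
from collections import defaultdict


def all_descendants(edges):
    """Map each node that has children to the sorted list of all its descendants.

    `edges` is an iterable of (child, parent) pairs; assumed to contain no cycles.
    """
    # Adjacency list: parent -> direct children
    children = defaultdict(list)
    for child, parent in edges:
        children[parent].append(child)

    result = {}
    for ancestor in list(children):
        found = []
        stack = list(children[ancestor])
        # Iterative depth-first walk below `ancestor`
        while stack:
            node = stack.pop()
            found.append(node)
            stack.extend(children.get(node, ()))
        result[ancestor] = sorted(found)
    return result


# Hypothetical sample input (child, parent), reconstructed from the output above
edges = [
    (2010, 1000), (2100, 1000), (2110, 1000),
    (3011, 2010), (3102, 2010), (3033, 2100),
    (3000, 2110), (3111, 2110),
    (4001, 3011),  # extra, deeper level added only for illustration
]
descendants = all_descendants(edges)

# Flatten to (descendant, ancestor) rows, the same shape as the table above
rows = [(d, a) for a, ds in descendants.items() for d in ds]
```

Each node is revisited once per ancestor, so the work grows with the number of (descendant, ancestor) pairs in the result. If a DataFrame is needed, `rows` can be passed to `pd.DataFrame(rows, columns=['descendant', 'ancestor'])`.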
    
