Cartesian product of a pandas dataframe with itself

有刺的猬 2020-12-10 19:59

Given a dataframe:

    id  value
0    1     a
1    2     b
2    3     c

I want to get a new dataframe that is basically the cartesian product of the dataframe with itself.

3 Answers
  •  予麋鹿 (OP)
    2020-12-10 20:42

    This can be done entirely in pandas:

    df.loc[:, 'key_col'] = 1 # create a join column that will give us the Cartesian Product
    
    (df.merge(df, on='key_col', suffixes=('', '_2'))
     .query('id != id_2') # filter out joins on the same row
     .drop('key_col', axis=1)
     .reset_index(drop=True))
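
For reference, here is a minimal self-contained sketch of the same key-column approach run on the sample data from the question. The helper name self_pairs is hypothetical, and it uses assign so that the original df is left without the dummy column; treat it as an illustration rather than the only way to write this.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})


def self_pairs(frame: pd.DataFrame) -> pd.DataFrame:
    """Pair every row with every other row (hypothetical helper name)."""
    # assign returns a copy, so the caller's frame is not modified
    keyed = frame.assign(key_col=1)
    return (
        keyed.merge(keyed, on="key_col", suffixes=("", "_2"))
        .query("id != id_2")      # filter out joins on the same row
        .drop(columns="key_col")  # remove the dummy join column
        .reset_index(drop=True)
    )


pairs = self_pairs(df)
print(pairs)
```

With three input rows this should give 3 * 3 - 3 = 6 rows, with columns id, value, id_2 and value_2.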
    

    Or, if you would rather not add a dummy column to df itself, you can create the key temporarily when calling df.merge:

    (df.merge(df, on=df.assign(key_col=1)['key_col'], suffixes=('', '_2'))
     .query('id != id_2') # filter out joins on the same row
     .reset_index(drop=True))
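
As a possible alternative, assuming pandas 1.2 or later is available, merge accepts how='cross', which builds the Cartesian product directly without any dummy key. The sketch below applies it to the sample data from the question; the variable name pairs is just for illustration.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})

# how="cross" requires pandas >= 1.2 (assumption about the installed version)
pairs = (
    df.merge(df, how="cross", suffixes=("", "_2"))
    .query("id != id_2")  # filter out joins on the same row
    .reset_index(drop=True)
)
print(pairs)
```

On older pandas versions this call fails, in which case the key-column approach above still applies.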
    
