Cartesian product of a pandas dataframe with itself

有刺的猬 2020-12-10 19:59

Given a dataframe:

    id  value
0    1     a
1    2     b
2    3     c

I want to get a new dataframe that is basically the cartesian produ

3 Answers
  •  轻奢々 (OP)
    2020-12-10 20:44

    We want the indices of the upper and lower triangles of a square matrix, excluding the diagonal. In other words, the positions where the identity matrix is zero:

    np.eye(len(df))
    
    array([[ 1.,  0.,  0.],
           [ 0.,  1.,  0.],
           [ 0.,  0.,  1.]])
    

    So I subtract it from 1 and get:

    array([[ 0.,  1.,  1.],
           [ 1.,  0.,  1.],
           [ 1.,  1.,  0.]])
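
As a small sketch of this step (the names `n`, `off_diagonal` and `mask` are illustrative and not part of the answer), the same off-diagonal pattern can also be built directly as a boolean array, which may make the intent a little more explicit:

```python
import numpy as np

n = 3  # size of the example dataframe

# Float matrix: 0 on the diagonal, 1 everywhere else
off_diagonal = 1 - np.eye(n)

# The same pattern as a boolean array: False on the diagonal, True elsewhere
mask = ~np.eye(n, dtype=bool)

# Nonzero entries count as True, so both forms select the same positions
print((off_diagonal.astype(bool) == mask).all())
```

Either form should work in the next step, since nonzero floats are treated as true.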
    

    Passing this to np.where, which treats nonzero entries as true, gives me exactly the upper and lower triangle indices. I use them to pair up the rows:

    i, j = np.where(1 - np.eye(len(df)))
    df.iloc[i].reset_index(drop=True).join(
        df.iloc[j].reset_index(drop=True), rsuffix='_2')
    
       id value  id_2 value_2
    0   1     a     2       b
    1   1     a     3       c
    2   2     b     1       a
    3   2     b     3       c
    4   3     c     1       a
    5   3     c     2       b
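
For anyone who wants to run this end to end, here is a self-contained sketch under a few assumptions: the dataframe is rebuilt from the question, the function names `pairs_without_self` and `pairs_via_cross_merge` are made up for illustration, and the second function is a different technique (a cross merge followed by a filter) that is not from the answer and assumes pandas 1.2 or later plus unique `id` values.

```python
import numpy as np
import pandas as pd

# Example dataframe from the question
df = pd.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})


def pairs_without_self(frame):
    """Pair every row with every other row, skipping the row itself."""
    # Row and column positions of the off-diagonal entries, in row-major order
    i, j = np.where(1 - np.eye(len(frame)))
    left = frame.iloc[i].reset_index(drop=True)
    right = frame.iloc[j].reset_index(drop=True)
    # Both sides now share a 0..k-1 index, so join lines them up row by row
    return left.join(right, rsuffix="_2")


def pairs_via_cross_merge(frame):
    """Alternative sketch: full cross join, then drop the self-pairs.

    Assumes pandas >= 1.2 (how="cross") and that `id` is unique per row.
    """
    crossed = frame.merge(frame, how="cross", suffixes=("", "_2"))
    return crossed[crossed["id"] != crossed["id_2"]].reset_index(drop=True)


result = pairs_without_self(df)
alternative = pairs_via_cross_merge(df)
print(result)
```

The cross merge builds all n*n rows before filtering, while the np.where version only builds the n*(n-1) joined rows it keeps. On the example data both should give the same six rows.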
    
