Constructing a co-occurrence matrix in Python pandas

南笙 2020-11-28 04:29

I know how to do this in R. But is there a function in pandas that transforms a DataFrame into an n×n co-occurrence matrix containing the counts of two aspects co-occurring?

4 answers
  •  小蘑菇 (original poster)
    2020-11-28 04:55

    This is simple linear algebra: multiply the transpose of the indicator matrix by the matrix itself. Your example contains strings, so convert them to integers first:

    >>> df_asint = df.astype(int)
    >>> coocc = df_asint.T.dot(df_asint)
    >>> coocc
           Dop  Snack  Trans
    Dop      4      2      3
    Snack    2      3      2
    Trans    3      2      4
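
The snippet above assumes `df` already exists. As a minimal self-contained sketch, the data below is made up (it is not the asker's original data) and was chosen only so that it reproduces the counts shown; each row is one observation and each column is a "1"/"0" indicator stored as a string:

```python
import pandas as pd

# Hypothetical indicator data, stored as strings as in the question.
df = pd.DataFrame({
    "Dop":   ["1", "1", "1", "1", "0", "0"],
    "Snack": ["1", "1", "0", "0", "1", "0"],
    "Trans": ["1", "1", "1", "0", "0", "1"],
})

# Convert the string indicators to integers before doing arithmetic.
df_asint = df.astype(int)

# Entry (i, j) of X^T X counts the rows in which columns i and j are both 1;
# the diagonal holds each column's own total.
coocc = df_asint.T.dot(df_asint)
print(coocc)
```

Because the product is X^T X, the result is symmetric, and its diagonal is just the per-column count of ones.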
    

    If, as in the R answer, you want to zero out the diagonal, you can use NumPy's fill_diagonal:

    >>> import numpy as np
    >>> np.fill_diagonal(coocc.values, 0)
    >>> coocc
           Dop  Snack  Trans
    Dop      0      2      3
    Snack    2      0      2
    Trans    3      2      0
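
One caveat: modifying `coocc.values` in place relies on that array being a writable view of the DataFrame's data, which may not hold in every pandas version or configuration (for example, with copy-on-write enabled the array can be read-only). A more defensive sketch, assuming you are happy to build a new DataFrame instead of mutating the old one (the name `coocc_nodiag` is only illustrative):

```python
import numpy as np
import pandas as pd

# Co-occurrence counts as shown in the answer above.
labels = ["Dop", "Snack", "Trans"]
coocc = pd.DataFrame([[4, 2, 3], [2, 3, 2], [3, 2, 4]],
                     index=labels, columns=labels)

# Work on an explicit, writable copy of the underlying array.
arr = coocc.to_numpy(copy=True)
np.fill_diagonal(arr, 0)  # modifies arr in place and returns None

# Rebuild a labelled DataFrame; the original coocc is left untouched.
coocc_nodiag = pd.DataFrame(arr, index=coocc.index, columns=coocc.columns)
print(coocc_nodiag)
```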
    
