Python - How to generate the Pairwise Hamming Distance Matrix

既然无缘 2021-01-23 06:19

Beginner with Python here. I'm having trouble calculating the binary pairwise Hamming distance matrix between the rows of an input matrix using only th

2 answers
  •  轻奢々 (original poster)
    2021-01-23 06:47

    Try this approach: insert a new axis at position 1, let broadcasting compare every row with every other row, and then count the True (non-zero) values with sum:

    (arr[:, None, :] != arr).sum(2)
    
    # array([[0, 2, 3],
    #        [2, 0, 3],
    #        [3, 3, 0]])
    

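    def compute_HammingDistance(X):
        # X is a 2D array of shape (n, d); the result is the (n, n) matrix of row-to-row Hamming distances.
        return (X[:, None, :] != X).sum(2)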
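For reference, a self-contained version of the snippet might look like the following. It is a minimal sketch that assumes NumPy is imported as `np` and that `arr` is the 3x6 binary matrix printed in the explanation; the variable name `D` is introduced here only for illustration.

```python
import numpy as np

# Example input: the 3x6 binary matrix used in this answer.
arr = np.array([[1, 0, 0, 1, 1, 0],
                [1, 0, 0, 0, 0, 0],
                [1, 1, 1, 1, 0, 0]])

def compute_HammingDistance(X):
    # (n, 1, d) != (n, d) broadcasts to (n, n, d); summing over the last
    # axis counts the positions where each pair of rows differs.
    return (X[:, None, :] != X).sum(2)

D = compute_HammingDistance(arr)
print(D)
```

The result is symmetric with zeros on the diagonal, since every row has distance 0 to itself.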
    

    Explanation:

    1) Insert a new axis to create a 3D array of shape (3, 1, 6):

    arr[:, None, :]
    #array([[[1, 0, 0, 1, 1, 0]],
    #       [[1, 0, 0, 0, 0, 0]],
    #       [[1, 1, 1, 1, 0, 0]]])
    

    2) The original arr is a 2D array of shape (3, 6):

    arr   
    #array([[1, 0, 0, 1, 1, 0],
    #       [1, 0, 0, 0, 0, 0],
    #       [1, 1, 1, 1, 0, 0]])
    

    3) Comparing the two triggers broadcasting because their shapes differ. The 2D array arr is first treated as shape (1, 3, 6) and broadcast along axis 0 of the 3D array arr[:, None, :]; then each (1, 6) slice of the 3D array is broadcast against the (3, 6) array. Together, the two steps compare every row of arr with every row of arr, giving a boolean array of shape (3, 3, 6):

    arr[:, None, :] != arr 
    #array([[[False, False, False, False, False, False],
    #        [False, False, False,  True,  True, False],
    #        [False,  True,  True, False,  True, False]],
    #       [[False, False, False,  True,  True, False],
    #        [False, False, False, False, False, False],
    #        [False,  True,  True,  True, False, False]],
    #       [[False,  True,  True, False,  True, False],
    #        [False,  True,  True,  True, False, False],
    #        [False, False, False, False, False, False]]], dtype=bool)
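The shapes involved in the broadcasting step can be checked directly. This is a small sketch using the same example matrix; the names `a3` and `diff` are assumptions made for illustration and do not appear in the answer.

```python
import numpy as np

arr = np.array([[1, 0, 0, 1, 1, 0],
                [1, 0, 0, 0, 0, 0],
                [1, 1, 1, 1, 0, 0]])

a3 = arr[:, None, :]                 # new axis of length 1 at position 1
print(a3.shape)                      # shape of the 3D view
print(arr.shape)                     # shape of the original 2D array

# np.broadcast reports the shape both operands are broadcast to.
print(np.broadcast(a3, arr).shape)

diff = a3 != arr
# diff[i, j] holds the elementwise comparison of row i with row j.
print(np.array_equal(diff[0, 1], arr[0] != arr[1]))
```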
    

    4) Summing along the third axis (axis 2) counts the elements that are not equal, i.e. the True values, which gives the Hamming distance for each pair of rows.
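One way to sanity-check the broadcasting result is to compare it with an explicit double loop, and with `np.count_nonzero`, which counts the True values in the same way as the sum. The helper `hamming_loop` below is hypothetical and exists only for this check; it is not part of the answer.

```python
import numpy as np

arr = np.array([[1, 0, 0, 1, 1, 0],
                [1, 0, 0, 0, 0, 0],
                [1, 1, 1, 1, 0, 0]])

def hamming_loop(X):
    # Hypothetical reference helper: compare every pair of rows explicitly.
    n = X.shape[0]
    D = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            D[i, j] = np.sum(X[i] != X[j])
    return D

diff = arr[:, None, :] != arr
by_sum = diff.sum(axis=2)                   # sum of booleans along axis 2
by_count = np.count_nonzero(diff, axis=2)   # equivalent count of True values

print(np.array_equal(by_sum, hamming_loop(arr)))
print(np.array_equal(by_sum, by_count))
```

Note that the intermediate boolean array has n * n * d elements for an input of shape (n, d), so the broadcasting version trades memory for speed on large inputs.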
