numpy python: vectorize distance function to calculate pairwise distance of 2 matrix with a dimension of (m, 3)

Submitted by 只愿长相守 on 2020-03-22 09:23:18

Question


I have two numpy arrays A and B. Shape of A is (m,3) and shape of B is (n, 3).

The arrays look like this:

A
#output
array([[  9.227,  -4.698, -95.607],
   [ 10.294,  -4.659, -94.606],
   [ 11.184,  -5.906, -94.675],
   ...,
   [ 19.538, -91.572, -45.361],
   [ 20.001, -92.655, -45.009],
   [ 19.271, -92.726, -45.79 ]])

So it contains for each row the coordinates x,y,z of a 3D point. B follows the same format.

I have this function (np is numpy):

def compute_dist(point1, point2):
    squared = (point1 - point2)**2
    return np.sqrt(np.sum(squared))

I want to compute a pairwise distance between A and B by using a vectorized function.

I tried this:

 v = np.vectorize(compute_dist)
 v(A, B)
 #output
 matrix([[37.442, 42.693, 72.705],
    [37.442, 42.693, 72.705],
    [37.442, 42.693, 72.705],
    ...,
    [37.442, 42.693, 72.705],
    [37.442, 42.693, 72.705],
    [37.442, 42.693, 72.705]])

I don't understand how to use vectorize, even after reading the docs. How can I compute a matrix that contains the pairwise distances between A and B? I know there is scipy.spatial.distance.cdist, but I want to do it myself with np.vectorize.

I don't care about the format of the output (list, array, matrix ...). At the end I just want to find the minimal distance.


Answer 1:


You can use np.newaxis to expand the dimensions of your two arrays A and B to enable broadcasting and then do your calculations.

Pairwise distance means every point in A (m, 3) is compared to every point in B (n, 3), which results in an (m, n) matrix of distances. With numpy, broadcasting achieves this. With A = A[:, np.newaxis, :] and B = B[np.newaxis, :, :], the resulting shapes are (m, 1, 3) for A and (1, n, 3) for B, respectively. If you then perform a calculation like C = A - B, numpy broadcasts automatically: each of the m points in A is paired with each of the n points in B, so C[i, j] holds the coordinate difference between the i-th point of A and the j-th point of B.

  A (m, 1, 3)
- B (1, n, 3)
--------------
= C (m, n, 3)

To get the distance matrix you can then use numpy.linalg.norm():

import numpy as np
m = 10
n = 12
A = np.random.random((m, 3))
B = np.random.random((n, 3))

# Add a new axis at position 1 of A and at position 0 of B
# shape: (m, n, 3) = (m, 1, 3) - (1, n, 3)
C = A[:, np.newaxis, :] - B[np.newaxis, :, :]

C = np.linalg.norm(C, axis=-1)
# shape: (m, n)

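Since the question asks specifically about np.vectorize, here is a minimal sketch of how that route could look. A plain np.vectorize(compute_dist) calls the function on one scalar pair at a time, so it never sees a whole point. Passing the signature argument tells it to hand over whole length-d vectors instead. The small arrays below are made-up values chosen so the distances can be checked by hand; they are not the data from the question. Note that np.vectorize is essentially a Python loop and is provided for convenience rather than speed, so the broadcasting version is usually the faster choice.

```python
import numpy as np

def compute_dist(point1, point2):
    # Euclidean distance between two 1-D coordinate vectors
    squared = (point1 - point2) ** 2
    return np.sqrt(np.sum(squared))

# Illustrative inputs (assumed values, easy to verify by hand)
A = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]])   # shape (2, 3)
B = np.array([[0.0, 3.0, 4.0],
              [1.0, 0.0, 2.0],
              [5.0, 0.0, 0.0]])   # shape (3, 3)

# '(d),(d)->()' means: take two length-d vectors, return one scalar
v = np.vectorize(compute_dist, signature='(d),(d)->()')

# The leading axes (2, 1) and (1, 3) broadcast to (2, 3)
D_vec = v(A[:, np.newaxis, :], B[np.newaxis, :, :])

# Same result with plain broadcasting and np.linalg.norm
D_bc = np.linalg.norm(A[:, np.newaxis, :] - B[np.newaxis, :, :], axis=-1)

# Minimal distance and the pair of row indices (i in A, j in B) that gives it
i, j = np.unravel_index(np.argmin(D_bc), D_bc.shape)
min_dist = D_bc[i, j]
print(D_vec.shape, int(i), int(j), min_dist)
```

If only the minimal value is needed and not the indices, D_bc.min() is enough.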

Source: https://stackoverflow.com/questions/60039982/numpy-python-vectorize-distance-function-to-calculate-pairwise-distance-of-2-ma
