In MATLAB there is the pdist2 command. Suppose I have an mx2 matrix and an nx2 matrix, where each row represents a 2D point. I want to create an mxn matrix whose (i,j) element is the distance from the i-th point of the mx2 matrix to the j-th point of the nx2 matrix. In MATLAB I simply call pdist2(M,N).
I am looking for an alternative to this in Python. I could of course write two for loops, but since I am working with two NumPy arrays, for loops are not always the best choice. Is there an optimized function for this in the Python ecosystem? Basically, I am asking for a Python alternative to MATLAB's pdist2.
You're looking for SciPy's cdist function. It computes the pairwise distances (Euclidean by default) between two collections of n-dimensional points, each given as the rows of a matrix.
from scipy.spatial.distance import cdist
import numpy as np
X = np.arange(10).reshape(-1,2)
Y = np.arange(10).reshape(-1,2)
cdist(X, Y)
[[ 0.          2.82842712  5.65685425  8.48528137 11.3137085 ]
 [ 2.82842712  0.          2.82842712  5.65685425  8.48528137]
 [ 5.65685425  2.82842712  0.          2.82842712  5.65685425]
 [ 8.48528137  5.65685425  2.82842712  0.          2.82842712]
 [11.3137085   8.48528137  5.65685425  2.82842712  0.        ]]
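To connect this back to the question, here is a small sketch with two sets of different sizes, so the mxn shape of the result is visible. The points in M and N are made up for illustration; the "cityblock" metric is only one example of the alternatives cdist accepts.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Illustrative data (not from the question): m = 3 points and n = 2 points in 2D.
M = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]])
N = np.array([[0.0, 0.0], [6.0, 8.0]])

# D[i, j] is the Euclidean distance from M[i] to N[j], like pdist2(M, N).
D = cdist(M, N)
print(D.shape)  # (3, 2)
print(D)

# Other metrics are selected by name, e.g. Manhattan distance.
D_city = cdist(M, N, metric="cityblock")
print(D_city)
```

The result always has one row per point of the first argument and one column per point of the second.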
You should check the pairwise_distances function of the scikit-learn package.
sklearn.metrics.pairwise.pairwise_distances(X, Y=None, metric='euclidean', n_jobs=1, **kwds)
More information at http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html
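A minimal usage sketch, assuming scikit-learn is installed. The arrays M and N are invented for illustration and do not come from the question.

```python
import numpy as np
from sklearn.metrics import pairwise_distances

# Illustrative data: m = 3 points and n = 2 points in 2D.
M = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]])
N = np.array([[0.0, 0.0], [6.0, 8.0]])

# D[i, j] is the distance from M[i] to N[j]; the result has shape (m, n).
D = pairwise_distances(M, N, metric="euclidean")
print(D.shape)  # (3, 2)
print(D)

# The metric is chosen by name; "manhattan" is one of the supported options.
D_man = pairwise_distances(M, N, metric="manhattan")
print(D_man)
```

If Y is omitted, the function computes the distances between the rows of X and themselves.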
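If your matrices are not too big, NumPy broadcasting should do the job without other libraries. If the matrices are big, this method will be somewhat slow and memory intensive, because it builds the full array of pairwise differences. The example below shows the idea on 1-D arrays, where it yields the signed pairwise differences rather than distances: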
mx2 = np.random.randint(1,9,5)
nx2 = np.random.randint(1,9,3)
mx2
Out[308]: array([2, 3, 4, 8, 7])
nx2
Out[309]: array([3, 2, 2])
mx2[:,None]-nx2
Out[310]:
array([[-1,  0,  0],
       [ 0,  1,  1],
       [ 1,  2,  2],
       [ 5,  6,  6],
       [ 4,  5,  5]])
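The same broadcasting idea can be extended to the 2D points from the question. The following is a sketch, not part of the original answer: the helper name pairwise_euclidean and the sample points are hypothetical, chosen only to illustrate the technique.

```python
import numpy as np

def pairwise_euclidean(M, N):
    """Return the (m, n) matrix of Euclidean distances between rows of M and N.

    Hypothetical helper for illustration. M has shape (m, d), N has shape (n, d).
    """
    # Broadcasting (m, 1, d) against (1, n, d) gives all differences, shape (m, n, d).
    diff = M[:, None, :] - N[None, :, :]
    # Sum the squared differences over the coordinate axis, then take the root.
    return np.sqrt((diff ** 2).sum(axis=-1))

# Illustrative data: m = 3 points and n = 2 points in 2D.
M = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]])
N = np.array([[0.0, 0.0], [6.0, 8.0]])

D = pairwise_euclidean(M, N)
print(D.shape)  # (3, 2)
print(D)
```

The intermediate array holds m * n * d values, which is why this approach becomes memory intensive for large inputs.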
Source: https://stackoverflow.com/questions/43650931/python-alternative-for-calculating-pairwise-distance-between-two-sets-of-2d-poin