Question:

I have a point set whose coordinates are stored in three separate arrays (xa, ya, za). I want to calculate the Euclidean distance between each point of this set (xa[0], ya[0], za[0], and so on) and all the points of another point set (xb, yb, zb), and store the minimum distance for each point in a new array.

Let's say that xa.shape = ya.shape = za.shape = (11,) and xb.shape = yb.shape = zb.shape = (13,). For each point (xa[i], ya[i], za[i]), I want to compute its distance to every point in (xb, yb, zb) and store the minimum value in an array xfinal with xfinal.shape = (11,).

Do you think that this would be possible with numpy?

Answer 1:

A different solution would be to use the spatial module from scipy, the KDTree in particular.

This class builds a tree from a set of data points and can then be queried with new points:

    import numpy as np
    from scipy.spatial import KDTree

    # create some fake data
    x = np.arange(20)
    y = np.random.rand(20)
    z = x**2
    # put them together; the result should have shape [n_points, n_dimensions]
    data = np.vstack([x, y, z]).T
    # create the KDTree
    kd = KDTree(data)

Now, if you have a point, you can ask for the distance to and the index of the closest point (or the N closest points) simply by doing:

    kd.query([1, 2, 3])
    # (1.8650720813822905, 2)  # yours may differ

or, given an array of positions:

    # bogus positions
    x2 = np.random.rand(20) * 20
    y2 = np.random.rand(20) * 20
    z2 = np.random.rand(20) * 20
    # join them together as the input
    data2 = np.vstack([x2, y2, z2]).T
    # query them
    kd.query(data2)
    # (array([ 14.96118553,   9.15924813,  16.08269197,  21.50037074,
    #          18.14665096,  13.81840533,  17.464429  ,  13.29368755,
    #          20.22427196,   9.95286671,   5.326888  ,  17.00112683,
    #           3.66931946,  20.370496  ,  13.4808055 ,  11.92078034,
    #           5.58668204,  20.20004206,   5.41354322,   4.25145521]),
    #  array([4, 3, 2, 4, 2, 2, 4, 2, 3, 3, 2, 3, 4, 4, 3, 3, 3, 4, 4, 4]))
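Applied to the arrays from the question, this could look roughly like the sketch below: the tree is built from the b points and queried with the a points. The function name nearest_distances_kdtree and the random test data are illustrative assumptions, not part of the original answer.

```python
import numpy as np
from scipy.spatial import KDTree

def nearest_distances_kdtree(xa, ya, za, xb, yb, zb):
    """For each point of set a, return the distance to, and the index of,
    its nearest point in set b (hypothetical helper for illustration)."""
    points_a = np.column_stack([xa, ya, za])  # shape (n_a, 3)
    points_b = np.column_stack([xb, yb, zb])  # shape (n_b, 3)
    tree = KDTree(points_b)                   # build the tree on set b
    distances, indices = tree.query(points_a) # nearest b point for each a point
    return distances, indices

# random stand-in data with the shapes from the question
rng = np.random.default_rng(0)
xa, ya, za = rng.random((3, 11))
xb, yb, zb = rng.random((3, 13))

xfinal, nearest_b = nearest_distances_kdtree(xa, ya, za, xb, yb, zb)
print(xfinal.shape)  # (11,)
```

A tree tends to pay off when both point sets are large; for 11 and 13 points a direct computation is just as practical.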

Answer 2:

You can calculate the difference from each xa to each xb with np.subtract.outer(xa, xb). The distance to the nearest xb is given by

np.min(np.abs(np.subtract.outer(xa, xb)), axis=1)

To extend this to 3D,

    distances = np.sqrt(np.subtract.outer(xa, xb)**2 +
                        np.subtract.outer(ya, yb)**2 +
                        np.subtract.outer(za, zb)**2)
    distance_to_nearest = np.min(distances, axis=1)

If you actually want to know which of the b points is the nearest, use argmin in place of min.

index_of_nearest = np.argmin(distances, axis=1)
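As a quick sanity check, here is a small self-contained sketch of the same approach on points that can be verified by hand. The wrapper name nearest_distances_outer and the sample coordinates are made up for illustration.

```python
import numpy as np

def nearest_distances_outer(xa, ya, za, xb, yb, zb):
    """Nearest-neighbour distance and index via np.subtract.outer
    (hypothetical wrapper around the snippet above)."""
    # each outer difference has shape (len(a), len(b))
    dx = np.subtract.outer(xa, xb)
    dy = np.subtract.outer(ya, yb)
    dz = np.subtract.outer(za, zb)
    distances = np.sqrt(dx**2 + dy**2 + dz**2)
    # axis=1 runs over the b points, leaving one value per a point
    return distances.min(axis=1), distances.argmin(axis=1)

# two a points and three b points, chosen so the answer is easy to check
xa = np.array([0.0, 10.0]); ya = np.zeros(2); za = np.zeros(2)
xb = np.array([3.0, 9.0, 20.0]); yb = np.array([4.0, 0.0, 0.0]); zb = np.zeros(3)

dist, idx = nearest_distances_outer(xa, ya, za, xb, yb, zb)
print(dist)  # [5. 1.]
print(idx)   # [0 1]
```

The first a point sits at the origin, so its nearest b point is (3, 4, 0) at distance 5; the second is one unit away from (9, 0, 0).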

Answer 3:

There is more than one way of doing this. Most importantly, there's a trade-off between memory usage and speed. Here's the memory-hungry method, which builds the full matrix of pairwise distances:

    s = (1, -1)
    # (1, n_a) minus (n_b, 1) broadcasts to the full (n_b, n_a) matrix
    d = np.sqrt(np.min((xa.reshape(s) - xb.reshape(s).T)**2
                       + (ya.reshape(s) - yb.reshape(s).T)**2
                       + (za.reshape(s) - zb.reshape(s).T)**2, axis=0))

The other method would be to iterate over the points in b, which avoids expanding to the full-blown matrix.
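A minimal sketch of that loop-based variant, assuming the arrays from the question, might look like this. It keeps only a running minimum of squared distances, so the extra memory is proportional to the size of a rather than to len(a) * len(b). The function name nearest_distances_loop and the sample points are illustrative.

```python
import numpy as np

def nearest_distances_loop(xa, ya, za, xb, yb, zb):
    """Distance from each a point to its nearest b point, looping over b
    (hypothetical helper; trades speed for lower memory use)."""
    xa, ya, za = (np.asarray(v, dtype=float) for v in (xa, ya, za))
    best_sq = np.full(xa.shape, np.inf)  # running minimum of squared distances
    for x, y, z in zip(xb, yb, zb):
        sq = (xa - x)**2 + (ya - y)**2 + (za - z)**2
        np.minimum(best_sq, sq, out=best_sq)  # update the minimum in place
    return np.sqrt(best_sq)  # take the square root once, at the end

xa = np.array([0.0, 10.0]); ya = np.zeros(2); za = np.zeros(2)
xb = np.array([3.0, 9.0, 20.0]); yb = np.array([4.0, 0.0, 0.0]); zb = np.zeros(3)

print(nearest_distances_loop(xa, ya, za, xb, yb, zb))  # [5. 1.]
```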
