euclidean-distance

Python alternative for calculating pairwise distance between two sets of 2d points [duplicate]

萝らか妹 posted on 2019-12-01 04:18:04
Question: This question already has answers here: Efficient distance calculation between N points and a reference in numpy/scipy (6 answers); Minimum Euclidean distance between points in two different Numpy arrays, not within (5 answers). Closed 2 years ago.

In Matlab there exists the pdist2 command. Given an m×2 matrix and an n×2 matrix, each row of which represents a 2D point, I want to create an m×n matrix such that the (i, j) element represents the distance from the i-th point of the m×2 matrix to the j-th
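A rough Python counterpart to pdist2 can be sketched with `scipy.spatial.distance.cdist`. The arrays `A` and `B` below are hypothetical sample data, not taken from the question, and the broadcasting variant is included only as a cross-check.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical sample data: A holds m = 3 points, B holds n = 2 points.
A = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]])
B = np.array([[0.0, 0.0], [6.0, 8.0]])

# D[i, j] is the Euclidean distance from A[i] to B[j]; D has shape (m, n).
D = cdist(A, B, metric="euclidean")

# The same matrix with plain NumPy broadcasting (uses O(m * n * 2) memory).
D_np = np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2))
```

For large m and n the broadcasting form builds an intermediate m×n×2 array, so `cdist` is usually the more economical choice.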

calculating the euclidean dist between each row of a dataframe with all other rows in another dataframe

北城以北 posted on 2019-12-01 03:53:37
I need to generate a dataframe with the minimum Euclidean distance between each row of one dataframe and all rows of another dataframe. Both of my dataframes are large (approx. 40,000 rows). This is what I have worked out so far:

    x <- matrix(c(3,6,3,4,8), nrow=5, ncol=7, byrow=TRUE)
    y <- matrix(c(1,4,4,1,9), nrow=5, ncol=7, byrow=TRUE)
    sed.dist <- numeric(5)
    for (i in 1:(length(sed.dist))) {
      sed.dist[i] <- (sqrt(sum((y[i,1:7] - x[i,1:7])^2)))
    }

But this only works when i = j. What I essentially need is to find the minimum Euclidean distance by looping over every row one by one (y[1,1:7], then y[2,1:7] and so on
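The question is written in R; the same idea (for every row of y, the smallest distance to any row of x) can be sketched in Python with a k-d tree, which avoids building the full 40,000 × 40,000 distance matrix. The helper name `min_distance_to_rows` and the random data are assumptions made for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree

def min_distance_to_rows(y, x):
    """For each row of y, return the distance to, and index of, its nearest row in x."""
    tree = cKDTree(x)               # index the rows of x once
    dist, idx = tree.query(y, k=1)  # nearest row of x for every row of y
    return dist, idx

# Hypothetical data with 7 columns, mirroring the shape used in the question.
rng = np.random.default_rng(0)
x = rng.random((200, 7))
y = rng.random((150, 7))

dist, idx = min_distance_to_rows(y, x)
```

With pandas dataframes, passing `df.to_numpy()` for each argument should work the same way.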

Find euclidean distance from a point to rows in pandas dataframe

半腔热情 posted on 2019-12-01 02:12:06
Question: I have a dataframe:

    id  lat     long
    1   12.654  15.50
    2   14.364  25.51
    3   17.636  32.53
    5   12.334  25.84
    9   32.224  15.74

I want to find the Euclidean distance of these coordinates from a particular location saved in a list L1:

    L1 = [11.344,7.234]

I want to create a new column in df that holds the distances:

    id  lat     long   distance
    1   12.654  15.50
    2   14.364  25.51
    3   17.636  32.53
    5   12.334  25.84
    9   32.224  15.74

I know how to find the Euclidean distance between two points using math.hypot():

    dist = math.hypot(x2 - x1, y2 - y1
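One way to fill the distance column without a Python loop is the vectorised `np.hypot`. This is a minimal sketch that assumes L1 is ordered as [lat, long] and treats the coordinates as points on a plane, as the question does.

```python
import numpy as np
import pandas as pd

# The dataframe from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 5, 9],
    "lat": [12.654, 14.364, 17.636, 12.334, 32.224],
    "long": [15.50, 25.51, 32.53, 25.84, 15.74],
})
L1 = [11.344, 7.234]  # assumed order: [lat, long]

# np.hypot(dx, dy) computes sqrt(dx**2 + dy**2) element-wise over the columns.
df["distance"] = np.hypot(df["lat"] - L1[0], df["long"] - L1[1])
```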

Efficient and precise calculation of the euclidean distance

帅比萌擦擦* posted on 2019-12-01 02:00:32
Question: Following some online research (1, 2, numpy, scipy, scikit, math), I have found several ways of calculating the Euclidean distance in Python:

    # 1
    numpy.linalg.norm(a-b)
    # 2
    distance.euclidean(vector1, vector2)
    # 3
    sklearn.metrics.pairwise.euclidean_distances
    # 4
    sqrt((xa-xb)^2 + (ya-yb)^2 + (za-zb)^2)
    # 5
    dist = [(a - b)**2 for a, b in zip(vector1, vector2)]
    dist = math.sqrt(sum(dist))
    # 6
    math.hypot(x, y)

I was wondering if someone could provide an insight on which of the above ( or any
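As a quick sanity check, several of the listed approaches can be run side by side on one pair of vectors. The vectors `a` and `b` are hypothetical, and `math.dist` (Python 3.8+) is an addition that does not appear in the question's list. Note that formula #4 is written in mathematical notation: in Python source `^` is bitwise XOR, so powers have to be spelled `**`.

```python
import math
import numpy as np
from scipy.spatial import distance

# Hypothetical 3-D vectors whose difference is (3, 4, 0), so the distance is 5.
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 6.0, 3.0])

d_norm = np.linalg.norm(a - b)                   # method 1
d_scipy = distance.euclidean(a, b)               # method 2
d_manual = math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))  # method 5
d_dist = math.dist(a.tolist(), b.tolist())       # Python 3.8+, not in the question
```

All four values should agree up to floating-point rounding; which one is fastest depends on vector length and call overhead, so timing them on representative data is the safer guide.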

Calculate the euclidean distance in scipy csr matrix

谁说胖子不能爱 posted on 2019-11-30 23:27:50
I need to calculate the Euclidean distance between all points stored in a csr sparse matrix and some lists of points. It would be easier for me to convert the csr matrix to a dense one, but I could not because of a lack of memory, so I need to keep it as csr. For example, I have this data_csr sparse matrix (shown both as csr and as dense):

    data_csr
    (0, 2)  4
    (1, 0)  1
    (1, 4)  2
    (2, 0)  2
    (2, 3)  1
    (3, 5)  1
    (4, 0)  4
    (4, 2)  3
    (4, 3)  2

    data_csr.todense()
    [[0, 0, 4, 0, 0, 0]
     [1, 0, 0, 0, 2, 0]
     [2, 0, 0, 1, 0, 0]
     [0, 0, 0, 0, 0, 1]
     [4, 0, 3, 2, 0, 0]]

and this center list of points:

    center
    array([[0, 1, 2,
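One way to stay sparse is the expansion ‖x − c‖² = ‖x‖² − 2·x·c + ‖c‖², which needs only a sparse-by-dense product. The sketch below assumes the centers fit in memory as a dense array; the function name `sparse_euclidean` and the `centers` values are hypothetical, since the question's center array is cut off.

```python
import numpy as np
from scipy import sparse

def sparse_euclidean(data_csr, centers):
    """Distances between rows of a CSR matrix and dense centers, without densifying."""
    centers = np.asarray(centers, dtype=float)
    # Squared norm of every sparse row, as an (n, 1) column.
    row_sq = np.asarray(data_csr.multiply(data_csr).sum(axis=1)).reshape(-1, 1)
    # Squared norm of every center, as a (1, k) row.
    cen_sq = (centers ** 2).sum(axis=1)[np.newaxis, :]
    # Sparse @ dense gives a dense (n, k) array of dot products.
    cross = np.asarray(data_csr @ centers.T)
    sq = row_sq - 2.0 * cross + cen_sq
    np.maximum(sq, 0.0, out=sq)  # clip tiny negatives caused by rounding
    return np.sqrt(sq)

# The matrix from the question, built densely here only because it is tiny.
dense = np.array([[0, 0, 4, 0, 0, 0],
                  [1, 0, 0, 0, 2, 0],
                  [2, 0, 0, 1, 0, 0],
                  [0, 0, 0, 0, 0, 1],
                  [4, 0, 3, 2, 0, 0]])
data_csr = sparse.csr_matrix(dense)

# Hypothetical centers (the question's array is truncated).
centers = np.array([[0, 0, 4, 0, 0, 0],
                    [1, 1, 1, 1, 1, 1]], dtype=float)

D = sparse_euclidean(data_csr, centers)
```

The result is a dense n×k array, so this only helps when the number of centers k is small compared with the number of columns.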

Closest pair for any of a huge number of points

自古美人都是妖i posted on 2019-11-30 09:40:23
Question: We are given a huge set of points in the 2D plane. We need to find, for each point, the closest point within the set. For instance, suppose the initial set is as follows:

    foo <- data.frame(x=c(1,2,4,4,10),y=c(1,2,4,4,10))

The output should be like this:

    ClosesPair(foo)
    2 1 4 3 3 # (could be 4 also)

Any idea?

Answer 1: The traditional approach is to preprocess the data and put it in a data structure, often a K-d tree, for which the "nearest point" query is very fast. There is an implementation in the

Is “norm” equivalent to “Euclidean distance”?

我是研究僧i posted on 2019-11-30 06:30:50
Question: I am not sure whether "norm" and "Euclidean distance" mean the same thing. Could you please help me with this distinction? I have an n by m array a, where m > 3. I want to calculate the Euclidean distance from the second data point a[1,:] to all the other points (including itself). So I used np.linalg.norm, which outputs the norm of two given points. But I don't know if this is the right way of getting the EDs.

    import numpy as np
    a = np.array([[0, 0, 0 ,0 ], [1, 1 , 1, 1],[2,2, 2, 3]
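The two notions are related but not identical: a norm measures the length of a single vector, and the Euclidean distance between two points is the Euclidean (2-)norm of their difference. A minimal sketch follows, using only the three rows visible in the question (its array is cut off, so the data here is an assumption):

```python
import numpy as np

# Only the rows visible in the question; the original array is truncated.
a = np.array([[0, 0, 0, 0],
              [1, 1, 1, 1],
              [2, 2, 2, 3]], dtype=float)

# Subtract the second point from every row, then take the 2-norm of each row.
ed = np.linalg.norm(a - a[1], axis=1)
```

Here `ed[1]` is the distance of the point to itself, which is 0. Calling `np.linalg.norm(a)` without `axis` would instead return a single number for the whole array (the Frobenius norm), which is not a point-to-point distance.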

Identifying points with the smallest Euclidean distance

北战南征 posted on 2019-11-29 18:42:51
Question: I have a collection of n-dimensional points and I want to find which 2 are the closest. The best I could come up with for 2 dimensions is:

    from numpy import *
    myArr = array( [[1, 2], [3, 4], [5, 6], [7, 8]] )
    n = myArr.shape[0]
    cross = [[sum( ( myArr[i] - myArr[j] ) ** 2 ), i, j] for i in xrange( n ) for j in xrange( n ) if i != j ]
    print min( cross )

which gives

    [8, 0, 1]

But this is too slow for large arrays. What kind of optimisation can I apply to it?

RELATED: Euclidean distance between points
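A vectorised variant of the same search can be sketched with `scipy.spatial.distance.pdist`, which computes each unordered pair once. The function name `closest_pair` is hypothetical, and the approach still needs O(n²) memory, so it is a speed-up over the list comprehension rather than a cure for very large inputs.

```python
import numpy as np
from scipy.spatial.distance import pdist

def closest_pair(points):
    """Return (squared distance, i, j) for the closest pair of distinct rows."""
    points = np.asarray(points, dtype=float)
    n = points.shape[0]
    # Condensed vector of squared distances for all pairs i < j.
    sq = pdist(points, metric="sqeuclidean")
    # Row/column indices in the same order as pdist's condensed output.
    i_idx, j_idx = np.triu_indices(n, k=1)
    k = int(np.argmin(sq))
    return float(sq[k]), int(i_idx[k]), int(j_idx[k])

myArr = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
result = closest_pair(myArr)
```

On the array from the question this returns the same pair as the original code, `(8.0, 0, 1)`. For inputs too large for a pairwise table, a k-d tree nearest-neighbour query is a common alternative.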

Closest pair for any of a huge number of points

久未见 posted on 2019-11-29 17:18:20
We are given a huge set of points in the 2D plane. We need to find, for each point, the closest point within the set. For instance, suppose the initial set is as follows:

    foo <- data.frame(x=c(1,2,4,4,10),y=c(1,2,4,4,10))

The output should be like this:

    ClosesPair(foo)
    2 1 4 3 3 # (could be 4 also)

Any idea?

The traditional approach is to preprocess the data and put it in a data structure, often a K-d tree, for which the "nearest point" query is very fast. There is an implementation in the nnclust package.

    library(nnclust)
    foo <- cbind(x=c(1,2,4,4,10),y=c(1,2,4,4,10))
    i <- nnfind(foo)$neighbour
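The k-d tree idea from the answer can be sketched in Python with `scipy.spatial.cKDTree`. This is an illustration rather than a port of nnclust; the function name `nearest_other_point` is hypothetical. Each point is queried for its two nearest neighbours, because the nearest one is normally the point itself.

```python
import numpy as np
from scipy.spatial import cKDTree

def nearest_other_point(points):
    """For each row, return the index of the closest *other* row."""
    points = np.asarray(points, dtype=float)
    tree = cKDTree(points)
    # k=2: the query point itself (distance 0) plus one more neighbour.
    _, idx = tree.query(points, k=2)
    rows = np.arange(len(points))
    # With duplicate points, "self" may land in either column, so pick
    # whichever of the two neighbours is not the query row.
    return np.where(idx[:, 0] == rows, idx[:, 1], idx[:, 0])

# Same points as the R example.
foo = np.array([[1, 1], [2, 2], [4, 4], [4, 4], [10, 10]])
nn = nearest_other_point(foo)
```

Indices are 0-based here, so the R output `2 1 4 3 3` corresponds to `1 0 3 2 2` (with 3 equally valid for the last entry).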

What is the most efficient way to compute the square euclidean distance between N samples and clusters centroids?

淺唱寂寞╮ posted on 2019-11-29 15:48:18
I am looking for an efficient way (no for loops) to compute the Euclidean distance between a set of samples and a set of cluster centroids. Example:

    import numpy as np
    X = np.array([[1,2,3],[1, 1, 1],[0, 2, 0]])
    y = np.array([[1,2,3], [0, 1, 0]])

Expected output:

    array([[ 0., 11.],
           [ 5.,  2.],
           [10.,  1.]])

This is the squared Euclidean distance between each sample in X and each centroid in y. I came up with 2 solutions.

Solution 1:

    def dist_2(X,y):
        X_square_sum = np.sum(np.square(X), axis = 1)
        y_square_sum = np.sum(np.square(y), axis = 1)
        dot_xy = np.dot(X, y.T)
        X_square_sum_tile = np.tile(X
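The expansion that Solution 1 starts from, ‖x − c‖² = ‖x‖² − 2·x·c + ‖c‖², can also be written with broadcasting instead of `np.tile`. This is a sketch under that assumption, not a reconstruction of the truncated code; the function name `sq_dists` is hypothetical.

```python
import numpy as np

def sq_dists(X, y):
    """Squared Euclidean distance between every row of X and every row of y."""
    X_sq = np.sum(np.square(X), axis=1)[:, np.newaxis]  # shape (n, 1)
    y_sq = np.sum(np.square(y), axis=1)[np.newaxis, :]  # shape (1, k)
    # Broadcasting expands (n, 1) and (1, k) against the (n, k) dot products.
    return X_sq - 2.0 * np.dot(X, y.T) + y_sq

X = np.array([[1, 2, 3], [1, 1, 1], [0, 2, 0]])
y = np.array([[1, 2, 3], [0, 1, 0]])

D = sq_dists(X, y)
```

On the example data this reproduces the expected output from the question. With floating-point inputs the subtraction can leave tiny negative values, so clipping at zero before taking a square root is a sensible precaution; `scipy.spatial.distance.cdist(X, y, "sqeuclidean")` is a ready-made alternative.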