How to Build a Distance Matrix without a Loop (Vectorization)?

心已入冬 提交于 2019-12-18 06:59:21

问题


I have many points and I want to build distance matrix i.e. distance of every point with all of other points but I want to don't use from loop because take too time... Is a better way for building this matrix? this is my loop: for a setl with size: 10000x3 this method take a lot of my time :(

 for i=1:size(setl,1)
     for j=1:size(setl,1)        
         dist = sqrt((xl(i)-xl(j))^2+(yl(i)-yl(j))^2+...
             (zl(i)-zl(j))^2);
         distanceMatrix(i,j) = dist;
     end
 end

回答1:


How about using some linear algebra? The distance of two points can be computed from the inner product of their position vectors,

D(x, y) = ∥y – x∥ = √ ( xT x + yT y – 2 xT y ),

and the inner product for all pairs of points can be obtained through a simple matrix operation.

x = [xl(:)'; yl(:)'; zl(:)'];
IP = x' * x;
d = sqrt(bsxfun(@plus, diag(IP), diag(IP)') - 2 * IP);

For 10000 points, I get the following timing results:

  • ahmad's loop + shoelzer's preallocation: 7.8 seconds
  • Dan's vectorized indices: 5.3 seconds
  • Mohsen's bsxfun: 1.5 seconds
  • my solution: 1.3 seconds



回答2:


You can use bsxfun which is generally a faster solution:

s = [xl(:) yl(:) zl(:)];
d = sqrt(sum(bsxfun(@minus, permute(s, [1 3 2]), permute(s, [3 1 2])).^2,3));



回答3:


You can do this fully vectorized like so:

n = numel(xl);
[X, Y] = meshgrid(1:n,1:n);
Ix = X(:)
Iy = Y(:)
reshape(sqrt((xl(Ix)-xl(Iy)).^2+(yl(Ix)-yl(Iy)).^2+(zl(Ix)-zl(Iy)).^2), n, n);

If you look at Ix and Iy (try it for like a 3x3 dataset), they make every combination of linear indexes possible for each of your matrices. Now you can just do each subtraction in one shot!

However mixing the suggestions of shoelzer and Jost will give you an almost identical performance performance boost:

n = 50;

xl = rand(n,1);
yl = rand(n,1);
zl = rand(n,1);

tic
for t = 1:100
    distanceMatrix = zeros(n); %// Preallocation
    for i=1:n
       for j=min(i+1,n):n %// Taking advantge of symmetry
           distanceMatrix(i,j) = sqrt((xl(i)-xl(j))^2+(yl(i)-yl(j))^2+(zl(i)-zl(j))^2);
       end
    end
    d1 = distanceMatrix + distanceMatrix';           %'
end
toc


%// Vectorized solution that creates linear indices using meshgrid
tic
for t = 1:100
    [X, Y] = meshgrid(1:n,1:n);
    Ix = X(:);
    Iy = Y(:);
    d2 = reshape(sqrt((xl(Ix)-xl(Iy)).^2+(yl(Ix)-yl(Iy)).^2+(zl(Ix)-zl(Iy)).^2), n, n);
end
toc

Returns:

Elapsed time is 0.023332 seconds.
Elapsed time is 0.024454 seconds.

But if I change n to 500 then I get

Elapsed time is 1.227956 seconds.
Elapsed time is 2.030925 seconds.

Which just goes to show that you should always bench mark solutions in Matlab before writing off loops as slow! In this case, depending on the scale of your solution, loops could be significantly faster.




回答4:


Be sure to preallocate distanceMatrix. Your loops will run much, much faster and vectorization probably isn't needed. Even if you do it, there may not be any further speed increase.




回答5:


The latest versions (Since R2016b) of MATLAB support Implicit Broadcasting (See also noted on bsxfun()).

Hence the fastest way for distance matrix is:

function [ mDistMat ] = CalcDistanceMatrix( mA, mB )

mDistMat = sum(mA .^ 2).' - (2 * mA.' * mB) + sum(mB .^ 2);


end

Where the points are along the columns of the set.
In your case mA = mB.

Have a look on my Calculate Distance Matrix Project.



来源:https://stackoverflow.com/questions/19360047/how-to-build-a-distance-matrix-without-a-loop-vectorization

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!