How to Build a Distance Matrix without a Loop (Vectorization)?

前提是你 提交于 2019-12-01 06:39:10

How about using some linear algebra? The distance of two points can be computed from the inner product of their position vectors,

D(x, y) = ∥y – x∥ = √ ( xT x + yT y – 2 xT y ),

and the inner product for all pairs of points can be obtained through a simple matrix operation.

x = [xl(:)'; yl(:)'; zl(:)'];
IP = x' * x;
d = sqrt(bsxfun(@plus, diag(IP), diag(IP)') - 2 * IP);

For 10000 points, I get the following timing results:

  • ahmad's loop + shoelzer's preallocation: 7.8 seconds
  • Dan's vectorized indices: 5.3 seconds
  • Mohsen's bsxfun: 1.5 seconds
  • my solution: 1.3 seconds

You can use bsxfun which is generally a faster solution:

s = [xl(:) yl(:) zl(:)];
d = sqrt(sum(bsxfun(@minus, permute(s, [1 3 2]), permute(s, [3 1 2])).^2,3));
Dan

You can do this fully vectorized like so:

n = numel(xl);
[X, Y] = meshgrid(1:n,1:n);
Ix = X(:)
Iy = Y(:)
reshape(sqrt((xl(Ix)-xl(Iy)).^2+(yl(Ix)-yl(Iy)).^2+(zl(Ix)-zl(Iy)).^2), n, n);

If you look at Ix and Iy (try it for like a 3x3 dataset), they make every combination of linear indexes possible for each of your matrices. Now you can just do each subtraction in one shot!

However mixing the suggestions of shoelzer and Jost will give you an almost identical performance performance boost:

n = 50;

xl = rand(n,1);
yl = rand(n,1);
zl = rand(n,1);

tic
for t = 1:100
    distanceMatrix = zeros(n); %// Preallocation
    for i=1:n
       for j=min(i+1,n):n %// Taking advantge of symmetry
           distanceMatrix(i,j) = sqrt((xl(i)-xl(j))^2+(yl(i)-yl(j))^2+(zl(i)-zl(j))^2);
       end
    end
    d1 = distanceMatrix + distanceMatrix';           %'
end
toc


%// Vectorized solution that creates linear indices using meshgrid
tic
for t = 1:100
    [X, Y] = meshgrid(1:n,1:n);
    Ix = X(:);
    Iy = Y(:);
    d2 = reshape(sqrt((xl(Ix)-xl(Iy)).^2+(yl(Ix)-yl(Iy)).^2+(zl(Ix)-zl(Iy)).^2), n, n);
end
toc

Returns:

Elapsed time is 0.023332 seconds.
Elapsed time is 0.024454 seconds.

But if I change n to 500 then I get

Elapsed time is 1.227956 seconds.
Elapsed time is 2.030925 seconds.

Which just goes to show that you should always bench mark solutions in Matlab before writing off loops as slow! In this case, depending on the scale of your solution, loops could be significantly faster.

Be sure to preallocate distanceMatrix. Your loops will run much, much faster and vectorization probably isn't needed. Even if you do it, there may not be any further speed increase.

The latest versions (Since R2016b) of MATLAB support Implicit Broadcasting (See also noted on bsxfun()).

Hence the fastest way for distance matrix is:

function [ mDistMat ] = CalcDistanceMatrix( mA, mB )

mDistMat = sum(mA .^ 2).' - (2 * mA.' * mB) + sum(mB .^ 2);


end

Where the points are along the columns of the set.
In your case mA = mB.

Have a look on my Calculate Distance Matrix Project.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!