问题
A and B are sets of m and n points respectively, where m<=n. I want to find a set of m unique points from B, named C, where the sum of distances between all [A(i), C(i)] pairs is the minimal.
To solve this without uniqueness constraint I can just find closest points from B to each point in A:
m = 5; n = 8; dim = 2;
A = rand(m, dim);
B = rand(n, dim);
D = pdist2(A, B);
[~, I] = min(D, [], 2);
C2 = B(I, :);
Where there may be repeated elements of B present in C. Now the first solution is brute-force search:
minSumD = inf;
allCombs = nchoosek(1:n, m);
for i = 1:size(allCombs, 1)
allPerms = perms(allCombs(i, :));
for j = 1:size(allPerms, 1)
ind = sub2ind([m n], 1:m, allPerms(j, :));
sumD = sum(D(ind));
if sumD<minSumD
minSumD = sumD;
I = allPerms(j, :);
end
end
end
C = B(I, :);
I think C2 (set of closest points to each A(i)) is pretty much alike C except for its repeated points. So how can I decrease the computation time?
回答1:
Use a variant of the Hungarian algorithm, which computes a minimum/maximum weight perfect matching. Create n-m dummy points for the unused B points to match with (or, if you're willing to put in more effort, adapt the Hungarian algorithm machinery to non square matrices).
来源:https://stackoverflow.com/questions/41936760/how-to-find-a-unique-set-of-closest-pairs-of-points