Finding (multiset) difference between two arrays

橙三吉。 Posted on 2019-12-01 17:50:57

Still another approach using the histc function:

A = [2, 4, 6, 4, 3, 3, 1, 5, 5, 5];
B = [2, 3, 5, 5];

uA  = unique(A);
hca = histc(A,uA); 
hcb = histc(B,uA);
res = repelem(uA,hca-hcb)

We simply count how many times each unique value of A occurs in each vector, then use repelem to repeat each value by the difference between the two counts.

This solution does not preserve the original order, but that doesn't seem to be a problem for you.

I use histc for Octave compatibility, but this function is deprecated in MATLAB, so you can also use histcounts.
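
For reference, the same counting idea can be sketched without histc or histcounts. This is a swapped-in variant, not the method above: it counts with unique, ismember and accumarray, ignores elements of B that do not occur in A, and clamps negative differences to zero. The names cntA, cntB, idxA, idxB and subs are made up for this illustration.

```matlab
A = [2, 4, 6, 4, 3, 3, 1, 5, 5, 5];
B = [2, 3, 5, 5];

% Unique values of A and, for each element of A, its position in uA
[uA, ~, idxA] = unique(A);
cntA = accumarray(idxA(:), 1).';            % occurrences of each uA value in A

% Position in uA of each element of B (tf is false where B is not in A)
[tf, idxB] = ismember(B, uA);
subs = idxB(tf);
cntB = accumarray(subs(:), 1, [numel(uA), 1]).';   % occurrences in B

% Repeat each value by the remaining count, never a negative number of times
res = repelem(uA, max(cntA - cntB, 0))
```

As with the histc version, the result comes out sorted rather than in the original order of A.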

Here's a vectorized way. Memory-inefficient, mostly for fun:

tA = sum(triu(bsxfun(@eq, A, A.')), 1);
tB = sum(triu(bsxfun(@eq, B, B.')), 1);
result = setdiff([A; tA].', [B; tB].', 'rows', 'stable');
result = result(:,1).';

The idea is to make each entry unique by tagging it with an occurrence number. The vectors become 2-column matrices, setdiff is applied with the 'rows' option, and then the tags are removed from the result.
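
To make the tagging concrete, here is a small worked sketch on a shorter pair of vectors. These inputs are chosen for illustration only and are not from the question; taggedA and taggedB are hypothetical names.

```matlab
A = [5, 3, 5, 5, 3];
B = [5, 3];

% Occurrence number of each entry: the k-th time a value appears gets tag k
tA = sum(triu(bsxfun(@eq, A, A.')), 1);   % [1 1 2 3 2]
tB = sum(triu(bsxfun(@eq, B, B.')), 1);   % [1 1]

taggedA = [A; tA].'                        % rows: [5 1; 3 1; 5 2; 5 3; 3 2]
taggedB = [B; tB].';                       % rows: [5 1; 3 1]

% Keep the rows of taggedA that are not rows of taggedB, in original order
result = setdiff(taggedA, taggedB, 'rows', 'stable');
result = result(:,1).'                     % [5 5 3]
```

Because the first 5 and the first 3 carry tag 1 in both matrices, exactly those rows are removed, and the later occurrences survive.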

I'm not a fan of loops, but for random permutations of A this was the best I came up with.

C = A;
for x = 1:numel(B)
    % Delete the first remaining occurrence of B(x), if there is one
    C(find(C == B(x), 1, 'first')) = [];
end

I was curious about the effect of different orderings of A on a solution approach, so I set up a test like this:

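Ctruth = [1 3 3 4 5 5 6];   % sorted expected result for the A and B under test
for testNumber = 1:100
    Atest = A(randperm(numel(A)));   % random reordering of A
    C = myFunction(Atest,B);
    C = sort(C);                     % compare as multisets, ignoring order
    assert(isequal(C, Ctruth));      % also fails cleanly on a size mismatch
end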
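
A self-contained version of this test might look like the sketch below. It assumes the A and B from the first answer and inlines the loop solution in place of the placeholder myFunction, so Ctruth is recomputed for those inputs and differs from the value used above.

```matlab
A = [2, 4, 6, 4, 3, 3, 1, 5, 5, 5];
B = [2, 3, 5, 5];
Ctruth = [1 3 4 4 5 6];    % sorted multiset difference for these A and B

for testNumber = 1:100
    Atest = A(randperm(numel(A)));   % random reordering of A

    % Loop solution, inlined: remove one occurrence per element of B
    C = Atest;
    for x = 1:numel(B)
        C(find(C == B(x), 1, 'first')) = [];
    end

    % Order is irrelevant for a multiset, so compare sorted results
    assert(isequal(sort(C), Ctruth));
end
```

Any other candidate can be dropped in where the inner loop sits, as long as it takes Atest and B and leaves its result in C.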

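You can use the second output of ismember to find the indices at which the elements of B occur in A, and diff to detect repeated values in B so that each repeat is mapped to the next index.

This answer assumes that B is already sorted. If that is not the case, sort B before running the code below. As written, the index fix-up also assumes that a value repeated in B sits in adjacent positions in A and is repeated at most twice in B.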

For the first example:

A = [2, 4, 6, 4, 3, 3, 1, 5, 5, 5];
B = [2, 3, 5, 5];
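% B = sort(B);                 % uncomment if B is not sorted
[~,col] = ismember(B,A);       % index in A of each element of B
indx = find(diff(col)==0);     % positions where B repeats the previous value
col(indx+1) = col(indx)+1;     % send the repeat to the next position in A
A(col) = [];                   % delete the matched elements
C = A;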

>>C

4     6     4     3     1     5

For the second example:

A = [2, 4, 6, 4, 3, 3, 1, 5, 5, 5];
B = [2, 4, 5, 5];
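% B = sort(B);                 % uncomment if B is not sorted
[~,col] = ismember(B,A);       % index in A of each element of B
indx = find(diff(col)==0);     % positions where B repeats the previous value
col(indx+1) = col(indx)+1;     % send the repeat to the next position in A
A(col) = [];                   % delete the matched elements
C = A;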
>>C

6     4     3     3     1     5

Strongly inspired by Matt, but on my machine 40% faster:

function A = multiDiff(A,B)
for j = 1:numel(B)
    for i = 1:numel(A)
        if A(i) == B(j)
            A(i) = [];
            break;
        end
    end
end
end