可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I am finding that a lot of time spent in my matlab function is in this code:
intersect(freq_bins, our_bins);
Both can be rather large vectors, and are comprised of only integers. I just need to know which integers are in both. This is truly the primitive purpose of intersect(), so I suspect that the answer is: it doesn't get any better. But maybe someone has some suggestions.
回答1:
intersect calls ismember. In your case, you don't need all the complicated checks that intersect does, so you can save some overhead and call ismember directly (note: I made sure to call both functions before timing them):
a = randi(1000,100,1); b = randi(1000,100,1); >> tic,intersect(a,b),toc ans = 76 338 490 548 550 801 914 930 Elapsed time is 0.027104 seconds. >> tic,a(ismember(a,b)),toc ans = 914 801 490 548 930 550 76 338 Elapsed time is 0.000613 seconds.
You can make this even faster by calling ismembc, the function that does the actual testing, directly. Note that ismembc requires sorted arrays (so you can drop the sort if your input is sorted already!)
tic,a=sort(a);b=sort(b);a(ismembc(a,b)),toc ans = 76 338 490 548 550 801 914 930 Elapsed time is 0.000473 seconds.
回答2:
If you can assume that your inputs contain sorted lists of unique integers, then you can do this in linear time with a very simple algorithm:
function c = intersect_sorted(a,b) ia = 1; na = length(a); ib = 1; nb = length(b); ic = 0; cn = min(na,nb); c = zeros(1,cn); while (ia <= na && ib <= nb) if (a(ia) > b(ib)) ib = ib + 1; elseif a(ia) < b(ib) ia = ia + 1; else % a(ia) == b(ib) ic = ic + 1; c(ic) = a(ia); ib = ib + 1; ia = ia + 1; end end c = c(1:ic); end
The max runtime for lists of length n and m will be O(n+m).
>>a = unique(randi(1000,100,1)); >>b = unique(randi(1000,100,1)); >>tic;for i = 1:10000, intersect(a,b); end,toc Elapsed time is 1.224514 seconds. >> tic;for i = 1:10000, intersect_sorted(a,b); end,toc Elapsed time is 0.289075 seconds.