How to remove cell elements that are a subset of a larger element? (Matlab)

房东的猫 提交于 2020-05-15 06:20:26

问题


For example, if I have a cell with arrays listed inside:

C = {[1,2,3,4], [3,4], [2], [4,5,6], [4,5], [7]}

I want to output:

D = {[1,2,3,4], [4,5,6], [7]}

What is the most efficient way to remove cell elements that are already included/subset in another larger element?

My existing algorithm loops through each element, compares it to each element in the new cell list and updates the new cell list accordingly but it is extremely slow and inefficient (my original cell contains >200 array elements).


回答1:


Here is an approach using ,very fast, matrix multiplication. You can convert C to a ,sparse, binary matrix b that each element of C is related to each row of the matrix. So columns [1 2 3 4] of the first row and columns [3 4] of the second row are set to 1 and so on. Multiplying the matrix by its transpose we can find the number of corresponding elements between all pairs of rows. using that information we can find cell elements that are subset of others.

C = {[1,2,3,4], [3,4], [2], [4,5,6], [4,5], [7]};
n = cellfun(@numel,C);      % find length of each element.
v = repelem(1:numel(C),n);  % generate indices for rows of the binary matrix
[~,~,u] = unique([C{:}]);   % generate indices for rows of the binary matrix
b = accumarray([v(:),u(:)],ones(size(v)),[],@max,[],true); % generate the binary matrix
s = b * b.';                % multiply by its transpose
s(1:size(s,1)+1:end) = 0;   % set diagonal elements to 0(we do not need self similarity)
result = C(max(s)<n)        % remove an element if it is a subset and preserve others


来源:https://stackoverflow.com/questions/46859164/how-to-remove-cell-elements-that-are-a-subset-of-a-larger-element-matlab

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!