Find elements meeting any of a number of criteria

花落未央 2020-12-19 05:27

I am trying to find the indices of elements in a vector that correspond to another vector, preferably without using loops. For example, my input might be:

DJ         


        
4 Answers
  • 2020-12-19 05:47

    Divakar's answer is the way to go. But in case you want to do it more manually:

    [~, Output] = max(bsxfun(@eq, DJiSet(:).', JiSet(:)), [], 1);
    

    This finds the first occurrence if there is more than one.

    If the values in DJiSet were not guaranteed to be present in JiSet, you could use a small modification:

    [val, Output] = max(bsxfun(@eq, DJiSet(:).', JiSet(:)), [], 1);
    Output(~val) = 0; % 0 indicates "not found"
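
    As a small worked example of this "not found" variant (the sample values are made up for illustration; only the names DJiSet and JiSet come from the question):

    ```matlab
    JiSet  = [10 20 30 40 20];   % vector to search in (assumed sample data)
    DJiSet = [20 40 50];         % values to look up (50 is deliberately absent)

    % Rows correspond to elements of JiSet, columns to elements of DJiSet
    eqMat = bsxfun(@eq, DJiSet(:).', JiSet(:));

    % max along dimension 1 returns, per column, the first row holding a match
    [val, Output] = max(eqMat, [], 1);
    Output(~val) = 0;            % columns with no match get 0

    disp(Output)                 % 2 4 0
    ```

    The duplicate 20 at position 5 is ignored because max reports the first maximum in each column.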
    
  • 2020-12-19 05:52

    Approach #1

    You can avoid find by reversing the places of DJiSet and JiSet inside ismember and then use the second output that gives us the matching indices -

    [~,out] = ismember(DJiSet,JiSet)
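
    A minimal sketch of what the two outputs of ismember look like, using made-up sample vectors (only the variable names are taken from the question):

    ```matlab
    JiSet  = [10 20 30 40 20];   % vector to search in (assumed sample data)
    DJiSet = [20 40 50];         % values to look up (50 is deliberately absent)

    % found(k) is true if DJiSet(k) occurs in JiSet;
    % out(k) is the lowest matching index in JiSet, or 0 if there is none
    [found, out] = ismember(DJiSet, JiSet);

    disp(found)   % 1 1 0
    disp(out)     % 2 4 0
    ```

    Because missing values are reported as 0, no separate "not found" handling is needed here.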
    

    Approach #2

    A loop-based approach that exploits the specific conditions set in the question could also be tried, though I am not sure it will be more efficient -

    intv_idx = zeros(1,numel(DJiSet));
    intv_idx(1) = find(JiSet==DJiSet(1),1);
    start = intv_idx(1)+1;
    for k = 2:numel(DJiSet)
        idx = find(JiSet(start:end)==DJiSet(k),1);
        start = idx+start;
        intv_idx(k) = idx;
    end
    out = cumsum(intv_idx);
    
  • 2020-12-19 05:55

    Perhaps you could try intersect? It is supposed to be a lot faster:

    [Intersect,indDJiSet,indJiSet] = intersect(DJiSet,JiSet)
    

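    The ordering of the inputs does not matter: as long as an element exists in both lists, indDJiSet and indJiSet give its index in DJiSet and JiSet respectively. Note that intersect returns the common values in sorted order, so the index vectors follow that sorted order rather than the original order of DJiSet.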
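
    A short sketch of how the index outputs line up, with made-up sample vectors (only the variable names come from the question). The last step, which maps the result back to the order of DJiSet, is an addition for illustration and not part of the original suggestion:

    ```matlab
    JiSet  = [10 20 30 40];      % vector to search in (assumed sample data)
    DJiSet = [40 20];            % values to look up, deliberately unsorted

    % common is sorted; common = DJiSet(indDJiSet) = JiSet(indJiSet)
    [common, indDJiSet, indJiSet] = intersect(DJiSet, JiSet);

    disp(common)                 % 20 40

    % Rearrange the JiSet indices so they follow the order of DJiSet
    out = zeros(1, numel(DJiSet));
    out(indDJiSet) = indJiSet;   % entries left at 0 would mean "not found"

    disp(out)                    % 4 2
    ```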

  • 2020-12-19 06:02

    For small datasets, it seems my original approach was faster than both the ismember solution proposed by Divakar and the intersect solution proposed by qmeeeeeee, but all three are beaten by Luis Mendo's solution using good old bsxfun. The code below times each approach:

    function somescript()
    
    IsmemberTime = timeit(@membersol)
    IntersectTime = timeit(@intersectsol)
    FindTime = timeit(@naivesol)
    BsxTime = timeit(@bsxfunsol)
    
        function membersol()
            rng(1)
            set = randi(30,[1000 15]);             % generate 1000 vectors of length 15, containing random integers
            for i=1:1000
                [~,out] = ismember(set(i,1:5),set(i,6:end));      % first 5 random integers are the values to be found in the remainder of the vector
            end
    
        end
    
        function intersectsol()
            rng(1)
            set = randi(30,[1000 15]);
            for i=1:1000
                [~,~,Output] = intersect(set(i,1:5),set(i,6:end));
            end
        end
    
        function naivesol()
            rng(1)
            set = randi(30,[1000 15]);
            for i=1:1000
                Output = find(ismember(set(i,6:end),set(i,1:5)));
            end
        end
    
        function bsxfunsol()
            rng(1)
            set = randi(30,[1000 15]);
            for i=1:1000
                [~, Output] = max(bsxfun(@eq, set(i,1:5), set(i,6:end).'), [], 1);   % rows: remainder of the vector, columns: the 5 values to find
            end
        end
    end
    

    Which on my machine (running R2014b) returns the following timings:

    IsmemberTime =
    
        0.1101
    
    
    IntersectTime =
    
        0.2008
    
    
    FindTime =
    
        0.0698
    
    
    BsxTime =
    
        0.0218
    

    This suggests that, for small data sets at least, using find and ismember on the inverted order of vectors is actually faster than using ismember alone. Since every timing also includes the fixed overhead of generating the test data set, the real differences between the methods are probably even larger than these numbers suggest. More thorough tests can be found in the comments below.
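
    One way to keep the data-generation overhead out of the measurement is to build the test matrix once and pass it to each timed function. The following is only a sketch under that assumption: the names benchSketch, memberLoop and bsxfunLoop are hypothetical, the code would need to be saved as benchSketch.m, and no timings are claimed for it.

    ```matlab
    function t = benchSketch()
        rng(1);
        data = randi(30, [1000 15]);   % generated once, outside the timed code

        % timeit needs a zero-argument handle, so the data is captured here
        t.ismember = timeit(@() memberLoop(data));
        t.bsxfun   = timeit(@() bsxfunLoop(data));
    end

    function out = memberLoop(data)
        % For each row, look up the first 5 values in the remaining 10
        out = zeros(size(data,1), 5);
        for i = 1:size(data,1)
            [~, out(i,:)] = ismember(data(i,1:5), data(i,6:end));
        end
    end

    function out = bsxfunLoop(data)
        % Same lookup via an explicit comparison matrix (10 rows, 5 columns)
        out = zeros(size(data,1), 5);
        for i = 1:size(data,1)
            [val, idx] = max(bsxfun(@eq, data(i,1:5), data(i,6:end).'), [], 1);
            idx(~val) = 0;             % 0 marks values that were not found
            out(i,:) = idx;
        end
    end
    ```

    Both helper functions return their results, which lets timeit call them with one output and also makes it possible to check that the two methods agree before comparing their speed.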
