Faster version of find for sorted vectors (MATLAB)

心在旅途 2020-11-27 17:51

I have code of the following kind in MATLAB:

indices = find([1 2 2 3 3 3 4 5 6 7 7] == 3)

This returns 4,5,6 - the indices of elements in t

5 answers
  •  遥遥无期
    2020-11-27 18:47

    ismember will give you all the indexes if you look at the first output:

    >> x = [1 2 2 3 3 3 4 5 6 7 7];
    >> [tf,loc]=ismember(x,3);
    >> inds = find(tf)
    

    inds =

     4     5     6
    

    You just need to use the right order of inputs.

    Note that there is a helper function used by ismember that you can call directly:

    % ISMEMBC  - S must be sorted - Returns logical vector indicating which 
    % elements of A occur in S
    
    tf = ismembc(x,3);
    inds = find(tf);
    

    Calling ismembc directly saves computation time, because ismember calls issorted first and ismembc skips that check.
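
    To see whether skipping the issorted check matters on your machine, a rough timing sketch along these lines may help. The vector size, the target value and the variable names are arbitrary choices for illustration. Because ismembc is undocumented, it may be missing or behave differently in your release:

    ```matlab
    % Illustrative data: a large sorted vector with many repeated values.
    x = sort(randi(1000, 1, 1e6));
    target = 500;

    % All three approaches should return the same indices.
    indsFind     = find(x == target);
    indsIsmember = find(ismember(x, target));
    indsIsmembc  = find(ismembc(x, target));   % undocumented helper
    assert(isequal(indsFind, indsIsmember, indsIsmembc));

    % timeit calls each handle repeatedly and returns a typical run time in seconds.
    tFind     = timeit(@() find(x == target));
    tIsmember = timeit(@() find(ismember(x, target)));
    tIsmembc  = timeit(@() find(ismembc(x, target)));

    fprintf('find(x==t): %.3g s, ismember: %.3g s, ismembc: %.3g s\n', ...
            tFind, tIsmember, tIsmembc);
    ```

    The absolute numbers depend on the MATLAB release and on the data, so treat them as a comparison only.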

    Note that newer versions of MATLAB have a builtin, called by builtin('_ismemberoneoutput',a,b), with the same functionality.


    Since the above applications of ismember, etc. are somewhat backwards (they search for each element of x in the second argument rather than the other way around), the code is much slower than necessary. As the OP points out, it is unfortunate that [~,loc]=ismember(3,x) only provides the location of the first occurrence of 3 in x, rather than all of them. However, if you have a recent version of MATLAB (R2012b+, I think), you can use yet more undocumented builtin functions to get the first and last indexes! These are ismembc2 and builtin('_ismemberfirst',searchfor,x):

    firstInd = builtin('_ismemberfirst',searchfor,x);  % find first occurrence
    lastInd = ismembc2(searchfor,x);                   % find last occurrence
    % lastInd = ismembc2(searchfor,x(firstInd:end))+firstInd-1; % slower
    inds = firstInd:lastInd;
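
    If you would rather not depend on undocumented builtins, the same first/last idea can be written as two plain binary searches over the sorted vector. This is a minimal sketch using only documented functions, not the code from the other answers. The variable names are illustrative, and a loop written in MATLAB will not necessarily beat the builtins:

    ```matlab
    x = [1 2 2 3 3 3 4 5 6 7 7];   % must be sorted in ascending order
    target = 3;

    % Lower bound: smallest index whose element is >= target.
    lo = 1;
    hi = numel(x) + 1;
    while lo < hi
        mid = floor((lo + hi) / 2);
        if x(mid) < target
            lo = mid + 1;
        else
            hi = mid;
        end
    end
    firstInd = lo;

    % Upper bound: smallest index whose element is > target.
    lo = firstInd;
    hi = numel(x) + 1;
    while lo < hi
        mid = floor((lo + hi) / 2);
        if x(mid) <= target
            lo = mid + 1;
        else
            hi = mid;
        end
    end
    lastInd = lo - 1;

    % Contiguous run of matches; empty when target does not occur in x.
    inds = firstInd:lastInd;
    ```

    Each search takes on the order of log2(numel(x)) comparisons, and when the value is absent inds comes out empty.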
    

    Still slower than Daniel R.'s great MATLAB code, but there it is (rntmX added to randomatlabuser's benchmark) just for fun:

    mean([rntm1 rntm2 rntm3 rntmX])    
    ans =
       0.559204323050486   0.263756852283128   0.000017989974213   0.000153682125682
    

    Here are the bits of documentation for these functions inside ismember.m:

    % ISMEMBC2 - S must be sorted - Returns a vector of the locations of
    % the elements of A occurring in S.  If multiple instances occur,
    % the last occurrence is returned
    
    % ISMEMBERFIRST(A,B) - B must be sorted - Returns a vector of the
    % locations of the elements of A occurring in B.  If multiple
    % instances occur, the first occurence is returned.
    

    There is actually a reference to an ISMEMBERLAST builtin, but it doesn't seem to exist (yet?).
