Extract separate non-zero blocks from array

前端 未结 4 933
长发绾君心
长发绾君心 2020-12-30 13:05

having an array like this for example:

[1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1]

What\'s the fastest way in Pytho

相关标签:
4条回答
  • 2020-12-30 13:22

    A trivial change to my answer at Finding the consecutive zeros in a numpy array gives the function find_runs:

    def find_runs(value, a):
        # Create an array that is 1 where a is `value`, and pad each end with an extra 0.
        isvalue = np.concatenate(([0], np.equal(a, value).view(np.int8), [0]))
        absdiff = np.abs(np.diff(isvalue))
        # Runs start and end where absdiff is 1.
        ranges = np.where(absdiff == 1)[0].reshape(-1, 2)
        return ranges
    

    For example,

    In [43]: x
    Out[43]: array([1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1])
    
    In [44]: find_runs(1, x)
    Out[44]: 
    array([[ 0,  4],
           [ 9, 12],
           [14, 16],
           [20, 22]])
    
    In [45]: [range(*run) for run in find_runs(1, x)]
    Out[45]: [[0, 1, 2, 3], [9, 10, 11], [14, 15], [20, 21]]
    

    If the value 1 in your example was not representative, and you really want runs of any non-zero values (as suggested by the text of the question), you can change np.equal(a, value) to (a != 0) and change the arguments and comments appropriately. E.g.

    def find_nonzero_runs(a):
        # Create an array that is 1 where a is nonzero, and pad each end with an extra 0.
        isnonzero = np.concatenate(([0], (np.asarray(a) != 0).view(np.int8), [0]))
        absdiff = np.abs(np.diff(isnonzero))
        # Runs start and end where absdiff is 1.
        ranges = np.where(absdiff == 1)[0].reshape(-1, 2)
        return ranges
    

    For example,

    In [63]: y
    Out[63]: 
    array([-1,  2, 99, 99,  0,  0,  0,  0,  0, 12, 13, 14,  0,  0,  1,  1,  0,
            0,  0,  0, 42, 42])
    
    In [64]: find_nonzero_runs(y)
    Out[64]: 
    array([[ 0,  4],
           [ 9, 12],
           [14, 16],
           [20, 22]])
    
    0 讨论(0)
  • 2020-12-30 13:29

    You can use np.split, once you know the interval of non-zeros' lengths and the corresponding indices in A. Assuming A as the input array, the implementation would look something like this -

    # Append A on either sides with zeros
    A_ext = np.diff(np.hstack(([0],A,[0])))
    
    # Find interval of non-zeros lengths
    interval_lens = np.where(A_ext==-1)[0] - np.where(A_ext==1)[0]
    
    # Indices of non-zeros places in A
    idx = np.arange(A.size)[A!=0]
    
    # Finally split indices based on the interval lengths
    out = np.split(idx,interval_lens.cumsum())[:-1]
    

    Sample input, output -

    In [53]: A
    Out[53]: array([1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1])
    
    In [54]: out
    Out[54]: [array([0, 1, 2, 3]), array([ 9, 10, 11]), array([14, 15]), array([20, 21])]
    
    0 讨论(0)
  • 2020-12-30 13:30

    Have a look at scipy.ndimage.measurements.label:

    import numpy as np
    from scipy.ndimage.measurements import label
    
    x = np.asarray([1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1])
    labelled, numfeats = label(x)
    indices = [np.nonzero(labelled == k) for k in np.unique(labelled)[1:]]
    

    indices contains exactly what you asked for. Note that, depending on your ultimate goal, labelled might also give you useful (extra) information.

    0 讨论(0)
  • 2020-12-30 13:37
    >>> L = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1]
    >>> import itertools
    >>> import operator
    >>> [[i for i,value in it] for key,it in itertools.groupby(enumerate(L), key=operator.itemgetter(1)) if key != 0]
    
    [[0, 1, 2, 3], [9, 10, 11], [14, 15], [20, 21]]
    
    0 讨论(0)
提交回复
热议问题