Find large number of consecutive values fulfilling condition in a numpy array

悲&欢浪女 2020-12-05 00:53

I have some audio data loaded in a numpy array and I wish to segment the data by finding silent parts, i.e. parts where the audio amplitude is below a certain threshold over

8 answers
  •  -上瘾入骨i
    2020-12-05 01:48

    Here's a numpy-based solution.

    I think (?) it should be faster than the other options. Hopefully it's fairly clear.

    However, it does require twice as much memory as the various generator-based solutions. As long as you can hold a single temporary copy of your data in memory (for the diff), and a boolean array of the same length as your data (1 byte per element), it should be pretty efficient...

    import numpy as np
    
    def main():
        # Generate some random data
        x = np.cumsum(np.random.random(1000) - 0.5)
        condition = np.abs(x) < 1
    
        # Print the start and stop indices of each region where the absolute
        # values of x are below 1, and the min and max of each of these regions
        for start, stop in contiguous_regions(condition):
            segment = x[start:stop]
            print(start, stop)
            print(segment.min(), segment.max())
    
    def contiguous_regions(condition):
        """Finds contiguous True regions of the boolean array "condition". Returns
        a 2D array where the first column is the start index of the region and the
        second column is the end index (exclusive, so x[start:stop] is the region)."""
    
        # Find the indices of changes in "condition"
        d = np.diff(condition)
        idx, = d.nonzero() 
    
        # We need to start things after the change in "condition". Therefore, 
        # we'll shift the index by 1 to the right.
        idx += 1
    
        if condition[0]:
            # If the start of condition is True prepend a 0
            idx = np.r_[0, idx]
    
        if condition[-1]:
            # If the end of condition is True, append the length of the array
            idx = np.r_[idx, condition.size] # Edit
    
        # Reshape the result into two columns
        idx = idx.reshape(-1, 2)
        return idx
    
    main()
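
To tie this back to the original question (long silent stretches in audio), here is a minimal sketch of one way to keep only the runs that last at least some number of samples. It is a variant, not the code above: it pads the condition with False on both sides instead of special-casing the first and last element. The name `silent_regions` and the `threshold` and `min_length` parameters are made up for illustration, and the sample values are toy data.

```python
import numpy as np


def contiguous_regions(condition):
    """Return an (n, 2) array of [start, stop) index pairs, one per True run."""
    condition = np.asarray(condition, dtype=bool)
    # Pad with False on both sides so every run has a rising and a falling
    # edge, even when it touches the start or end of the array.
    padded = np.concatenate(([False], condition, [False]))
    # An edge is any position where neighbouring values differ.
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    # Edges alternate rising, falling, rising, ... so pair them up.
    return edges.reshape(-1, 2)


def silent_regions(samples, threshold, min_length):
    """Hypothetical helper: runs of |samples| < threshold that are at least
    min_length samples long."""
    regions = contiguous_regions(np.abs(samples) < threshold)
    lengths = regions[:, 1] - regions[:, 0]
    return regions[lengths >= min_length]


# Toy data: a 2-sample quiet run at [1, 3) and a 4-sample one at [4, 8)
samples = np.array([0.9, 0.1, 0.0, 0.8, 0.05, 0.02, 0.01, 0.03, 0.7])
print(silent_regions(samples, threshold=0.2, min_length=3))
```

With `min_length=3` only the second run survives. For real audio, `min_length` would presumably be chosen as the desired duration in seconds times the sample rate.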
    
