I have some audio data loaded in a numpy array and I wish to segment the data by finding silent parts, i.e. parts where the audio amplitude is below a certain threshold over
I know I'm late to the party, but another way to do this is with 1d convolutions:
np.convolve(sig > threshold, np.ones((cons_samples)), 'same') == cons_samples
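where cons_samples is the number of consecutive samples you require above the threshold. Note that with mode 'same', each True in the result marks the center of a window of cons_samples consecutive hits, not the whole run.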
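To make the behaviour of that one-liner concrete, here is a small sketch on made-up data. The array `sig` and the values of `threshold` and `cons_samples` are assumptions chosen for illustration; for silence detection you would presumably flip the comparison to something like `np.abs(sig) < threshold`.

```python
import numpy as np

# Toy signal (illustrative values only): one run of three loud samples
# and one isolated loud sample.
sig = np.array([0, 5, 5, 5, 0, 5, 0])
threshold = 1
cons_samples = 3  # odd, so every window has a well-defined center

# Cast the boolean mask to int so the convolution counts hits per window.
above = (sig > threshold).astype(int)
window_counts = np.convolve(above, np.ones(cons_samples), "same")

# A window is "full" when every sample in it was above the threshold.
hits = window_counts == cons_samples
centers = np.flatnonzero(hits)
print(centers)
```

Only the run of three qualifies, and it is reported by the index of its middle sample; the isolated loud sample is ignored.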
@joe-kington I got about a 20%-25% speed improvement over the np.diff / np.nonzero solution by using argmax instead (see the code below; condition is a boolean array):
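def contiguous_regions(condition):
    idx = []
    i = 0
    while i < len(condition):
        # First True at or after i (argmax returns 0 if there is none)
        x1 = i + condition[i:].argmax()
        try:
            # First False at or after x1 (argmin returns 0 if there is none)
            x2 = x1 + condition[x1:].argmin()
        except ValueError:
            x2 = x1 + 1
        if x1 == x2:
            if condition[x1]:
                # Everything from x1 onwards is True
                x2 = len(condition)
            else:
                # No True values left
                break
        idx.append([x1, x2])
        i = x2
    return idx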
Of course, your mileage may vary depending on your data.
Besides, I'm not entirely sure, but I guess numpy may optimize argmin/argmax over boolean arrays to stop searching at the first True/False occurrence. That might explain it.
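The corner-case handling in that loop relies on how argmax and argmin behave on boolean arrays. A minimal sketch of those semantics follows; the array names are made up for illustration.

```python
import numpy as np

cond = np.array([False, False, True, True, False])

# argmax on a boolean array gives the index of the first True ...
first_true = int(cond.argmax())
# ... and argmin on the remainder gives the offset of the first False.
first_false_after = first_true + int(cond[first_true:].argmin())

# When the value being searched for is absent, both return 0. This is
# why the loop has to inspect condition[x1] when x1 == x2.
no_true = int(np.zeros(4, dtype=bool).argmax())
no_false = int(np.ones(4, dtype=bool).argmin())

print(first_true, first_false_after, no_true, no_false)
```

Because a result of 0 is ambiguous (either "found at position 0" or "not found"), the function above checks the element itself to tell the two cases apart.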
Another way to do this quickly and concisely:
import pylab as pl
v=[0,0,1,1,0,0,1,1,1,1,1,0,1,0,1,1,0,0,0,0,0,1,0,0]
vd = pl.diff(v)
#vd[i]==1 for 0->1 crossing; vd[i]==-1 for 1->0 crossing
#need to add +1 to indexes as pl.diff shifts to left by 1
i1=pl.array([i for i in range(len(vd)) if vd[i]==1])+1
i2=pl.array([i for i in range(len(vd)) if vd[i]==-1])+1
#corner cases for the first and the last element
if v[0]==1:
    i1=pl.hstack((0,i1))
if v[-1]==1:
    i2=pl.hstack((i2,len(v)))
Now i1 contains the start index and i2 the (exclusive) end index of each run of 1s.
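The same idea can be written without the list comprehensions or the explicit corner cases. This is only a sketch using plain NumPy instead of pylab: padding the sequence with a zero on each side is my own assumption about how to fold the first/last element handling into the diff.

```python
import numpy as np

v = np.array([0,0,1,1,0,0,1,1,1,1,1,0,1,0,1,1,0,0,0,0,0,1,0,0])

# Pad with a 0 on both sides so a run touching either end still
# produces a 0->1 and a 1->0 transition.
padded = np.concatenate(([0], v, [0]))
vd = np.diff(padded)

# Because of the leading pad, vd[k] == v[k] - v[k-1], so no +1 shift
# is needed: +1 marks a run start, -1 marks its exclusive end.
i1 = np.flatnonzero(vd == 1)
i2 = np.flatnonzero(vd == -1)

print(i1)  # [ 2  6 12 14 21]
print(i2)  # [ 4 11 13 16 22]
```

Each pair (i1[k], i2[k]) can be used directly as a slice, e.g. v[i1[k]:i2[k]].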
Slightly sloppy, but simple and fast-ish, if you don't mind using scipy:
from scipy.ndimage import gaussian_filter
sigma = 3
threshold = 1
above_threshold = gaussian_filter(data, sigma=sigma) > threshold
The idea is that quiet portions of the data will smooth down to low amplitude, and loud regions won't. Tune 'sigma' to affect how long a 'quiet' region must be; tune 'threshold' to affect how quiet it must be. This slows down for large sigma, at which point using FFT-based smoothing might be faster.
This has the added benefit that single 'hot pixels' won't disrupt your silence-finding, so you're a little less sensitive to certain types of noise.
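One caveat worth illustrating: if the samples are signed (as raw audio usually is), smoothing them directly averages a loud oscillation towards zero, so it may help to smooth the absolute value instead. The sketch below is an assumption-laden illustration, not the code above: it builds a Gaussian kernel by hand with NumPy as a stand-in for scipy's gaussian_filter, and the helper name `gaussian_smooth`, the synthetic tone, and the parameter values are all made up.

```python
import numpy as np

def gaussian_smooth(x, sigma):
    """Smooth x with a normalized Gaussian kernel truncated at 4 sigma.
    (Hypothetical helper standing in for scipy.ndimage.gaussian_filter.)"""
    radius = int(4 * sigma)
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (t / sigma) ** 2)
    kernel /= kernel.sum()
    return np.convolve(x, kernel, mode="same")

# Synthetic signal: a 50 Hz tone at 1000 samples/s with one silent second.
rate = 1000
t = np.arange(3 * rate) / rate
data = 5.0 * np.sin(2 * np.pi * 50 * t)
data[1000:2000] = 0.0

sigma = 20
threshold = 1.0

raw = gaussian_smooth(data, sigma)               # signed samples cancel out
envelope = gaussian_smooth(np.abs(data), sigma)  # rectified samples do not
above_threshold = envelope > threshold
```

Here the loud sections stay above the threshold only in the rectified version, while the smoothed signed signal is close to zero everywhere.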
I haven't tested this, but it should be close to what you are looking for. Slightly more lines of code, but it should be more efficient and readable, and it doesn't abuse regular expressions :-)
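# SILENCE_THRESHOLD and MIN_SILENCE are constants you need to define yourself
def find_silent(samples):
    num_silent = 0
    start = 0
    for index in range(0, len(samples)):
        if abs(samples[index]) < SILENCE_THRESHOLD:
            if num_silent == 0:
                # A new silent run begins here
                start = index
            num_silent += 1
        else:
            # A loud sample ends the run; report it if it was long enough
            if num_silent > MIN_SILENCE:
                yield samples[start:index]
            num_silent = 0
    # Handle a silent run that extends to the end of the data
    if num_silent > MIN_SILENCE:
        yield samples[start:]

for match in find_silent(samples):
    pass  # code goes here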
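For a quick check of the logic, here is a variant of the generator that can be run as-is. It is a sketch under a few assumptions of mine: the thresholds are passed as arguments rather than read from globals, it yields (start, stop) index pairs instead of slices, and the sample values are made up.

```python
def find_silent_spans(samples, silence_threshold, min_silence):
    """Yield (start, stop) index pairs for runs of quiet samples that are
    strictly longer than min_silence (hypothetical variant of find_silent)."""
    num_silent = 0
    start = 0
    for index, sample in enumerate(samples):
        if abs(sample) < silence_threshold:
            if num_silent == 0:
                start = index
            num_silent += 1
        else:
            if num_silent > min_silence:
                yield start, index
            num_silent = 0
    # A silent run may extend to the end of the data.
    if num_silent > min_silence:
        yield start, len(samples)

samples = [9, 0, 0, 0, 9, 0, 9, 0, 0, 0, 0]
spans = list(find_silent_spans(samples, silence_threshold=1, min_silence=2))
print(spans)  # [(1, 4), (7, 11)]
```

Note that the comparison is strict, so a run of exactly min_silence samples is not reported, and the single quiet sample at index 5 is skipped.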
Here's a numpy-based solution.
I think (?) it should be faster than the other options. Hopefully it's fairly clear.
However, it does require twice as much memory as the various generator-based solutions. As long as you can hold a single temporary copy of your data in memory (for the diff), and a boolean array of the same length as your data (numpy stores each bool in one byte), it should be pretty efficient...
import numpy as np

def main():
    # Generate some random data
    x = np.cumsum(np.random.random(1000) - 0.5)
    condition = np.abs(x) < 1

    # Print the start and stop indices of each region where the absolute
    # values of x are below 1, and the min and max of each of these regions
    for start, stop in contiguous_regions(condition):
        segment = x[start:stop]
        print(start, stop)
        print(segment.min(), segment.max())

def contiguous_regions(condition):
    """Finds contiguous True regions of the boolean array "condition". Returns
    a 2D array where the first column is the start index of the region and the
    second column is the (exclusive) end index."""

    # Find the indices of changes in "condition"
    d = np.diff(condition)
    idx, = d.nonzero()

    # We need to start things after the change in "condition". Therefore,
    # we'll shift the index by 1 to the right.
    idx += 1

    if condition[0]:
        # If the start of condition is True prepend a 0
        idx = np.r_[0, idx]

    if condition[-1]:
        # If the end of condition is True, append the length of the array
        idx = np.r_[idx, condition.size]

    # Reshape the result into two columns
    idx.shape = (-1,2)
    return idx

main()
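To tie this back to the original question (silence that lasts for some minimum duration), the regions can be filtered by length. This is a sketch only: `silent_regions`, `min_samples`, and the sample values are assumptions of mine, and the change detection compares neighbouring elements directly rather than calling np.diff.

```python
import numpy as np

def contiguous_regions(condition):
    """Return an (n, 2) array of [start, stop) indices of True runs."""
    condition = np.asarray(condition, dtype=bool)
    # Positions where the value differs from its predecessor.
    idx = np.flatnonzero(condition[1:] != condition[:-1]) + 1
    if condition[0]:
        idx = np.r_[0, idx]
    if condition[-1]:
        idx = np.r_[idx, condition.size]
    return idx.reshape(-1, 2)

def silent_regions(signal, threshold, min_samples):
    """Keep only quiet regions at least min_samples long (hypothetical helper)."""
    regions = contiguous_regions(np.abs(signal) < threshold)
    lengths = regions[:, 1] - regions[:, 0]
    return regions[lengths >= min_samples]

signal = np.array([0.9, 0.0, 0.1, 0.0, 0.8, 0.0, 0.7, 0.0, 0.0, 0.0, 0.0])
regions = silent_regions(signal, threshold=0.5, min_samples=3)
print(regions.tolist())  # [[1, 4], [7, 11]]
```

The one-sample quiet gap at index 5 is detected as a region but dropped by the length filter.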