I have more than 200 MP3 files and I need to split each one of them using silence detection. I tried Audacity and WavePad, but they do not have batch processing and it's ve
I tested all of these solutions and none of them worked for me, so I wrote one that does and is relatively fast.
Prerequisites:
- ffmpeg
- numpy (although it doesn't need much from numpy, and a solution without numpy would probably be relatively easy to write and would further increase speed)

Mode of operation, rationale:
- ffmpeg converts the input to lossless 16-bit 22 kHz PCM and passes it back via subprocess.Popen, with the advantage that ffmpeg does so very fast and in small chunks which do not occupy much memory.
- Back in Python, the numpy arrays of the last and the second-to-last buffer are concatenated and checked against the given threshold. If they do not surpass it, there is a block of silence, and the script (naively, I admit) simply counts the time for which there is "silence". If that time is at least as long as the given minimum silence duration, (again naively) the middle of the current interval is taken as the splitting moment.
- The script finally writes a batch file that tells ffmpeg to take the segments bounded by these "silences" and save them into separate files.

The little code:
import subprocess as sp
import sys
import numpy

FFMPEG_BIN = "ffmpeg.exe"

print('ASplit.py <src> <min. silence duration in seconds> <threshold as a fraction of 65535>')

src = sys.argv[1]
dur = float(sys.argv[2])               # length of one analysis chunk, in seconds
thr = int(float(sys.argv[3]) * 65535)  # threshold scaled to sample units

# newline='' keeps the explicit \r\n line endings as written
f = open('%s-out.bat' % src, 'w', newline='')

tmprate = 22050
len2 = dur * tmprate     # samples per chunk
buflen = int(len2) * 2   # bytes per chunk: whole samples * 2 bytes (16 bits)
oarr = numpy.arange(1, dtype='int16')  # just a dummy array for the first chunk

command = [FFMPEG_BIN,
           '-i', src,
           '-f', 's16le',
           '-acodec', 'pcm_s16le',
           '-ar', str(tmprate),  # output sampling rate
           '-ac', '1',           # '1' for mono
           '-']                  # - output to stdout
pipe = sp.Popen(command, stdout=sp.PIPE, bufsize=10**8)

pos = 0    # start time of the current chunk, in seconds
opos = 0   # start time of the segment being collected
part = 0
while True:
    raw = pipe.stdout.read(buflen)
    if not raw:
        break
    arr = numpy.frombuffer(raw, dtype="int16")
    rng = numpy.concatenate([oarr, arr])
    mx = numpy.amax(rng)
    if mx <= thr:
        # the peak in this range does not exceed the threshold value
        trng = (rng <= thr) * 1
        # effectively a mask with all samples <= thr set to 1 and > thr set to 0
        sm = numpy.sum(trng)
        # i.e. simply (naively) check how many 1's there were
        if sm >= len2:
            part += 1
            apos = pos + dur * 0.5
            print(mx, sm, len2, apos)
            f.write('ffmpeg -i "%s" -ss %f -to %f -c copy -y "%s-p%04d.mp3"\r\n' % (src, opos, apos, src, part))
            opos = apos
    pos += dur
    oarr = arr

# the last segment runs from the last split point to the end of the input
part += 1
f.write('ffmpeg -i "%s" -ss %f -to %f -c copy -y "%s-p%04d.mp3"\r\n' % (src, opos, pos, src, part))
f.close()
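
As noted in the prerequisites, numpy is barely needed. Below is a minimal sketch of the same peak test using only the standard library's array module. The function name find_splits and its parameters are hypothetical and not part of the script. It assumes each chunk holds dur seconds of mono s16le audio, leaves out the sample-count check for brevity, and returns the split times instead of writing ffmpeg commands.

```python
import sys
from array import array


def find_splits(chunks, dur, thr):
    """Return split times (in seconds) for an iterable of raw s16le chunks.

    Hypothetical helper: each chunk is assumed to hold `dur` seconds of
    mono 16-bit audio, like the buffers read from ffmpeg in the script.
    """
    splits = []
    prev = array('h')   # previous chunk; empty before the first read
    pos = 0.0           # start time of the current chunk
    for raw in chunks:
        if not raw:
            break
        cur = array('h')
        cur.frombytes(raw)            # interprets bytes in native byte order
        if sys.byteorder == 'big':
            cur.byteswap()            # the data itself is little-endian
        # Same naive test as the script: positive peak over the previous
        # and the current chunk must not exceed the threshold.
        if max(prev + cur) <= thr:
            splits.append(pos + dur * 0.5)
        pos += dur
        prev = cur
    return splits
```

With the script's variables, a call such as find_splits(iter(lambda: pipe.stdout.read(buflen), b''), dur, thr) would feed it the same chunks the loop reads from ffmpeg.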