Efficiently split a large audio file in R

Question

I previously asked a question on SO about splitting an audio file. The answer I got from @Jean V. Adams worked relatively well for small sound objects (one downside: the input was stereo but the output was mono):

library(seewave)

# your audio file (using example file from seewave package)
data(tico)
audio <- tico # this is an S4 class object
# the frequency of your audio file
freq <- 22050
# the length and duration of your audio file
totlen <- length(audio)
totsec <- totlen/freq

# the duration that you want to chop the file into
seglen <- 0.5

# defining the break points
breaks <- unique(c(seq(0, totsec, seglen), totsec))
index <- 1:(length(breaks)-1)
# a list of all the segments
subsamps <- lapply(index, function(i) cutw(audio, f=freq, from=breaks[i], to=breaks[i+1]))

I applied this solution to one of the roughly 300 files I'm preparing for analysis (~150 MB). My computer worked on it for more than 5 hours, and I ended up closing the session before it finished.

Does anyone have any thoughts or solutions to efficiently perform this task of splitting up a large audio file (specifically, an S4 class Wave object) into smaller pieces using R? I'm hoping to cut down drastically on the time it takes to make smaller files out of these larger files, and I'm hoping to use R. However, if I cannot get R to do the task efficiently, I would appreciate suggestions of other tools for the job. The example data above is mono, but my data is in stereo. The example data can be made to be stereo using:

tico@stereo <- TRUE
tico@right <- tico@left

UPDATE

I identified another solution that builds on work from the first solution:

lapply(index, function(i) audio[(breaks[i]*freq):(breaks[i+1]*freq)])
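
One thing to watch with index-based splitting: breaks[i]*freq is 0 for the first segment (for plain vectors R silently drops index 0), and neighbouring segments share their boundary sample. Below is a minimal sketch of 1-based, non-overlapping sample ranges. The helper name segment_bounds is hypothetical (it is not part of seewave or tuneR), and the sample count is only an illustrative number.

```r
# Hypothetical helper (not from seewave or tuneR): 1-based, non-overlapping
# sample ranges for segments lasting `seglen` seconds.
segment_bounds <- function(n_samples, samp_rate, seglen) {
  step <- round(seglen * samp_rate)        # samples per full segment
  from <- seq(1, n_samples, by = step)     # first sample of each segment
  to   <- pmin(from + step - 1, n_samples) # last segment may be shorter
  data.frame(from = from, to = to)
}

# Illustrative numbers: 39578 samples at 22050 Hz, cut into 0.5 s pieces
b <- segment_bounds(n_samples = 39578, samp_rate = 22050, seglen = 0.5)
print(b)

# With a Wave object `audio` in memory, the segments could then be taken as:
# subsamps <- lapply(seq_len(nrow(b)), function(i) audio[b$from[i]:b$to[i]])
```

Each sample then lands in exactly one segment, so the segment lengths add up to the length of the original.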

Comparing the performance of three solutions:

# Solution suggested by @Jean V. Adams
system.time(replicate(100,lapply(index, function(i) cutw(audio, f=freq, from=breaks[i], to=breaks[i+1], output="Wave"))))
user  system elapsed 
1.19    0.00    1.19 
# my modification of the previous solution
system.time(replicate(100,lapply(index, function(i) audio[(breaks[i]*freq):(breaks[i+1]*freq)])))
user  system elapsed 
0.86    0.00    0.85 

# solution suggested by @CarlWitthoft 
audiomod <- audio[(freq*breaks[1]):(freq*breaks[length(breaks)-1])] # remove unequal part at end
system.time(replicate(100,matrix(audiomod@left,ncol=length(breaks))))+
system.time(replicate(100,matrix(audiomod@right,ncol=length(breaks))))
user  system elapsed 
0.25    0.00    0.26

The method using indexing (i.e. [) is faster than cutw (about 1.4x in the timings above). @CarlWitthoft's solution is faster still (roughly 3x faster than indexing); the downside is that it puts the data into a matrix rather than multiple Wave objects, which I will be saving using writeWave. Presumably, converting from the matrix format to separate Wave objects will be relatively trivial once I properly understand how to create this type of S4 object. Is there any further room for improvement?

Answer 1:

The approach I ended up using builds off of the solutions offered by @CarlWitthoft and @JeanV.Adams. It is quite fast compared to the other techniques I was using, and it has allowed me to split a large number of my files in a matter of hours, rather than days.

Here is the whole process, using a small Wave object as an example. My current audio files range up to 150 MB in size, but in the future I may receive much larger files (e.g. sound files covering 12-24 hours of recording), where memory management will become more important:

library(seewave)
library(tuneR)

data(tico)

# force to stereo
tico@stereo <- TRUE
tico@right <- tico@left    
audio <- tico # this is an S4 class object


# the frequency of your audio file
freq <- 22050
# the length and duration of your audio file
totlen <- length(audio)
totsec <- totlen/freq 

# the duration that you want to chop the file into (in seconds)
seglen <- 0.5

# defining the break points
breaks <- unique(c(seq(0, totsec, seglen), totsec))
index <- 1:(length(breaks)-1)

# the split
leftmat<-matrix(audio@left, ncol=(length(breaks)-2), nrow=seglen*freq) 
rightmat<-matrix(audio@right, ncol=(length(breaks)-2), nrow=seglen*freq)
# matrix() warns that the data length does not match the matrix size;
# the extra samples at the end are dropped here and picked up below

# convert to list of Wave objects.
subsamps0409_180629 <- lapply(1:ncol(leftmat), function(x)Wave(left=leftmat[,x],
         right=rightmat[,x], samp.rate=audio@samp.rate, bit=audio@bit)) 


# get the last part of the audio file.  the part that is < seglen
# (+1 so the last sample already in the matrices is not repeated)
lastbitleft <- audio@left[(breaks[length(breaks)-1]*freq + 1):length(audio)]
lastbitright <- audio@right[(breaks[length(breaks)-1]*freq + 1):length(audio)]

# convert and add the last bit to the list of Wave objects
subsamps0409_180629[[length(subsamps0409_180629)+1]] <- 
     Wave(left=lastbitleft, right=lastbitright, samp.rate=audio@samp.rate, bit=audio@bit)

This wasn't part of my original question, but my ultimate goal was to save these new, smaller Wave objects.

# finally, save the Wave objects
setwd("C:/Users/Whatever/Wave_object_folder")

# I had some memory management issues on my computer when doing this
# process with large (~ 130-150 MB) audio files so I used rm() and gc(),
# which seemed to resolve the problems I had with allocating memory.
rm("breaks","audio","freq","index","lastbitleft","lastbitright","leftmat",
  "rightmat","seglen","totlen","totsec")

gc()

# breaks was removed above, so number the files from the list itself
filenames <- paste0("audio_split", seq_along(subsamps0409_180629), ".wav")

# Save the files
sapply(1:length(subsamps0409_180629),
       function(x)writeWave(subsamps0409_180629[[x]], 
       filename=filenames[x]))

The only real downside here is that the output files are pretty big. For example, I put in a 130 MB file and split it into 18 files of approximately 50 MB each. I think this is because my input file is .mp3 (compressed) and the output is .wav (uncompressed).

I posted this answer to my own question to wrap up the problem with the full solution I used, but other answers are appreciated, and I will take the time to look at each one and evaluate what it offers. I am sure there are better ways to accomplish this task, including methods that work better with very large audio files. In solving this problem, I barely scratched the surface of memory management.
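
For much larger recordings, one option is to avoid holding the whole file in memory. This is a rough sketch, not a tested production routine. It assumes the input is already a WAV file (tuneR's readWave can read a sample range via from, to and units, and header = TRUE returns only the header). The function name split_wav_streaming, the output naming scheme, and the reader/writer arguments are hypothetical and exist only to make the sketch easy to try out.

```r
# Sketch: split a WAV file on disk one segment at a time.
# Assumes tuneR is installed and `infile` is a WAV file (not MP3).
# `reader` and `writer` are hypothetical hooks; by default they are
# tuneR::readWave and tuneR::writeWave.
split_wav_streaming <- function(infile, seglen, outdir = ".",
                                reader = tuneR::readWave,
                                writer = tuneR::writeWave) {
  hdr    <- reader(infile, header = TRUE)       # header only, no sample data
  chunk  <- round(seglen * hdr$sample.rate)     # samples per segment
  starts <- seq(1, hdr$samples, by = chunk)     # first sample of each segment
  ends   <- pmin(starts + chunk - 1, hdr$samples)
  outfiles <- file.path(outdir,
                        sprintf("audio_split%03d.wav", seq_along(starts)))
  for (k in seq_along(starts)) {
    # only one segment is held in memory at any time
    seg <- reader(infile, from = starts[k], to = ends[k], units = "samples")
    writer(seg, filename = outfiles[k])
    rm(seg)
  }
  invisible(outfiles)
}

# Possible usage (paths are placeholders):
# split_wav_streaming("long_recording.wav", seglen = 600, outdir = "pieces")
```

Peak memory then depends on the segment length rather than on the length of the recording, at the cost of reopening the input file once per segment.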

Answer 2:

Per Frank's request, here's one possible approach. Extract the vectors of sound data from the audio@left and audio@right slots, then break each into equal-length sections in one step, something like:

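leftsong <- audio@left
# byrow=TRUE so that each row holds one contiguous segment of seglen*freq samples
leftmat <- matrix(leftsong, ncol=seglen*freq, byrow=TRUE)
rightmat <- matrix(audio@right, ncol=seglen*freq, byrow=TRUE)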

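Here I've assumed seglen is the distance (in seconds) between breaks[i] and breaks[i+1], so seglen*freq is the number of samples per segment. New Wave objects can then be created and processed from the matching rows in leftmat and rightmat. If the number of samples is not an exact multiple of seglen*freq, matrix() recycles values to pad the final row and issues a warning, so the leftover samples need to be handled separately.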
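
R fills matrices column by column unless told otherwise, so which dimension holds a segment depends on how the matrix is built. The toy sketch below uses a short integer vector in place of audio@left (the names samples, seg_n, segmat and remainder are made up for illustration) to show a row-per-segment layout with the leftover samples kept apart.

```r
# Toy data standing in for audio@left: 25 samples, 10 samples per segment
samples <- 1:25
seg_n   <- 10                            # plays the role of seglen * freq
n_full  <- length(samples) %/% seg_n     # number of complete segments

# Keep only the samples that fill complete segments, then reshape.
# byrow = TRUE makes each ROW one contiguous segment; no recycling occurs
# because the trimmed vector fits the matrix exactly.
full   <- samples[seq_len(n_full * seg_n)]
segmat <- matrix(full, ncol = seg_n, byrow = TRUE)

# Whatever is left over forms a final, shorter segment (possibly empty)
remainder <- tail(samples, length(samples) - n_full * seg_n)

print(segmat)
print(remainder)
```

Row k of segmat can then be paired with the same row of a matrix built from the right channel to form one stereo segment.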

Source: https://stackoverflow.com/questions/20713513/efficiently-split-a-large-audio-file-in-r

Tags

performance

audio

file-io

split