- Desired Behaviour
- Actual Behaviour
- What I've Tried
- Steps To Reproduce
- Research
I'll give my two cents here, since I looked at a similar question recently. From what I have tested and researched, you can combine the two .mp3 / .wav streams into one, but the resulting file has noticeable issues such as the truncation and glitches you've mentioned.
As far as I can tell, the only way to combine the audio streams correctly is to use a module that is designed to concatenate sound files/data.
The best result I have obtained is to synthesize the audio into separate files, then combine like so:
function combineMp3Files(files, outputFile) {
    const ffmpeg = require("fluent-ffmpeg");
    const combiner = ffmpeg()
        .on("error", err => {
            console.error("An error occurred: " + err.message);
        })
        .on("end", () => {
            console.log('Merge complete');
        });
    // Add in each .mp3 file.
    files.forEach(file => {
        combiner.input(file);
    });
    combiner.mergeToFile(outputFile);
}
This uses the node-fluent-ffmpeg library, which requires installing ffmpeg.
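Since the merge depends on the ffmpeg binary, it may be worth checking that it is reachable before calling combineMp3Files. This is only a minimal sketch, assuming ffmpeg is expected on the PATH; `isFfmpegAvailable` is a hypothetical helper name of my own, not part of fluent-ffmpeg:

```javascript
const { spawnSync } = require("child_process");

// Hypothetical helper: returns true if `<binary> -version` runs and exits with code 0.
function isFfmpegAvailable(binary = "ffmpeg") {
    const result = spawnSync(binary, ["-version"], { stdio: "ignore" });
    // result.error is set when the process could not be spawned (e.g. binary not found).
    return !result.error && result.status === 0;
}

if (!isFfmpegAvailable()) {
    console.error("ffmpeg was not found on the PATH; install it before calling combineMp3Files.");
}
```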
Other than that, I'd suggest you ask IBM support how API callers should combine the synthesized audio (because, as you say, the docs don't seem to cover this), since your use case will be very common.
To create the audio files, I do the following:
// Switching to audio/webm and the V3 voices.. much better output
function synthesizeText(text) {
    const synthesizeParams = {
        text: text,
        accept: 'audio/webm',
        voice: 'en-US_LisaV3Voice'
    };
    return textToSpeech.synthesize(synthesizeParams);
}
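const fs = require("fs");

async function synthesizeTextChunksSeparateFiles(text_chunks) {
    // Synthesize all chunks in parallel; results keep the order of text_chunks.
    const audioArray = await Promise.all(text_chunks.map(synthesizeText));
    console.log(`synthesizeTextChunks: Received ${audioArray.length} result(s), writing to separate files...`);
    // Note: the files are named .mp3 although synthesizeText requests audio/webm.
    audioArray.forEach((audio, index) => {
        audio.pipe(fs.createWriteStream(`audio-${index}.mp3`));
    });
}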
And then combine like so:
combineMp3Files(['audio-0.mp3', 'audio-1.mp3', 'audio-2.mp3', 'audio-3.mp3', 'audio-4.mp3'], 'combined.mp3');
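The hard-coded list above assumes exactly five chunks. As a small sketch, the list could instead be built from the number of chunks; `chunkFileNames` is a hypothetical helper name, and its defaults simply mirror the `audio-${index}.mp3` naming used when writing the files:

```javascript
// Hypothetical helper: builds ['audio-0.mp3', ..., 'audio-(count-1).mp3'] for `count` chunks.
function chunkFileNames(count, prefix = "audio", extension = "mp3") {
    return Array.from({ length: count }, (_, index) => `${prefix}-${index}.${extension}`);
}

// Possible usage, assuming text_chunks is the array that was synthesized:
// combineMp3Files(chunkFileNames(text_chunks.length), 'combined.mp3');
```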
I should point out that I run the synthesis and the merge as two separate steps, so the files are fully written before they are combined (waiting a few hundred milliseconds between the two would also work). It should be easy enough, though, to wait for the individual files to be written and then combine them.
Here's a function that will do this:
async function synthesizeTextChunksThenCombine(text_chunks, outputFile) {
    const audioArray = await Promise.all(text_chunks.map(synthesizeText));
    console.log(`synthesizeTextChunksThenCombine: Received ${audioArray.length} result(s), writing to separate files...`);
    // Write each stream to its own file; each promise resolves with the
    // file name once that file has been closed, or rejects on a stream error.
    const writePromises = audioArray.map((audio, index) => {
        return new Promise((resolve, reject) => {
            const fileName = `audio-${index}.mp3`;
            audio.on('error', reject);
            audio.pipe(fs.createWriteStream(fileName))
                .on('error', reject)
                .on('close', () => resolve(fileName));
        });
    });
    const files = await Promise.all(writePromises);
    console.log('synthesizeTextChunksThenCombine: Separate files: ', files);
    combineMp3Files(files, outputFile);
}
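One caveat: combineMp3Files returns as soon as the merge has started, so a caller of synthesizeTextChunksThenCombine cannot await the finished output file. A possible sketch of a promise-returning variant follows. `combineMp3FilesAsync` is a hypothetical name, and passing the `ffmpeg` factory in as an argument is my own choice (it keeps the sketch independent of how the library is loaded); it relies only on the same `on`, `input` and `mergeToFile` calls used above:

```javascript
// Sketch: promise-returning variant of combineMp3Files.
// `ffmpeg` is expected to be the factory exported by fluent-ffmpeg.
function combineMp3FilesAsync(ffmpeg, files, outputFile) {
    return new Promise((resolve, reject) => {
        const combiner = ffmpeg()
            .on("error", reject)                   // reject if ffmpeg reports an error
            .on("end", () => resolve(outputFile)); // resolve once the merge has finished
        // Add each input file in order.
        files.forEach(file => combiner.input(file));
        combiner.mergeToFile(outputFile);
    });
}

// Possible usage inside an async function:
// await combineMp3FilesAsync(require("fluent-ffmpeg"), files, outputFile);
```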