I'm trying to make a bot for Facebook Messenger using Microsoft's Bot Framework that will do this:
- Get a user's voice message sent via Facebook Messenger
- Convert speech to text
- Do something with it
There's no problem with getting the voice message from Messenger (the URL can be extracted from the message the bot receives), and there's also no problem with converting an audio file to text (using the Bing Speech API or Google's similar API).
However, these APIs require PCM (WAV) files, while Facebook Messenger gives you an MP4 file.
Is there a popular or standard way to convert one format into the other when writing bots?
So far my best idea is to run vlc.exe as a console job on my server and convert the file, but that doesn't sound like the best solution.
I developed a solution that works as follows:
- Receive the voice message from Facebook
- Download the MP4 file to local disk using the link inside Activity.Attachments
- Use MediaToolkit (a wrapper for FFmpeg) to convert MP4/AAC to WAV on the local server
- Send the WAV to Bing Speech API
So the answer to my question is: use MediaToolkit with FFmpeg to convert the file format.
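The download step in the list above could be sketched roughly as follows. This is a minimal sketch, not the code from the linked sample: the class name `VoiceAttachmentDownloader`, the helper `BuildLocalPath`, and the choice of a `.mp4` extension are all hypothetical, and it assumes the attachment URL can be fetched with a plain HTTP GET.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper, not part of the Bot Framework or the linked sample.
public static class VoiceAttachmentDownloader
{
    // One shared HttpClient for all downloads.
    private static readonly HttpClient Http = new HttpClient();

    // Builds a local file name for the download.
    // The ".mp4" extension is an assumption based on what Messenger sends.
    public static string BuildLocalPath(string folder, string messageId)
    {
        return Path.Combine(folder, messageId + ".mp4");
    }

    // Downloads the file at contentUrl and writes it to localPath.
    public static async Task<string> DownloadAsync(string contentUrl, string localPath)
    {
        using (var response = await Http.GetAsync(contentUrl))
        {
            // Throws HttpRequestException on a non-success status code.
            response.EnsureSuccessStatusCode();

            using (var source = await response.Content.ReadAsStreamAsync())
            using (var target = File.Create(localPath))
            {
                await source.CopyToAsync(target);
            }
        }
        return localPath;
    }
}
```

In a bot this would be called with something like `activity.Attachments[0].ContentUrl` as `contentUrl`, after checking that the message actually carries an attachment.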
Sample implementation and code here: https://github.com/J3QQ4/Facebook-Messenger-Voice-Message-Converter
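using MediaToolkit;        // Engine
using MediaToolkit.Model;  // MediaFile

// SourceFileNameAndPath, ConvertedFileNameAndPath and GetFFMPEGBinaryPath()
// are assumed to be members of the enclosing class.
public string ConvertMP4ToWAV()
{
    var inputFile = new MediaFile { Filename = SourceFileNameAndPath };
    var outputFile = new MediaFile { Filename = ConvertedFileNameAndPath };

    // Engine runs the ffmpeg binary found at the given path.
    using (var engine = new Engine(GetFFMPEGBinaryPath()))
    {
        engine.Convert(inputFile, outputFile);
    }

    return ConvertedFileNameAndPath;
}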
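If you would rather not depend on a wrapper library, the same conversion can be done by starting ffmpeg directly as a child process, much like the vlc.exe idea from the question. The sketch below is an assumption-laden alternative, not the method used in the linked sample: the class name `FfmpegWavConverter` is hypothetical, and the 16 kHz mono 16-bit PCM output is a guess at what a speech API might want, so check the requirements of the service you call.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper that shells out to an ffmpeg binary.
public static class FfmpegWavConverter
{
    // Builds the ffmpeg command line:
    //   -y                 overwrite the output file if it exists
    //   -i <input>         input file
    //   -vn                ignore any video stream
    //   -acodec pcm_s16le  16-bit little-endian PCM audio
    //   -ac 1              mono
    //   -ar <rate>         sample rate in Hz (16000 is an assumed default)
    public static string BuildArguments(string inputPath, string outputPath, int sampleRateHz = 16000)
    {
        return $"-y -i \"{inputPath}\" -vn -acodec pcm_s16le -ac 1 -ar {sampleRateHz} \"{outputPath}\"";
    }

    // Runs ffmpeg and blocks until it finishes; throws if it reports failure.
    public static void Convert(string ffmpegPath, string inputPath, string outputPath)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = ffmpegPath,
            Arguments = BuildArguments(inputPath, outputPath),
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            process.WaitForExit();
            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException("ffmpeg exited with code " + process.ExitCode);
            }
        }
    }
}
```

Passing explicit codec, channel, and sample-rate flags makes the output format independent of whatever the source recording used, which can matter when the speech service is strict about its input.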
Source: https://stackoverflow.com/questions/43202310/how-can-a-bot-receive-a-voice-file-from-facebook-messenger-mp4-and-convert-it