How can a bot receive a voice file from Facebook Messenger (MP4) and convert it to a format that is recognized by speech engines like Bing or Google?
I'm trying to make a bot for Facebook Messenger using Microsoft's Bot Framework that will do the following:

1. Get a user's voice message sent via Facebook Messenger
2. Convert speech to text
3. Do something with it

There's no problem with getting the voice message from Messenger (the URL can be extracted from the message the bot receives), and there's also no problem with converting an audio file to text (using the Bing Speech API or Google's similar API).

However, these APIs require PCM (WAV) files, while Facebook Messenger gives you an MP4 file. Is there a popular/standard way of converting one format into