IBM Watson Speech-to-Text “unable to transcode data stream audio/webm -> audio/x-float-array” media MIME types

Submitted by 拜拜、爱过 on 2020-03-25 12:30:42

Question

I'm recording short audio files (a few seconds each) in Chrome with MediaRecorder on a stream from mediaDevices.getUserMedia(), saving each file to Firebase Storage, and then trying to send the files to IBM Watson Speech-to-Text. I'm getting back this error message:

unable to transcode data stream audio/webm -> audio/x-float-array

In the browser I set up the microphone:

navigator.mediaDevices.getUserMedia({ audio: true, video: false })
  .then(stream => {
    // Ask for WebM output at 128 kbit/s
    const options = {
      audioBitsPerSecond: 128000,
      mimeType: 'audio/webm'
    };

    // Start recording from the microphone stream
    const mediaRecorder = new MediaRecorder(stream, options);
    mediaRecorder.start();
    ...

According to this answer, Chrome supports only two media types:

audio/webm
audio/webm;codecs=opus

I tried both.
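
Rather than hard-coding one of those strings, the choice can be made at runtime. The following is only a minimal sketch: MediaRecorder.isTypeSupported() is the standard browser check, while pickMimeType, CANDIDATES, and the preference order are hypothetical names and choices made up for illustration. The predicate is passed in so the helper can also be exercised outside a browser.

```javascript
// Return the first candidate MIME type that the predicate accepts, or null.
// In a browser, pass (t) => MediaRecorder.isTypeSupported(t) as the predicate.
function pickMimeType(candidates, isSupported) {
  for (const type of candidates) {
    if (isSupported(type)) return type;
  }
  return null;
}

// Hypothetical preference order: name the codec explicitly first.
const CANDIDATES = [
  'audio/webm;codecs=opus',
  'audio/webm',
  'audio/ogg;codecs=opus'
];

// Only touch MediaRecorder where it exists (that is, in a browser).
if (typeof MediaRecorder !== 'undefined') {
  const chosen = pickMimeType(CANDIDATES, (t) => MediaRecorder.isTypeSupported(t));
  console.log('Recording with:', chosen);
}
```

The chosen string would then go into the mimeType option passed to new MediaRecorder(stream, options), and a null result means none of the candidates can be recorded in that browser.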

Here's what I sent to IBM Watson:

curl -X POST -u "apikey:my-api-key" \
--header "Content-Type: audio/webm" \
--data-binary "https://firebasestorage.googleapis.com/v0/b/my-app.appspot.com/my-file" \
--url "https://api.us-south.speech-to-text.watson.cloud.ibm.com/instances/01010101/v1/recognize"
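
One thing worth checking in the command above, though it may not be the whole story: curl reads a file for --data-binary only when the argument starts with @. Without the @, the string itself (here, the Firebase URL text) is sent as the request body. A rough way to confirm what is actually being uploaded is to look at the first bytes of the payload. The sketch below assumes the well-known container signatures (EBML magic 1A 45 DF A3 for WebM/Matroska, ASCII "OggS" for Ogg); sniffContainer is a hypothetical helper name.

```javascript
// Guess the container format from the first four bytes of a payload.
// WebM (Matroska) starts with the EBML magic 1A 45 DF A3;
// Ogg starts with the ASCII capture pattern "OggS".
function sniffContainer(bytes) {
  const head = Array.from(bytes.slice(0, 4));
  const startsWith = (sig) =>
    sig.length <= head.length && sig.every((b, i) => head[i] === b);
  if (startsWith([0x1a, 0x45, 0xdf, 0xa3])) return 'webm';
  if (startsWith([0x4f, 0x67, 0x67, 0x53])) return 'ogg';
  return 'unknown';
}

// The body curl sends for --data-binary "https://..." (no @) is the URL text.
const urlAsBody = new TextEncoder().encode(
  'https://firebasestorage.googleapis.com/v0/b/my-app.appspot.com/my-file'
);
console.log(sniffContainer(urlAsBody)); // prints "unknown"
```

If the bytes downloaded from Firebase Storage also come back as 'unknown', the problem is probably in how the recording was saved rather than in the Content-Type header.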

IBM's list of supported MIME types includes audio/webm and audio/webm;codecs=opus.

I also tried recording and sending an Ogg file, and got the same error message:

curl -X POST -u "apikey:my-api-key" \
--header "Content-Type: audio/ogg" \
--data-binary @/Users/TDK/LanguageTwo/public/1.ogg \
--url "https://api.us-south.speech-to-text.watson.cloud.ibm.com/instances/01010101/v1/recognize"

I tried IBM's sample audio file and it worked perfectly:

"transcript": "several tornadoes touched down as a line of severe thunderstorms swept through Colorado on Sunday "

I'm getting a similar error message from Google Cloud Speech-to-Text.

Source: https://stackoverflow.com/questions/60748633/ibm-watson-speech-to-text-unable-to-transcode-data-stream-audio-webm-audio-x
