Google Cloud Speech-to-Text: “INVALID_ARGUMENT: Invalid recognition 'config': bad encoding..” codec audio encoding error

Submitted by 柔情痞子 on 2020-03-25 12:30:36

Question


I'm recording short audio files (a few seconds) in Chrome using mediaDevices.getUserMedia(), saving the file to Firebase Storage, and then trying to send the files to Google Cloud Speech-to-Text from a Firebase Cloud Function. I'm getting back this error message:

INVALID_ARGUMENT: Invalid recognition 'config': bad encoding.

Google's documentation says that this error message means:

Your audio data might not be encoded correctly or is encoded with a codec different than what you've declared in the RecognitionConfig. Check the audio input and make sure that you've set the encoding field correctly.

In the browser I set up the microphone:

navigator.mediaDevices.getUserMedia({ audio: true, video: false })
.then(stream => {

var options = {
   audioBitsPerSecond : 128000,
   mimeType : 'audio/webm;codecs=opus'
};

const mediaRecorder = new MediaRecorder(stream, options);
mediaRecorder.start();
...

According to this answer Chrome only supports two codecs:

audio/webm
audio/webm;codecs=opus

Actually, that's one media format and one codec. This blog post also says that Chrome only supports the Opus codec.
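Rather than relying on a published list, the browser can be asked which recording formats it supports. The following is a minimal sketch of that check, not a definitive implementation. pickSupportedMimeType and CANDIDATE_MIME_TYPES are hypothetical names introduced for illustration, and the candidate list is an assumption. The support check is passed in as a function so the helper also runs outside a browser; in Chrome it would wrap MediaRecorder.isTypeSupported.

```javascript
// Hypothetical helper: returns the first candidate MIME type that the
// supplied predicate accepts, or null if none of them is supported.
function pickSupportedMimeType(candidates, isSupported) {
  for (const mimeType of candidates) {
    if (isSupported(mimeType)) {
      return mimeType;
    }
  }
  return null;
}

// Assumed candidate list, in order of preference (illustrative only).
const CANDIDATE_MIME_TYPES = [
  'audio/ogg;codecs=opus',
  'audio/webm;codecs=opus',
  'audio/webm',
];

// In the browser (not executed here):
//   const mimeType = pickSupportedMimeType(
//     CANDIDATE_MIME_TYPES,
//     (type) => MediaRecorder.isTypeSupported(type)
//   );
//   const mediaRecorder = new MediaRecorder(stream, mimeType ? { mimeType } : undefined);
```

Logging the chosen MIME type next to the uploaded file makes it easier to see later which container and codec the recording actually uses.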

I set up my Firebase Cloud Function:

// Imports the Google Cloud client library
const speech = require('@google-cloud/speech');

// Creates a client
const client = new speech.SpeechClient();

const gcsUri = 'gs://my-app.appspot.com/my-file';
const encoding = 'Opus';
const sampleRateHertz = 128000;
const languageCode = 'en-US';

const config = {
   encoding: encoding,
   sampleRateHertz: sampleRateHertz,
   languageCode: languageCode,
};
const audio = {
   uri: gcsUri,
};

const request = {
   config: config,
   audio: audio,
};

// Detects speech in the audio file
return client.recognize(request) // resolves to an array; the first element is the response
.then(function([response]) {
console.log(response);
...

The audio encoding matches between the browser and the Google Speech-to-Text request. Why does Google Speech tell me that the audio encoding is bad?
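Two details in the request above may be worth checking against the RecognitionConfig reference. The encoding field takes an AudioEncoding enum name such as OGG_OPUS or WEBM_OPUS rather than a codec label like 'Opus'. Also, sampleRateHertz is the sampling rate of the audio, whereas 128000 in the browser code is the recorder's audioBitsPerSecond bitrate. The sketch below maps a recorder MIME type to a config object. configForMimeType and ENCODING_BY_CONTAINER are hypothetical names, the 48000 Hz default assumes Chrome's usual Opus output, and WEBM_OPUS may not be available in every API version, so all of these values should be verified before use.

```javascript
// Hypothetical mapping from a recorder's container MIME type to a
// Speech-to-Text AudioEncoding enum name. Only Opus containers are covered.
const ENCODING_BY_CONTAINER = new Map([
  ['audio/webm', 'WEBM_OPUS'],
  ['audio/ogg', 'OGG_OPUS'],
]);

// Builds a RecognitionConfig-shaped object. The 48000 Hz default is an
// assumption about the recording, not a value read from the file.
function configForMimeType(mimeType, languageCode = 'en-US', sampleRateHertz = 48000) {
  // Strip parameters such as ";codecs=opus" to get the bare container type.
  const container = mimeType.split(';')[0].trim().toLowerCase();
  const encoding = ENCODING_BY_CONTAINER.get(container);
  if (!encoding) {
    throw new Error(`No known Speech-to-Text encoding for ${mimeType}`);
  }
  return { encoding, sampleRateHertz, languageCode };
}

const exampleConfig = configForMimeType('audio/webm;codecs=opus');
console.log(exampleConfig);
```

The resulting object would take the place of config in the request, with audio: { uri: gcsUri } unchanged.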

I also tried using the default options in the browser, with the same error message:

navigator.mediaDevices.getUserMedia({ audio: true, video: false })
.then(stream => {

const mediaRecorder = new MediaRecorder(stream);
mediaRecorder.start();

In the Firebase Cloud Function I tried leaving out the line const encoding = 'Opus';, which resulted in the error "encoding is not defined" (the config object still referenced the variable). I also tried const encoding = '';, which resulted in the same INVALID_ARGUMENT: Invalid recognition 'config': bad encoding.. error.

I'm getting a similar error message from IBM Watson Speech-to-Text. The file plays back without a problem.
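Successful playback shows only that some player can decode the file, not which container it uses. One way to check is to look at the file's leading bytes. The sketch below compares them against a few well-known signatures. sniffContainer and SIGNATURES are hypothetical names for illustration, and the check identifies the container only, not the codec inside it.

```javascript
// Well-known leading byte signatures for a few audio containers.
const SIGNATURES = [
  { name: 'webm/matroska', bytes: [0x1a, 0x45, 0xdf, 0xa3] }, // EBML header
  { name: 'ogg', bytes: [0x4f, 0x67, 0x67, 0x53] },           // "OggS"
  { name: 'riff (e.g. wav)', bytes: [0x52, 0x49, 0x46, 0x46] }, // "RIFF"
  { name: 'flac', bytes: [0x66, 0x4c, 0x61, 0x43] },          // "fLaC"
];

// Hypothetical helper: guesses the container from the first bytes of a
// Buffer or Uint8Array; returns 'unknown' if no signature matches.
function sniffContainer(data) {
  for (const { name, bytes } of SIGNATURES) {
    if (data.length >= bytes.length && bytes.every((b, i) => data[i] === b)) {
      return name;
    }
  }
  return 'unknown';
}

// Example with in-memory bytes. For a real file, the Buffer returned by
// fs.readFileSync(path) could be passed instead.
console.log(sniffContainer(Uint8Array.from([0x1a, 0x45, 0xdf, 0xa3, 0x9f])));
```

If the result is webm/matroska, the stored file is a WebM container, which would be consistent with the audio/webm;codecs=opus recorder setting.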

Source: https://stackoverflow.com/questions/60747880/google-cloud-speech-to-text-invalid-argument-invalid-recognition-config-ba
