google-cloud-speech

com.google.api.gax.rpc.UnavailableException: io.grpc.StatusRuntimeException: UNAVAILABLE: Credentials failed to obtain metadata

Submitted by 雨燕双飞 on 2020-03-25 16:36:06
Question: I've done a speech-to-text configuration using Google's Cloud Speech API in Java. It works on another machine, but the same setup does not work on my machine. I've installed the Google Cloud Platform tools from the Eclipse Marketplace, and I've also set the credentials environment variable for the runtime. Can anyone please help? Source: https://stackoverflow.com/questions/59845869/com-google-api-gax-rpc-unavailableexception-io-grpc-statusruntimeexception-una
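This UNAVAILABLE error usually means the process never found usable credentials at runtime, often because the environment variable was set in a different shell than the one the IDE launches from. A minimal sketch of the two common ways to wire up a service-account key (shown in Python; the key path is hypothetical, and the Java client honors the same GOOGLE_APPLICATION_CREDENTIALS convention):

import os
from google.cloud import speech

# Option 1: Application Default Credentials via the environment variable.
# It must be visible to the process that actually runs, not just your shell.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/service-account-key.json"
client = speech.SpeechClient()

# Option 2: load the key explicitly, bypassing the environment entirely.
client = speech.SpeechClient.from_service_account_file(
    "/path/to/service-account-key.json"
)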

Google Cloud Speech-to-Text: “INVALID_ARGUMENT: Invalid recognition 'config': bad encoding..” codec audio encoding error

Submitted by 柔情痞子 on 2020-03-25 12:30:36
Question: I'm recording short audio files (a few seconds) in Chrome using mediaDevices.getUserMedia(), saving the files to Firebase Storage, and then trying to send them to Google Cloud Speech-to-Text from a Firebase Cloud Function. I'm getting back this error message: INVALID_ARGUMENT: Invalid recognition 'config': bad encoding. Google's documentation says that this error message means your audio data might not be encoded correctly or is encoded with a codec different than what you've declared in …
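The usual cause is declaring an encoding that doesn't match what the browser actually recorded; Chrome's MediaRecorder produces WebM/Opus by default. A minimal sketch of a matching config, assuming a client version that supports WEBM_OPUS and a hypothetical bucket path:

from google.cloud import speech

client = speech.SpeechClient()
config = speech.RecognitionConfig(
    # The declared encoding must match the file's real codec; Chrome's
    # MediaRecorder defaults to WebM/Opus, not LINEAR16 or FLAC.
    encoding=speech.RecognitionConfig.AudioEncoding.WEBM_OPUS,
    sample_rate_hertz=48000,  # Opus in WebM is effectively always 48 kHz
    language_code="en-US",
)
audio = speech.RecognitionAudio(uri="gs://my-bucket/recording.webm")
response = client.recognize(config=config, audio=audio)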

Speaker Diarization support in Google Speech API

Submitted by 橙三吉。 on 2020-01-24 14:11:25
Question: Does the Google Cloud Speech API support speaker diarization, like Watson does? If so, what are the steps to get a transcript with speaker labels? More info: https://www.ibm.com/blogs/watson/2016/12/look-whos-talking-ibm-debuts-watson-speech-text-speaker-diarization-beta/ Answer 1: Google has introduced this feature, and as of this writing it is in beta. More info: https://cloud.google.com/speech-to-text/ Answer 2: Per the group discussion at Recording, Splitting Audio for Transcribing Two People Conversation using …
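A minimal sketch of requesting diarization for a two-speaker recording, assuming a recent client and a hypothetical GCS URI; speaker tags come back per word on the last result:

from google.cloud import speech

client = speech.SpeechClient()
diarization = speech.SpeakerDiarizationConfig(
    enable_speaker_diarization=True,
    min_speaker_count=2,
    max_speaker_count=2,
)
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    diarization_config=diarization,
)
audio = speech.RecognitionAudio(uri="gs://my-bucket/conversation.wav")
response = client.recognize(config=config, audio=audio)

# Each word in the final cumulative result carries a speaker_tag.
for word in response.results[-1].alternatives[0].words:
    print(word.speaker_tag, word.word)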

Multiple StreamingRecognizeRequest

Submitted by 馋奶兔 on 2020-01-16 13:47:08
Question: I'm trying to set up a StreamingRecognize session with multiple requests. Is it possible? The point is that I want to send an audio stream from the mic for an unknown length of time, so I think I must implement multiple requests (considering that a request session has a max time of 65 seconds). Can anyone help me with this? Thanks a lot ;) Google sample code:

static async Task<object> StreamingMicRecognizeAsync(int seconds)
{
    if (NAudio.Wave.WaveIn.DeviceCount < 1)
    {
        Console.WriteLine("No microphone!");
        …
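Yes: a stream is a sequence of requests in which only the first carries the config and all later ones carry audio bytes, and for open-ended capture you re-open a fresh stream before the per-session time limit. A minimal sketch of that restart loop, in Python rather than the question's C#, with a hypothetical queue standing in for the microphone callback:

import queue
from google.cloud import speech

# Hypothetical buffer that an audio callback (not shown) fills with raw
# 16-bit PCM chunks; it pushes None to end the current session.
audio_queue = queue.Queue()

def request_generator():
    # Each yielded request carries only audio; the client helper sends
    # the config as the first request of the stream.
    while True:
        chunk = audio_queue.get()
        if chunk is None:
            return  # close this streaming session cleanly
        yield speech.StreamingRecognizeRequest(audio_content=chunk)

client = speech.SpeechClient()
streaming_config = speech.StreamingRecognitionConfig(
    config=speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
    ),
    interim_results=True,
)

while True:  # one iteration per streaming session, restarted indefinitely
    responses = client.streaming_recognize(streaming_config, request_generator())
    for response in responses:
        for result in response.results:
            if result.is_final:
                print(result.alternatives[0].transcript)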

Google Cloud Speech API capability for non-sense words or phonetics

Submitted by 馋奶兔 on 2020-01-05 05:58:21
Question: Is it possible for the API to return the phonetics of what the sound file says? Or is it possible to provide non-real vocabulary words? I have a foreign-language tutorial where I might be able to use this. For example, it teaches non-Latin alphabets like Cyrillic, Hebrew, Arabic, Chinese, etc. I have a library of nonsense words to help the student learn; the reason for nonsense words vs. real words is that it breaks the steps down to just two letters at a time, and at first there aren't …
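The API returns orthographic transcripts rather than phonetic ones, but recognition can be biased toward out-of-dictionary strings with speech adaptation phrase hints. A minimal sketch, with a hypothetical list of two-letter drill words:

from google.cloud import speech

client = speech.SpeechClient()
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    # Phrase hints raise the likelihood of these exact strings being
    # recognized even though they are not dictionary words.
    speech_contexts=[speech.SpeechContext(phrases=["ba", "da", "bada", "daba"])],
)
audio = speech.RecognitionAudio(uri="gs://my-bucket/drill.wav")
response = client.recognize(config=config, audio=audio)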

Google cloud speech API response : Parsing iOS

Submitted by 随声附和 on 2020-01-05 05:30:46
Question: I am trying to integrate the Google Cloud Speech API into my demo app. What I am getting as a result is below:

{
  results {
    alternatives {
      transcript: "hello"
    }
    stability: 0.01
  }
}

Code to get the response:

[[SpeechRecognitionService sharedInstance] streamAudioData:self.audioData
    withCompletion:^(StreamingRecognizeResponse *response, NSError *error) {
  if (error) {
    NSLog(@"ERROR: %@", error);
    _textView.text = [error localizedDescription];
    [self stopAudio:nil];
  } else if (response) {
    BOOL finished = NO;
    // …
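The response is a nested protobuf: each result holds ranked alternatives, and the transcript sits on the alternative. A minimal sketch of walking that structure, in Python rather than the question's Objective-C; the same results, alternatives, transcript nesting applies to the iOS generated classes:

def best_transcripts(response):
    # Yield (transcript, is_final) for each result in a
    # StreamingRecognizeResponse.
    for result in response.results:
        if result.alternatives:
            # Alternatives are ordered best-first; take the top one.
            yield result.alternatives[0].transcript, result.is_final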

using enhanced model in google cloud speech api

Submitted by 喜夏-厌秋 on 2020-01-04 09:13:23
Question: I'm trying to use the enhanced models on the Google Speech API like:

from google.cloud import speech
from google.cloud.speech import enums, types

gcs_uri = "gs://mybucket/averylongaudiofile.ogg"
client = speech.SpeechClient()
audio = types.RecognitionAudio(uri=gcs_uri)
config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.OGG_OPUS,
    language_code='en-US',
    sample_rate_hertz=48000,
    use_enhanced=True,
    model='phone_call',
    enable_word_time_offsets=True,
    enable_automatic_punctuation=True)
operation = client.long_running_recognize(config, audio)

I …
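As a side note, 2.x releases of google-cloud-speech removed the enums and types modules; a minimal sketch of the equivalent request against such a client (an assumption about the installed version), keeping the question's URI and settings:

from google.cloud import speech

client = speech.SpeechClient()
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.OGG_OPUS,
    language_code="en-US",
    sample_rate_hertz=48000,
    use_enhanced=True,
    model="phone_call",  # enhanced model; requires use_enhanced=True
    enable_word_time_offsets=True,
    enable_automatic_punctuation=True,
)
audio = speech.RecognitionAudio(uri="gs://mybucket/averylongaudiofile.ogg")
operation = client.long_running_recognize(config=config, audio=audio)
response = operation.result(timeout=600)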