google-cloud-speech

com.google.api.gax.rpc.UnavailableException: io.grpc.StatusRuntimeException: UNAVAILABLE: Credentials failed to obtain metadata

Submitted by 雨燕双飞 on 2020-03-25 16:36:06
Question: I've done a speech-to-text configuration using Google's Cloud Speech API in Java. It works on another machine, but the same setup does not work on my machine. I've installed the Google Cloud Platform tools from the Eclipse Marketplace, and I've also set the credentials environment variable for the runtime. Can anyone please help? Source: https://stackoverflow.com/questions/59845869/com-google-api-gax-rpc-unavailableexception-io-grpc-statusruntimeexception-una
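This UNAVAILABLE error usually means the process never found usable credentials at runtime, often because the environment variable was set in a different shell than the one the IDE launches from. A minimal sketch of the two common ways to wire up a service-account key (shown in Python; the key path is hypothetical, and the Java client honors the same GOOGLE_APPLICATION_CREDENTIALS convention):

import os
from google.cloud import speech

# Option 1: Application Default Credentials via the environment variable.
# It must be visible to the process that actually runs, not just your shell.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/service-account-key.json"
client = speech.SpeechClient()

# Option 2: load the key explicitly, bypassing the environment entirely.
client = speech.SpeechClient.from_service_account_file(
    "/path/to/service-account-key.json"
)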

Google Cloud Speech-to-Text: “INVALID_ARGUMENT: Invalid recognition 'config': bad encoding..” codec audio encoding error

Submitted by 柔情痞子 on 2020-03-25 12:30:36
Question: I'm recording short audio files (a few seconds) in Chrome using mediaDevices.getUserMedia(), saving the files to Firebase Storage, and then trying to send them to Google Cloud Speech-to-Text from a Firebase Cloud Function. I'm getting back this error message: INVALID_ARGUMENT: Invalid recognition 'config': bad encoding. Google's documentation says that this error message means your audio data might not be encoded correctly or is encoded with a codec different than what you've declared in …
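The usual cause is declaring an encoding that doesn't match what the browser actually recorded; Chrome's MediaRecorder produces WebM/Opus by default. A minimal sketch of a matching config, assuming a client version that supports WEBM_OPUS and a hypothetical bucket path:

from google.cloud import speech

client = speech.SpeechClient()
config = speech.RecognitionConfig(
    # The declared encoding must match the file's real codec; Chrome's
    # MediaRecorder defaults to WebM/Opus, not LINEAR16 or FLAC.
    encoding=speech.RecognitionConfig.AudioEncoding.WEBM_OPUS,
    sample_rate_hertz=48000,  # Opus in WebM is effectively always 48 kHz
    language_code="en-US",
)
audio = speech.RecognitionAudio(uri="gs://my-bucket/recording.webm")
response = client.recognize(config=config, audio=audio)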

Speaker Diarization support in Google Speech API

Submitted by 橙三吉。 on 2020-01-24 14:11:25
Question: Does the Google Cloud Speech API support speaker diarization, like Watson does? If so, what are the steps to get a transcript with speaker labels? More info: https://www.ibm.com/blogs/watson/2016/12/look-whos-talking-ibm-debuts-watson-speech-text-speaker-diarization-beta/ Answer 1: Google has introduced this feature, and as of this writing it is in beta. More info: https://cloud.google.com/speech-to-text/ Answer 2: Per the group discussion at Recording, Splitting Audio for Transcribing Two People Conversation using …
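A minimal sketch of requesting diarization for a two-speaker recording, assuming a recent client and a hypothetical GCS URI; speaker tags come back per word on the last result:

from google.cloud import speech

client = speech.SpeechClient()
diarization = speech.SpeakerDiarizationConfig(
    enable_speaker_diarization=True,
    min_speaker_count=2,
    max_speaker_count=2,
)
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    diarization_config=diarization,
)
audio = speech.RecognitionAudio(uri="gs://my-bucket/conversation.wav")
response = client.recognize(config=config, audio=audio)

# Each word in the final cumulative result carries a speaker_tag.
for word in response.results[-1].alternatives[0].words:
    print(word.speaker_tag, word.word)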

Multiple StreamingRecognizeRequest

Submitted by 馋奶兔 on 2020-01-16 13:47:08
Question: I'm trying to set up a StreamingRecognize session with multiple requests. Is it possible? The point is that I want to send an audio stream from the mic for an unknown length of time, so I think I must implement multiple requests (considering that a request session has a max time of 65 seconds). Can anyone help me with this? Thanks a lot ;) Google sample code:

static async Task<object> StreamingMicRecognizeAsync(int seconds)
{
    if (NAudio.Wave.WaveIn.DeviceCount < 1)
    {
        Console.WriteLine("No microphone!");
        …
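Yes: a stream is a sequence of requests in which only the first carries the config and all later ones carry audio bytes, and for open-ended capture you re-open a fresh stream before the per-session time limit. A minimal sketch of that restart loop, in Python rather than the question's C#, with a hypothetical queue standing in for the microphone callback:

import queue
from google.cloud import speech

# Hypothetical buffer that an audio callback (not shown) fills with raw
# 16-bit PCM chunks; it pushes None to end the current session.
audio_queue = queue.Queue()

def request_generator():
    # Each yielded request carries only audio; the client helper sends
    # the config as the first request of the stream.
    while True:
        chunk = audio_queue.get()
        if chunk is None:
            return  # close this streaming session cleanly
        yield speech.StreamingRecognizeRequest(audio_content=chunk)

client = speech.SpeechClient()
streaming_config = speech.StreamingRecognitionConfig(
    config=speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
    ),
    interim_results=True,
)

while True:  # one iteration per streaming session, restarted indefinitely
    responses = client.streaming_recognize(streaming_config, request_generator())
    for response in responses:
        for result in response.results:
            if result.is_final:
                print(result.alternatives[0].transcript)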

Google Cloud Speech API capability for non-sense words or phonetics

Submitted by 馋奶兔 on 2020-01-05 05:58:21
Question: Is it possible for the API to return the phonetics of what the sound file says? Or is it possible to provide non-real vocabulary words? I have a foreign-language tutorial where I might be able to use this. For example, it teaches non-Latin alphabets like Cyrillic, Hebrew, Arabic, Chinese, etc. I have a library of nonsense words to help the student learn; the reason for nonsense words vs. real words is that it breaks the steps down to just two letters at a time, and at first there aren't …
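The API returns orthographic transcripts rather than phonetic ones, but recognition can be biased toward out-of-dictionary strings with speech adaptation phrase hints. A minimal sketch, with a hypothetical list of two-letter drill words:

from google.cloud import speech

client = speech.SpeechClient()
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    # Phrase hints raise the likelihood of these exact strings being
    # recognized even though they are not dictionary words.
    speech_contexts=[speech.SpeechContext(phrases=["ba", "da", "bada", "daba"])],
)
audio = speech.RecognitionAudio(uri="gs://my-bucket/drill.wav")
response = client.recognize(config=config, audio=audio)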

Google cloud speech API response : Parsing iOS

Submitted by 随声附和 on 2020-01-05 05:30:46
Question: I am trying to integrate the Google Cloud Speech API into my demo app. What I am getting as a result is below:

{
  results {
    alternatives {
      transcript: "hello"
    }
    stability: 0.01
  }
}

Code to get the response:

[[SpeechRecognitionService sharedInstance] streamAudioData:self.audioData
    withCompletion:^(StreamingRecognizeResponse *response, NSError *error) {
  if (error) {
    NSLog(@"ERROR: %@", error);
    _textView.text = [error localizedDescription];
    [self stopAudio:nil];
  } else if (response) {
    BOOL finished = NO;
    // …
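The response is a nested protobuf: each result holds ranked alternatives, and the transcript sits on the alternative. A minimal sketch of walking that structure, in Python rather than the question's Objective-C; the same results, alternatives, transcript nesting applies to the iOS generated classes:

def best_transcripts(response):
    # Yield (transcript, is_final) for each result in a
    # StreamingRecognizeResponse.
    for result in response.results:
        if result.alternatives:
            # Alternatives are ordered best-first; take the top one.
            yield result.alternatives[0].transcript, result.is_final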

using enhanced model in google cloud speech api

Submitted by 喜夏-厌秋 on 2020-01-04 09:13:23
Question: I'm trying to use the enhanced models on the Google Speech API like:

from google.cloud import speech
from google.cloud.speech import enums, types

gcs_uri = "gs://mybucket/averylongaudiofile.ogg"
client = speech.SpeechClient()
audio = types.RecognitionAudio(uri=gcs_uri)
config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.OGG_OPUS,
    language_code='en-US',
    sample_rate_hertz=48000,
    use_enhanced=True,
    model='phone_call',
    enable_word_time_offsets=True,
    enable_automatic_punctuation=True)
operation = client.long_running_recognize(config, audio)

I …
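As a side note, 2.x releases of google-cloud-speech removed the enums and types modules; a minimal sketch of the equivalent request against such a client (an assumption about the installed version), keeping the question's URI and settings:

from google.cloud import speech

client = speech.SpeechClient()
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.OGG_OPUS,
    language_code="en-US",
    sample_rate_hertz=48000,
    use_enhanced=True,
    model="phone_call",  # enhanced model; requires use_enhanced=True
    enable_word_time_offsets=True,
    enable_automatic_punctuation=True,
)
audio = speech.RecognitionAudio(uri="gs://mybucket/averylongaudiofile.ogg")
operation = client.long_running_recognize(config=config, audio=audio)
response = operation.result(timeout=600)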