google-cloud-speech

Using enhanced model in Google Cloud Speech API

杀马特。学长 韩版系。学妹 submitted on 2020-01-04 09:13:01
Question: I'm trying to use the enhanced models on the Google Speech API like:

    # Imports assumed here; the v1p1beta1 client exposes use_enhanced and model
    from google.cloud import speech_v1p1beta1 as speech
    from google.cloud.speech_v1p1beta1 import enums, types

    gcs_uri = "gs://mybucket/averylongaudiofile.ogg"

    client = speech.SpeechClient()
    audio = types.RecognitionAudio(uri=gcs_uri)
    config = types.RecognitionConfig(
        encoding=enums.RecognitionConfig.AudioEncoding.OGG_OPUS,
        language_code='en-US',
        sample_rate_hertz=48000,
        use_enhanced=True,
        model='phone_call',
        enable_word_time_offsets=True,
        enable_automatic_punctuation=True)

    operation = client.long_running_recognize(config, audio)

I …
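For completeness, the operation returned by long_running_recognize still has to be polled for its result; continuing from the snippet above, that might look roughly like this (a sketch, with an arbitrary timeout):

    # Sketch: wait for the long-running operation and print transcripts plus word offsets.
    # Assumes 'operation' comes from client.long_running_recognize above; the timeout is arbitrary.
    response = operation.result(timeout=3600)

    for result in response.results:
        alternative = result.alternatives[0]
        print(alternative.transcript)
        # Word-level offsets are present because enable_word_time_offsets=True was set
        for word_info in alternative.words:
            print(word_info.word, word_info.start_time.seconds, word_info.end_time.seconds)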

Google Speech-to-text API, InvalidArgument: 400 Must use single channel (mono)

萝らか妹 submitted on 2020-01-02 13:30:09
Question: I keep getting this InvalidArgument: 400 error in Google Speech-to-Text, and the problem seems to be that I am using 2-channel (stereo) audio while the API expects mono WAV. If I convert the file in an audio editor it might work, but I cannot use an audio editor to convert a batch of files. Is there a way to change the audio type in either Python or Google Cloud? Note: I already tried the wave module but kept getting an error #7 for file type not recognized (I couldn't …
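One way to batch-convert stereo WAV files to mono in Python before uploading them is with pydub; a minimal sketch, assuming pydub (and ffmpeg) are installed and using illustrative folder names:

    # Sketch: batch-convert stereo WAV files to mono with pydub.
    # The folder names are placeholders, not from the original question.
    import os
    from pydub import AudioSegment

    src_dir = "stereo_wavs"
    dst_dir = "mono_wavs"
    os.makedirs(dst_dir, exist_ok=True)

    for name in os.listdir(src_dir):
        if name.lower().endswith(".wav"):
            audio = AudioSegment.from_wav(os.path.join(src_dir, name))
            audio.set_channels(1).export(os.path.join(dst_dir, name), format="wav")

Alternatively, newer versions of RecognitionConfig accept an audio_channel_count field (paired with enable_separate_recognition_per_channel) for multi-channel audio, which may avoid the conversion step entirely.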

Cannot import com.google.cloud.speech.v1.SpeechGrpc in Android

送分小仙女□ submitted on 2020-01-01 04:44:05
Question: I'm trying to use Google's Speech API in an Android project. The example project works, but I'm having trouble using it in my own Android app. build.gradle (Module: app):

    apply plugin: 'com.android.application'
    apply plugin: 'com.google.protobuf'

    ext {
        supportLibraryVersion = '25.4.0'
        grpcVersion = '1.4.0'
    }

    android {
        compileSdkVersion 25
        buildToolsVersion "25.0.3"
        defaultConfig {
            applicationId "ApplicationID"
            minSdkVersion 16
            targetSdkVersion 24
            // compileOptions {
            //     sourceCompatibility …

What types of audio are supported by Cloud Speech API?

痞子三分冷 submitted on 2019-12-25 09:48:32
Question: There are a lot of audio formats (e.g., mp3, m4a), sources (e.g., dictation, commands, phone calls, meetings) and devices (e.g., phones, PCs, IoT devices). Which ones work best with Cloud Speech API?

Answer 1: Which ones work best with Cloud Speech API? The supported ones will work best:

LINEAR16: Uncompressed 16-bit signed little-endian samples. This is the only encoding that may be used by speech.asyncrecognize.

FLAC: This is the recommended encoding for speech.syncrecognize and StreamingRecognize …
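For instance, a LINEAR16 request in the pre-2.0 Python client might look like the sketch below; the bucket URI and sample rate are placeholders, not values from the answer:

    # Sketch: synchronous recognition of LINEAR16 (uncompressed 16-bit PCM) audio.
    # Assumes the pre-2.0 google-cloud-speech Python client; URI and rate are placeholders.
    from google.cloud import speech
    from google.cloud.speech import enums, types

    client = speech.SpeechClient()
    audio = types.RecognitionAudio(uri='gs://my-bucket/command.wav')
    config = types.RecognitionConfig(
        encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code='en-US')

    response = client.recognize(config, audio)
    for result in response.results:
        print(result.alternatives[0].transcript)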

Failed to load libraries: [netty_tcnative_linux_arm_32, netty_tcnative_linux_arm_32_fedora, netty_tcnative_arm_32, netty_tcnative]

元气小坏坏 submitted on 2019-12-25 03:44:31
Question: I am trying to run a Java application from a JAR on a Raspberry Pi Model 3, and I am unable to resolve this issue. Could someone kindly suggest how I can make this work on the Raspberry Pi? In the POM I have included the google-cloud-speech dependency (0.56.0-beta) and the spring-boot-starter-web dependency. Error:

    java.lang.IllegalArgumentException: Failed to load any of the given libraries:
    [netty_tcnative_linux_arm_32, netty_tcnative_linux_arm_32_fedora, netty_tcnative_arm_32, netty_tcnative]
        at io.grpc …

How to read mp3 data from Google Cloud using Python

谁都会走 submitted on 2019-12-25 02:44:36
Question: I am trying to read mp3/wav data from Google Cloud and to implement an audio diarization technique. The issue is that I am not able to read the result that the Google API returns in the variable response. Below is my Python code:

    speech_file = r'gs://pp003231/a4a.wav'
    config = speech.types.RecognitionConfig(
        encoding=speech.enums.RecognitionConfig.AudioEncoding.LINEAR16,
        language_code='en-US',
        enable_speaker_diarization=True,
        diarization_speaker_count=2)
    audio = speech.types.RecognitionAudio …
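For reference, with speaker diarization enabled the per-word speaker tags end up on the last result of the response, so reading it back might look roughly like this; a sketch assuming the v1p1beta1 Python client and re-using the bucket path from the question:

    # Sketch: run the diarization request and print each word with its speaker tag.
    # Assumes the pre-2.0 v1p1beta1 Python client; the timeout is arbitrary.
    from google.cloud import speech_v1p1beta1 as speech

    client = speech.SpeechClient()
    audio = speech.types.RecognitionAudio(uri='gs://pp003231/a4a.wav')
    config = speech.types.RecognitionConfig(
        encoding=speech.enums.RecognitionConfig.AudioEncoding.LINEAR16,
        language_code='en-US',
        enable_speaker_diarization=True,
        diarization_speaker_count=2)

    operation = client.long_running_recognize(config, audio)
    response = operation.result(timeout=600)

    # With diarization, the final result aggregates every word together with its speaker_tag.
    for word_info in response.results[-1].alternatives[0].words:
        print('speaker {}: {}'.format(word_info.speaker_tag, word_info.word))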

Google Cloud Speech API add SpeechContext

∥☆過路亽.° submitted on 2019-12-23 05:42:08
Question: I would like to add some keywords to my app so the API can recognize the spoken words more efficiently. For example, I'm having trouble recognizing some Italian words that start with E ("E` per me", for example), or in German ("er geht"). Here is my code:

    public void recognize(int sampleRate) {
        if (mApi == null) {
            Log.w(TAG, "API not ready. Ignoring the request.");
            return;
        }
        // Configure the API
        mRequestObserver = mApi.streamingRecognize(mResponseObserver);
        mRequestObserver.onNext …
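The hint mechanism the asker is looking for is SpeechContext phrases on the RecognitionConfig. The asker's code is Java, but for comparison, here is a minimal sketch of the same idea in the Python client; the phrases and language code are illustrative only:

    # Sketch: passing phrase hints via SpeechContext (Python client shown for comparison).
    # Assumes the pre-2.0 google-cloud-speech client; phrases and language are illustrative.
    from google.cloud import speech
    from google.cloud.speech import enums, types

    config = types.RecognitionConfig(
        encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code='it-IT',
        speech_contexts=[types.SpeechContext(phrases=['E per me', 'er geht'])])

In the Java gRPC client the equivalent is to add a SpeechContext with the phrase list to the RecognitionConfig builder before starting streamingRecognize.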

Google Cloud Speech to Text API - Speaker Diarization

与世无争的帅哥 submitted on 2019-12-23 04:08:04
Question: I am trying to do a speech-to-text transcription of a live phone call using a WebSocket. I have already included:

    const Speech = require('@google-cloud/speech').v1p1beta1;
    const speech = new Speech.SpeechClient();

with the following config:

    encoding: 'LINEAR16',
    sampleRateHertz: 8000,
    languageCode: 'en-US',
    useEnhanced: true,
    enableSpeakerDiarization: true,
    diarizationSpeakerCount: 2,
    enableWordConfidence: true,
    model: `phone_call`,

I am getting the following response:

    {
      "results": [
        {
          "alternatives": [
            { …