google-cloud-speech

Using enhanced model in Google Cloud Speech API

杀马特。学长 韩版系。学妹 submitted on 2020-01-04 09:13:01
Question: I'm trying to use the enhanced models on the Google Speech API like:

    # Imports assumed here; the v1p1beta1 client exposes use_enhanced and model
    from google.cloud import speech_v1p1beta1 as speech
    from google.cloud.speech_v1p1beta1 import enums, types

    gcs_uri = "gs://mybucket/averylongaudiofile.ogg"

    client = speech.SpeechClient()
    audio = types.RecognitionAudio(uri=gcs_uri)
    config = types.RecognitionConfig(
        encoding=enums.RecognitionConfig.AudioEncoding.OGG_OPUS,
        language_code='en-US',
        sample_rate_hertz=48000,
        use_enhanced=True,
        model='phone_call',
        enable_word_time_offsets=True,
        enable_automatic_punctuation=True)

    operation = client.long_running_recognize(config, audio)

I …
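For completeness, the operation returned by long_running_recognize still has to be polled for its result; continuing from the snippet above, that might look roughly like this (a sketch, with an arbitrary timeout):

    # Sketch: wait for the long-running operation and print transcripts plus word offsets.
    # Assumes 'operation' comes from client.long_running_recognize above; the timeout is arbitrary.
    response = operation.result(timeout=3600)

    for result in response.results:
        alternative = result.alternatives[0]
        print(alternative.transcript)
        # Word-level offsets are present because enable_word_time_offsets=True was set
        for word_info in alternative.words:
            print(word_info.word, word_info.start_time.seconds, word_info.end_time.seconds)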

Google Speech-to-text API, InvalidArgument: 400 Must use single channel (mono)

萝らか妹 submitted on 2020-01-02 13:30:09
Question: I keep getting this InvalidArgument: 400 error in Google Speech-to-Text, and the problem seems to be that I am using 2-channel (stereo) audio while the API expects mono WAV. If I convert the file in an audio editor it might work, but I cannot use an audio editor to convert a batch of files. Is there a way to change the audio type in either Python or Google Cloud? Note: I already tried the wave module but kept getting an error #7 for file type not recognized (I couldn't …
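One way to batch-convert stereo WAV files to mono in Python before uploading them is with pydub; a minimal sketch, assuming pydub (and ffmpeg) are installed and using illustrative folder names:

    # Sketch: batch-convert stereo WAV files to mono with pydub.
    # The folder names are placeholders, not from the original question.
    import os
    from pydub import AudioSegment

    src_dir = "stereo_wavs"
    dst_dir = "mono_wavs"
    os.makedirs(dst_dir, exist_ok=True)

    for name in os.listdir(src_dir):
        if name.lower().endswith(".wav"):
            audio = AudioSegment.from_wav(os.path.join(src_dir, name))
            audio.set_channels(1).export(os.path.join(dst_dir, name), format="wav")

Alternatively, newer versions of RecognitionConfig accept an audio_channel_count field (paired with enable_separate_recognition_per_channel) for multi-channel audio, which may avoid the conversion step entirely.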

Cannot import com.google.cloud.speech.v1.SpeechGrpc in Android

送分小仙女□ submitted on 2020-01-01 04:44:05
Question: I'm trying to use Google's Speech API in an Android project. The example project works, but I'm having trouble using it in my own Android app. build.gradle (Module: app):

    apply plugin: 'com.android.application'
    apply plugin: 'com.google.protobuf'

    ext {
        supportLibraryVersion = '25.4.0'
        grpcVersion = '1.4.0'
    }

    android {
        compileSdkVersion 25
        buildToolsVersion "25.0.3"
        defaultConfig {
            applicationId "ApplicationID"
            minSdkVersion 16
            targetSdkVersion 24
            // compileOptions {
            //     sourceCompatibility …

What types of audio are supported by Cloud Speech API?

痞子三分冷 submitted on 2019-12-25 09:48:32
Question: There are a lot of audio formats (e.g., mp3, m4a), sources (e.g., dictation, commands, phone calls, meetings) and devices (e.g., phones, PCs, IoT devices). Which ones work best with Cloud Speech API?

Answer 1: Which ones work best with Cloud Speech API? The supported ones will work best:

LINEAR16: Uncompressed 16-bit signed little-endian samples. This is the only encoding that may be used by speech.asyncrecognize.

FLAC: This is the recommended encoding for speech.syncrecognize and StreamingRecognize …
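For instance, a LINEAR16 request in the pre-2.0 Python client might look like the sketch below; the bucket URI and sample rate are placeholders, not values from the answer:

    # Sketch: synchronous recognition of LINEAR16 (uncompressed 16-bit PCM) audio.
    # Assumes the pre-2.0 google-cloud-speech Python client; URI and rate are placeholders.
    from google.cloud import speech
    from google.cloud.speech import enums, types

    client = speech.SpeechClient()
    audio = types.RecognitionAudio(uri='gs://my-bucket/command.wav')
    config = types.RecognitionConfig(
        encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code='en-US')

    response = client.recognize(config, audio)
    for result in response.results:
        print(result.alternatives[0].transcript)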

Failed to load libraries: [netty_tcnative_linux_arm_32, netty_tcnative_linux_arm_32_fedora, netty_tcnative_arm_32, netty_tcnative]

元气小坏坏 submitted on 2019-12-25 03:44:31
Question: I am trying to run a Java application from a JAR on a Raspberry Pi Model 3, and I am unable to resolve this issue. Could someone kindly suggest how I can make this work on the Raspberry Pi? In the POM I have included the google-cloud-speech dependency (0.56.0-beta) and the spring-boot-starter-web dependency. Error:

    java.lang.IllegalArgumentException: Failed to load any of the given libraries:
    [netty_tcnative_linux_arm_32, netty_tcnative_linux_arm_32_fedora, netty_tcnative_arm_32, netty_tcnative]
        at io.grpc …

How to read mp3 data from Google Cloud using Python

谁都会走 submitted on 2019-12-25 02:44:36
Question: I am trying to read mp3/wav data from Google Cloud and to implement an audio diarization technique. The issue is that I am not able to read the result that the Google API returns in the variable response. Below is my Python code:

    speech_file = r'gs://pp003231/a4a.wav'
    config = speech.types.RecognitionConfig(
        encoding=speech.enums.RecognitionConfig.AudioEncoding.LINEAR16,
        language_code='en-US',
        enable_speaker_diarization=True,
        diarization_speaker_count=2)
    audio = speech.types.RecognitionAudio …
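For reference, with speaker diarization enabled the per-word speaker tags end up on the last result of the response, so reading it back might look roughly like this; a sketch assuming the v1p1beta1 Python client and re-using the bucket path from the question:

    # Sketch: run the diarization request and print each word with its speaker tag.
    # Assumes the pre-2.0 v1p1beta1 Python client; the timeout is arbitrary.
    from google.cloud import speech_v1p1beta1 as speech

    client = speech.SpeechClient()
    audio = speech.types.RecognitionAudio(uri='gs://pp003231/a4a.wav')
    config = speech.types.RecognitionConfig(
        encoding=speech.enums.RecognitionConfig.AudioEncoding.LINEAR16,
        language_code='en-US',
        enable_speaker_diarization=True,
        diarization_speaker_count=2)

    operation = client.long_running_recognize(config, audio)
    response = operation.result(timeout=600)

    # With diarization, the final result aggregates every word together with its speaker_tag.
    for word_info in response.results[-1].alternatives[0].words:
        print('speaker {}: {}'.format(word_info.speaker_tag, word_info.word))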

Google Cloud Speech API add SpeechContext

∥☆過路亽.° submitted on 2019-12-23 05:42:08
Question: I would like to add some keywords to my app so the API can recognize the spoken words more efficiently. For example, I'm having trouble recognizing some Italian words that start with E ("E` per me", for example), or in German ("er geht"). Here is my code:

    public void recognize(int sampleRate) {
        if (mApi == null) {
            Log.w(TAG, "API not ready. Ignoring the request.");
            return;
        }
        // Configure the API
        mRequestObserver = mApi.streamingRecognize(mResponseObserver);
        mRequestObserver.onNext …
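The hint mechanism the asker is looking for is SpeechContext phrases on the RecognitionConfig. The asker's code is Java, but for comparison, here is a minimal sketch of the same idea in the Python client; the phrases and language code are illustrative only:

    # Sketch: passing phrase hints via SpeechContext (Python client shown for comparison).
    # Assumes the pre-2.0 google-cloud-speech client; phrases and language are illustrative.
    from google.cloud import speech
    from google.cloud.speech import enums, types

    config = types.RecognitionConfig(
        encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code='it-IT',
        speech_contexts=[types.SpeechContext(phrases=['E per me', 'er geht'])])

In the Java gRPC client the equivalent is to add a SpeechContext with the phrase list to the RecognitionConfig builder before starting streamingRecognize.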

Google Cloud Speech to Text API - Speaker Diarization

与世无争的帅哥 submitted on 2019-12-23 04:08:04
Question: I am trying to do a speech-to-text transcription of a live phone call using a WebSocket. I have already included:

    const Speech = require('@google-cloud/speech').v1p1beta1;
    const speech = new Speech.SpeechClient();

with the following config:

    encoding: 'LINEAR16',
    sampleRateHertz: 8000,
    languageCode: 'en-US',
    useEnhanced: true,
    enableSpeakerDiarization: true,
    diarizationSpeakerCount: 2,
    enableWordConfidence: true,
    model: `phone_call`,

I am getting the following response:

    {
      "results": [
        {
          "alternatives": [
            { …