google-speech-api | 易学教程

Splitting an Ogg Opus File stream

Question: I am trying to send an OGG_OPUS encoded stream to Google's Speech-to-Text streaming service. Because Google imposes a time limit on streaming requests, I have to route the audio stream to a new Speech-to-Text streaming session at a fixed interval. From what I've read, the pages in an Ogg stream cannot be read independently, since the data in each page is computed from the data in the previous and next pages. If that is the case, can we cut off the stream at
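As a rough illustration of where an Ogg stream could be cut, the sketch below walks a byte buffer and reports the boundaries of each complete page, using the page header layout from RFC 3533 (the `OggS` capture pattern, a 27-byte fixed header, then a segment table). The function name `iter_ogg_pages` is hypothetical. The sketch only locates page boundaries; it does not verify CRCs, and it does not settle whether a new streaming session will accept audio that starts mid-stream without the initial Opus header pages.

```python
import struct

OGG_CAPTURE = b"OggS"
HEADER_SIZE = 27  # fixed part of an Ogg page header (RFC 3533)


def iter_ogg_pages(data: bytes):
    """Yield (offset, length, granule_position, header_type) per complete page.

    Stops quietly at the first page that is not fully present in `data`,
    so the caller can keep the remainder buffered until more bytes arrive.
    """
    offset = 0
    while offset + HEADER_SIZE <= len(data):
        if data[offset:offset + 4] != OGG_CAPTURE:
            raise ValueError(f"no Ogg capture pattern at offset {offset}")
        # version, header type, granule position, serial number,
        # page sequence number, CRC, number of segments
        (_version, header_type, granule, _serial,
         _sequence, _crc, n_segments) = struct.unpack_from(
            "<BBqIIIB", data, offset + 4)
        table_end = offset + HEADER_SIZE + n_segments
        if table_end > len(data):
            break  # segment table not fully buffered yet
        # Each segment table entry is the byte length of one segment.
        body_len = sum(data[offset + HEADER_SIZE:table_end])
        page_len = HEADER_SIZE + n_segments + body_len
        if offset + page_len > len(data):
            break  # page body not fully buffered yet
        yield offset, page_len, granule, header_type
        offset += page_len
```

A caller could accumulate incoming bytes, forward only whole pages to the current session, and switch sessions at a reported page boundary; whether the recognizer decodes the continuation correctly would still need to be tested.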

How to Google Speech-to-Text using Blob sent from Browser to Nodejs Server

Question: I am trying to set up a server that receives audio from a client browser over SocketIO, processes it through Google Speech-to-Text, and replies to the client with the text. Originally, and ideally, I wanted it to work somewhat like the tool on this page: https://cloud.google.com/speech-to-text/ I tried using getUserMedia and streaming it through SocketIO-Stream, but I couldn't figure out how to 'pipe' a MediaStream. Instead, I've now decided to use MediaRecorder on the

What's the best way to use Google credentials for a production app?

Question: I'm building a C# .NET application for speech-to-text (STT), and I'm creating credentials manually. I find the documentation hugely confusing, and I don't know how to add the credentials properly. I created a project, generated a JSON credential, downloaded it to a folder, and pointed GoogleCredential at that file manually for authorization, and everything works. But this can't be the solution for a shipped app. Current approach: GoogleCredential credentials = GoogleCredential.FromFile(Path

Error while importing google cloud videointelligence: ImportError: cannot import name 'init_grpc_aio' from 'grpc._cython.cygrpc'

Source: https://stackoverflow.com/questions/63434140/error-while-importing-google-cloud-videintelligence-importerror-cannot-import

google cloud speech ImportError: cannot import name 'enums'

Question: I'm using the google-cloud-speech API for my project, with pipenv for the virtual environment. I installed google-cloud-speech with pipenv install google-cloud-speech and pipenv update google-cloud-speech, following these docs: https://cloud.google.com/speech-to-text/docs/reference/libraries This is my code: google.py: #!/usr/bin/env python # coding: utf-8 import argparse import io import sys import codecs import datetime import locale import os from google.cloud import speech_v1 as speech

'Audio data must be audio data' error with google speech recognition in python

Question: I am trying to load an audio file in Python and process it with Google speech recognition. The problem is that, unlike C++, Python doesn't show data types or classes, or give you access to memory so that you can convert between data types by creating a new object and repacking the data. I don't understand how to convert from one data type to another in Python. The code in question is below: import speech_recognition as spr import librosa audio, sr = librosa.load('sample_data/metal.mp3
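The excerpt is cut off before the failing call, so the following is only a sketch under an assumption: that the float samples returned by `librosa.load` were handed to the recognizer directly. In that case one possible conversion is to repack the floats as 16-bit PCM bytes, which is the kind of raw frame data that `speech_recognition.AudioData(frame_data, sample_rate, sample_width)` wraps. The helper name `floats_to_pcm16` is hypothetical, and the example uses only the standard library so it runs without either package installed.

```python
import struct


def floats_to_pcm16(samples):
    """Repack floats in [-1.0, 1.0] as little-endian signed 16-bit PCM bytes.

    Values outside the range are clipped rather than wrapped.
    """
    clipped = (max(-1.0, min(1.0, float(s))) for s in samples)
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)


if __name__ == "__main__":
    pcm = floats_to_pcm16([0.0, 0.25, -0.25, 1.0])
    print(len(pcm), "bytes of 16-bit PCM")
    # Assumed follow-up, not run here (needs the speech_recognition package):
    #   audio_data = spr.AudioData(pcm, sample_rate, 2)  # 2 bytes per sample
```

The sample width of 2 matches the 16-bit packing above; a different packing would need a different width.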

Can the Google Speech API be configured to return only numbers / letters?

Question: Can the Google Speech API be configured to return only numbers and letters, as opposed to full words? The use case is transcribing Canadian postal codes, e.g. M 1 B 0 R 3, for which Google may return "Em 1 Be 0 Are 3". We have tried: Using speechContexts and feeding in the letters A-Z as individual phrases. This improved the accuracy for us. We did not have much success passing in individual numbers (e.g. 1, 2, 3). Specifying the codec and sample rate of our WAV file using the encoding and sampleRateHertz
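For readers unfamiliar with the fields mentioned, here is a minimal sketch of what such a recognition config might look like as a REST-style JSON body, with the letters A-Z and digits 0-9 supplied as individual phrase hints. The helper name `build_recognition_config`, the `LINEAR16` encoding, the `en-CA` language code, and the 16000 Hz rate are assumptions chosen for illustration; the question does not state the actual values used, and nothing here guarantees letter-only output.

```python
import json
import string


def build_recognition_config(sample_rate_hertz: int,
                             language_code: str = "en-CA") -> dict:
    """Build a config dict with single letters and digits as phrase hints."""
    phrases = list(string.ascii_uppercase) + list(string.digits)
    return {
        "encoding": "LINEAR16",               # assumed: uncompressed PCM WAV
        "sampleRateHertz": sample_rate_hertz,  # must match the audio file
        "languageCode": language_code,         # assumed locale
        "speechContexts": [{"phrases": phrases}],
    }


if __name__ == "__main__":
    # 16000 Hz is an illustrative value, not taken from the question.
    print(json.dumps(build_recognition_config(16000), indent=2))
```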