google-speech-api

NodeJS Convert Int16Array binary Buffer to LINEAR16 encoded raw stream for Google Speech API

Submitted by 回眸只為那壹抹淺笑 on 2019-12-20 06:34:59
Question: I'm trying to convert speech to text on a Node server, where the speech recording happens in the browser using AudioContext. I'm able to send an Int16Array buffer (the recorded data) to my Node server through a WebSocket connection with binaryType: "arraybuffer". this.processor.onaudioprocess = (e) => { // this.processAudio(e) for ( var float32Array = e.inputBuffer.getChannelData(0) || new Float32Array(this.bufferSize), len = float32Array.length, int16Array = new Int16Array(len); len--;) int16Array[len] = 32767
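The loop in the excerpt is truncated, but the underlying conversion it performs is standard: clamp each normalized float sample to [-1.0, 1.0], scale it to a signed 16-bit integer, and pack the result little-endian to get the LINEAR16 raw stream the API expects. A minimal Python sketch of that same transformation (the function names are mine, not from the question):

```python
import struct

def float32_to_int16(samples):
    """Convert normalized float samples in [-1.0, 1.0] to signed 16-bit values."""
    out = []
    for s in samples:
        s = max(-1.0, min(1.0, s))   # clamp to the valid range
        out.append(int(s * 32767))   # scale to signed 16-bit
    return out

def to_linear16_bytes(samples):
    """Pack the int16 samples as little-endian bytes (a LINEAR16 raw stream)."""
    ints = float32_to_int16(samples)
    return struct.pack("<%dh" % len(ints), *ints)
```

The same clamp-and-scale applies in the browser-side JavaScript; the only extra concern over a WebSocket is that the bytes arrive in the same (little-endian) order the server assumes.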

Google-speech-api transcribing spoken numbers incorrectly

Submitted by 守給你的承諾、 on 2019-12-18 05:08:48
Question: I started using the Google Speech API to transcribe audio. The audio being transcribed contains many numbers spoken one after the other, e.g. 273 298, but the transcription comes back as 270-3298. My guess is that it is interpreting them as some sort of phone number. What I want is unparsed output, e.g. "two seventy three two ninety eight", which I can deal with and parse on my own. Is there a setting or support for this kind of thing? Thanks. Answer 1: So I had this exact same problem and I think we found a
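The answer is truncated above, but one knob worth trying is `speechContexts`, a real field of the v1 `RecognitionConfig` that supplies phrase hints. Whether hints like the spelled-out numbers actually prevent the phone-number grouping seen in the question is an assumption to verify against your own audio; the sketch below only shows how the request body would be shaped:

```python
import json

def build_config(phrases):
    """Recognize-request body with phrase hints; languageCode is a placeholder."""
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": 16000,
            "languageCode": "en-US",
            "speechContexts": [{"phrases": phrases}],
        }
    }

body = build_config(["two seventy three", "two ninety eight"])
print(json.dumps(body, indent=2))
```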

Base64 decoding failed for Google Speech API

Submitted by 痴心易碎 on 2019-12-13 16:32:23
Question: I tried to send a POST request to https://speech.googleapis.com/v1/speech:recognize using the JSON and the code fragment below. Google responded that it failed to decode the Base64 in my request. { "config": { "encoding": "LINEAR16", "sampleRateHertz": 16000, "languageCode": "ja-JP", "maxAlternatives": 5, "profanityFilter": false }, "audio": { "content": "ZXCVBNM" }, } String pcmFilePath = "/storage/emulated/0/Download/voice8K16bitmono.pcm"; File rawFile = new File(pcmFilePath); byte[]
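One observable problem in the excerpt: the placeholder "ZXCVBNM" is 7 characters long, and standard Base64 requires a length that is a multiple of 4, so a strict decoder will reject it regardless of the audio. The fix is to encode the actual PCM bytes with a standard (RFC 4648) Base64 encoder. A hedged Python sketch of building the request body (the function name is mine):

```python
import base64

def build_recognize_request(raw_pcm):
    """Embed raw LINEAR16 PCM bytes as standard Base64 in a recognize body."""
    content = base64.b64encode(raw_pcm).decode("ascii")
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": 16000,
            "languageCode": "ja-JP",
        },
        "audio": {"content": content},
    }
```

In Java the equivalent is `Base64.getEncoder().encodeToString(bytes)`; URL-safe or line-wrapped Base64 variants can also trigger decode failures on the server side.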

Google Speech API returns NULL

Submitted by 徘徊边缘 on 2019-12-13 06:48:48
Question: Trying to develop a speech-to-text application using Google's API with the code below. import java.io.BufferedReader; import java.io.DataOutputStream; import java.io.InputStreamReader; import java.net.HttpURLConnection; import java.net.URL; import java.nio.file.Files; import java.nio.file.Path; import java.nio.file.Paths; import org.testng.annotations.Test; public class Speech2Text_Test { @Test public void f() { try{ Path path = Paths.get("out.flac"); byte[] data = Files.readAllBytes(path); String
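A NULL result from the recognize endpoint often just means the response body had no `results` array (the API returns an empty `{}` when nothing was recognized), so the response parser needs to handle that case rather than index into it blindly. A sketch of defensive parsing, shown in Python for brevity (the function name is mine; the same null checks apply to the Java code above):

```python
import json

def extract_transcript(response_text):
    """Pull the top transcript out of a recognize JSON response; an empty
    {} body (no speech recognized) yields None instead of raising."""
    data = json.loads(response_text)
    results = data.get("results", [])
    if not results:
        return None
    alternatives = results[0].get("alternatives", [])
    return alternatives[0].get("transcript") if alternatives else None
```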

Google Cloud Speech API - certificate verify failed in Python

Submitted by 冷暖自知 on 2019-12-13 06:20:35
Question: I'm using the SpeechRecognition library. import speech_recognition as sr AUDIO_FILE = 'test_audio.wav' with open("api-key.json") as f: GOOGLE_CLOUD_SPEECH_CREDENTIALS = f.read() r = sr.Recognizer() with sr.AudioFile(AUDIO_FILE) as source: audio = r.record(source) print('Starting recognition...') print(r.recognize_google_cloud(audio, credentials_json=GOOGLE_CLOUD_SPEECH_CREDENTIALS)) print('Completed') When the above code is run, an error occurs: ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED]
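CERTIFICATE_VERIFY_FAILED frequently traces back to a Python installation that cannot find a usable CA bundle (a common situation with the python.org installer on macOS, fixable by running its "Install Certificates.command"). A hedged workaround is to point the default SSL context at the `certifi` bundle when it is installed, falling back to the system store otherwise; this sketch only builds the context and does not patch the SpeechRecognition library itself:

```python
import ssl

def make_verified_context():
    """Default SSL context, preferring the certifi CA bundle when available."""
    try:
        import certifi  # third-party (pip install certifi); may be absent
        return ssl.create_default_context(cafile=certifi.where())
    except ImportError:
        return ssl.create_default_context()

ctx = make_verified_context()
```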

Google Speech API streaming audio exceeding 1 minute

Submitted by 一曲冷凌霜 on 2019-12-13 05:12:44
Question: I would like to be able to extract utterances of a person from a stream of telephone audio. The phone audio is routed to my server, which then creates a streaming recognition request. How can I tell when a word exists as part of a complete utterance or is part of an utterance currently being transcribed? Should I compare timestamps between words? Will the API continue to return interim results even if there is no speech for a certain amount of time in the streaming phone audio? How can I
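Comparing timestamps between words is a workable approach: with `enableWordTimeOffsets` set on the streaming config, each word in a final result carries start/end offsets, and a silence gap between one word's end and the next word's start can be treated as an utterance boundary. A sketch of that post-processing, where the 1.0-second gap threshold is an assumption to tune:

```python
def group_utterances(words, gap=1.0):
    """Group (word, start_sec, end_sec) tuples into utterances whenever the
    silence between consecutive words exceeds `gap` seconds."""
    utterances, current = [], []
    last_end = None
    for word, start, end in words:
        if last_end is not None and start - last_end > gap:
            utterances.append(current)  # gap found: close the utterance
            current = []
        current.append(word)
        last_end = end
    if current:
        utterances.append(current)
    return utterances
```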

Google Speech Cloud error on Android: OUT_OF_RANGE: Exceeded maximum allowed stream duration of 65 seconds

Submitted by ╄→尐↘猪︶ㄣ on 2019-12-13 04:19:21
Question: First: I already know there is a 65-second limit on continuous speech recognition streaming with this API. My goal is NOT to extend those 65 seconds. My app: it uses Google's streaming speech recognition; I based my code on this example: https://github.com/GoogleCloudPlatform/android-docs-samples/tree/master/speech The app works fairly well: I get ASR results and show them onscreen as the user speaks, Siri-style. The problem: my problem comes after tapping the ASR button on my app several,
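The usual way apps stay under the 65-second cap is client-side bookkeeping: record when each streaming request was opened and close/reopen it before the server's deadline. The sketch below shows only that timing logic (in Python rather than the Android code, and with a 60-second safety margin and injectable clock that are my assumptions), not the gRPC plumbing:

```python
import time

class StreamRestarter:
    """Tracks a streaming request's age so it can be restarted before the
    server-side ~65-second cap is reached."""
    def __init__(self, limit_sec=60.0, clock=time.monotonic):
        self.limit = limit_sec
        self.clock = clock
        self.started_at = None

    def start(self):
        """Call when a new streaming request is opened."""
        self.started_at = self.clock()

    def needs_restart(self):
        """True once the current stream has been open for limit_sec or more."""
        return self.started_at is not None and (
            self.clock() - self.started_at >= self.limit)
```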

How do I use Google Speech API to access a file in Google Cloud Storage?

Submitted by 倖福魔咒の on 2019-12-11 17:24:47
Question: I am using Visual Studio 2019 on Windows 10 for a .NET console C# project using the Google Speech API. I have the following code: class Program { static void Main(string[] args) { var URI = "https://speech.googleapis.com/v1/speech:recognize?key=AIzaSyANbpQ1iy-Ced72r7xgPVHuNZI5FAVIPjY&audio=audio.flac"; Console.WriteLine("Start!"); AsyncRecognizeGcs(URI); Console.WriteLine("End."); } static object AsyncRecognizeGcs(string storageUri) { var speech = SpeechClient.Create(); var longOperation = speech
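A likely issue in the excerpt: the code passes the REST URL (with the audio as a query parameter) to a method that expects a storage URI. For audio in Google Cloud Storage, the recognize request references the file in the request body as a `gs://bucket/object` URI instead. A sketch of that body shape, with placeholder bucket and object names:

```python
def build_gcs_recognize_request(bucket, object_name):
    """Recognize-request body referencing audio already uploaded to Cloud
    Storage; bucket and object_name are placeholders."""
    return {
        "config": {
            "encoding": "FLAC",
            "languageCode": "en-US",
        },
        "audio": {"uri": "gs://%s/%s" % (bucket, object_name)},
    }
```

With the C# client, the equivalent is passing the `gs://` URI via `RecognitionAudio.FromStorageUri` rather than an HTTPS URL.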

Cannot call SpeechClient.recognize(RecognizeRequest request): Throwing Exception

Submitted by 烈酒焚心 on 2019-12-11 16:54:18
Question: This is my first time posting, so I'm not too familiar with the rules, but here goes. I've been trying to get the Google Cloud Speech API to work on Android, but to no avail. The same code works just fine in plain Java, but not on Android. My code runs fine until I call the recognize method using a speech client. Here is the error: 11-02 18:38:03.922 6959-6982/capstone.speechrecognitionsimple E/AndroidRuntime: FATAL EXCEPTION: AsyncTask #1 Process: capstone.speechrecognitionsimple, PID: 6959 java

Specify Region for Google Speech API?

Submitted by 拜拜、爱过 on 2019-12-11 14:47:29
Question: We are using the Google Speech API as part of our service. Due to new GDPR rules we have to make sure none of our data leaves the EU. All other services seem to be able to specify a region, including Google Cloud Storage. However, I haven't been able to find any documentation related to the Google Speech API. Does anybody know if it is possible to specify a region for the Google Speech API to avoid sending our data outside the EU? Answer 1: Found my answer: https://cloud.google.com/about/locations/?region=europe
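Beyond the locations page, Speech-to-Text exposes multi-region endpoints (`eu-speech.googleapis.com` and `us-speech.googleapis.com`) that keep processing in the chosen region; with the google-cloud-speech client these are selected via `client_options`. Shown here as plain data so the sketch stays dependency-free; check the current endpoint names against Google's documentation before relying on them:

```python
def eu_client_options():
    """client_options dict pinning the client to the EU regional endpoint."""
    return {"api_endpoint": "eu-speech.googleapis.com"}

# With the real library (not imported here), roughly:
# from google.cloud import speech
# client = speech.SpeechClient(client_options=eu_client_options())
```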