speech-to-text

Python speech recognition error converting mp3 file

跟風遠走 · Submitted on 2021-02-16 08:51:02
Question: My first try at audio to text.

    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.AudioFile("/path/to/.mp3") as source:
        audio = r.record(source)

When I execute the above code, the following error occurs:

    <ipython-input-10-72e982ecb706> in <module>()
    ----> 1 with sr.AudioFile("/home/yogaraj/Documents/Python workouts/Python audio to text/show_me_the_meaning.mp3") as source:
          2     audio = sr.record(source)
          3
    /usr/lib/python2.7/site-packages/speech_recognition/__init__.pyc in __enter__(self)
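The usual cause of this error is that `sr.AudioFile` only reads WAV, AIFF, and FLAC natively; it cannot open MP3, so `__enter__` fails. A minimal sketch of a workaround is to convert the file to WAV first. The helper names below are my own illustration, and `to_wav` assumes the optional `pydub` package (plus ffmpeg on the PATH) is available:

```python
import os

# Formats that speech_recognition's AudioFile can read natively.
SUPPORTED_EXTENSIONS = {".wav", ".aiff", ".aif", ".flac"}

def needs_conversion(path):
    """Return True if the file must be converted before sr.AudioFile can open it."""
    return os.path.splitext(path)[1].lower() not in SUPPORTED_EXTENSIONS

def to_wav(mp3_path, wav_path):
    """Convert an MP3 to WAV with pydub (an extra dependency; needs ffmpeg)."""
    from pydub import AudioSegment  # imported lazily so the check above works without it
    AudioSegment.from_mp3(mp3_path).export(wav_path, format="wav")
    return wav_path
```

After conversion, `sr.AudioFile(wav_path)` opens the file normally and `r.record(source)` proceeds as in the question.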

Static IP to access GCP Machine Learning APIs via gRPC stream over HTTP/2

♀尐吖头ヾ · Submitted on 2021-02-11 13:01:44
Question: We are behind a corporate proxy/firewall that can only consume static IP rules, not FQDNs. For our project, we need to access the Google Speech-to-Text API: https://speech.googleapis.com . From outside the corporate network, we use a gRPC stream over HTTP/2 to do that. The ideal scenario looks like: corporate network -> static IP in GCP -> forwarded gRPC stream to speech.googleapis.com. What we have tried is creating a global static external IP, but we failed when configuring the Load Balancer,

How to reconstruct a conversation from Watson Speech-to-Text output?

那年仲夏 · Submitted on 2021-02-11 12:53:28
Question: I have the JSON output from Watson's Speech-to-Text service, which I have converted into a list and then into a pandas DataFrame. I'm trying to work out how to reconstruct the conversation (with timings), akin to the following:

    Speaker 0: Said this [00.01 - 00.12]
    Speaker 1: Said that [00.12 - 00.22]
    Speaker 0: Said something else [00.22 - 00.56]

My DataFrame has a row for each word, and columns for the word, its start/end time, and the speaker tag (either 0 or 1). words = [['said', 0.01, 0.06
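Since consecutive rows share a speaker tag until the speaker changes, one way to rebuild utterances is to group consecutive words by speaker and take the first word's start time and the last word's end time. A self-contained sketch, using made-up rows in the shape the question describes (word, start, end, speaker); the timestamp formatting differs slightly from the question's example:

```python
from itertools import groupby

# Hypothetical rows in the shape described: (word, start, end, speaker).
words = [
    ("said", 0.01, 0.06, 0),
    ("this", 0.06, 0.12, 0),
    ("said", 0.12, 0.18, 1),
    ("that", 0.18, 0.22, 1),
    ("said", 0.22, 0.40, 0),
    ("something", 0.40, 0.50, 0),
    ("else", 0.50, 0.56, 0),
]

def reconstruct(rows):
    """Merge consecutive words by the same speaker into utterances with timings."""
    lines = []
    for speaker, run in groupby(rows, key=lambda r: r[3]):
        run = list(run)
        text = " ".join(r[0] for r in run)
        lines.append(f"Speaker {speaker}: {text} [{run[0][1]:.2f} - {run[-1][2]:.2f}]")
    return lines

for line in reconstruct(words):
    print(line)
```

With the DataFrame itself, the same grouping is commonly done by building a segment id with `(df.speaker != df.speaker.shift()).cumsum()` and aggregating via `groupby`.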

Google speech to text node-record-lpcm16 stream error

梦想的初衷 · Submitted on 2021-02-11 12:27:09
Question: I am setting up Google's speech-to-text in a Node/Express environment on Google App Engine. I have an Angular app that communicates with the server via WebSockets. This all works perfectly on localhost, but when my Angular app points to the App Engine instance it does not. It can connect fine (it sends connection messages back and forth), and it runs my Google Speech connection fine. However, I get an error in the part where I try to access the mic stream. The error message isn't much use: ERROR with

Change default language for Speech recognition in my app

北城以北 · Submitted on 2021-02-08 07:22:32
Question: I am making an app in English. My app uses speech recognition, but if I install the app on a device with another system language (French or Russian, for example), speech recognition doesn't work; it works only for the language that is the system default. How can I make English the default language for speech recognition in my app? I found this method, but it doesn't work:

    Locale myLocale;
    myLocale = new Locale("English (US)", "en_US");
    Locale.setDefault(myLocale);
    android.content.res.Configuration

Speech-to-text large audio files [Microsoft Speech API]

时光怂恿深爱的人放手 · Submitted on 2021-02-07 18:42:46
Question: What is the best way to transcribe medium/large audio files (roughly 6-10 minutes each) using the Microsoft Speech API? Something like batch audio file transcription? I have used the code provided in https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-to-text-sample for continuously transcribing speech, but it stops transcribing at some point. Is there any restriction on the transcription? I am only using the free trial account at the moment. By the way, I assume there is no difference
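A frequent reason transcription "stops" is that single-shot recognition (`recognize_once`) returns after the first utterance or pause; for long files, the SDK's `start_continuous_recognition()` keeps recognizing until explicitly stopped. Another common workaround is to split a long recording into chunks and transcribe each one. A sketch of only the chunk-boundary arithmetic, where the function name and its defaults are my own illustration:

```python
def chunk_bounds(duration_s, chunk_s=55.0, overlap_s=1.0):
    """Yield (start, end) second offsets covering duration_s, with a small
    overlap so a word straddling a cut point is not lost."""
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        yield (start, end)
        if end >= duration_s:
            break
        start = end - overlap_s
```

Each `(start, end)` window can then be cut from the audio and submitted as a separate recognition request, with the overlapping words de-duplicated when stitching the transcripts together.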

pocketsphinx - how to switch from keyword spotting to grammar mode

回眸只為那壹抹淺笑 · Submitted on 2021-02-07 09:01:02
Question: I'm using pocketsphinx with a Raspberry Pi for home automation. I've written a simple JSGF grammar file with the supported commands. Now I want to use an activation phrase such as "hey computer" before the commands, to avoid false detections and only perform speech recognition once the activation phrase has been spoken. If I'm not getting this wrong, pocketsphinx supports two modes of speech recognition: keyword-spotting mode, and language-model/JSGF-grammar mode. In the pocketsphinx FAQ, when
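The pocketsphinx Python API lets you register several named searches up front (e.g. `decoder.set_kws(...)` for the wake phrase and `decoder.set_jsgf_file(...)` for the command grammar) and then swap the active one with `decoder.set_search(name)` at utterance boundaries. A decoder-free sketch of just the switching logic, where the class and method names are my own:

```python
LISTEN_KEYWORD, LISTEN_GRAMMAR = "keyword", "grammar"

class ModeSwitcher:
    """Tracks which pocketsphinx search should be active.

    A real setup would register both searches on the decoder up front and
    call decoder.set_search(switcher.mode) after each end of utterance."""

    def __init__(self, activation_phrase="hey computer"):
        self.activation_phrase = activation_phrase
        self.mode = LISTEN_KEYWORD

    def on_hypothesis(self, text):
        """Feed each recognition hypothesis; returns the search to activate next."""
        if self.mode == LISTEN_KEYWORD:
            if text and self.activation_phrase in text.lower():
                self.mode = LISTEN_GRAMMAR  # wake phrase heard: listen for a command
        else:
            self.mode = LISTEN_KEYWORD  # one command handled: back to keyword spotting
        return self.mode
```

The main loop stays in keyword spotting (which tolerates background speech) and only enters the stricter grammar search for the single utterance after the wake phrase.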

Azure Speech SDK Speech to text from stream using python

允我心安 · Submitted on 2021-01-29 20:42:21
Question: I am trying to send the stream from the UI to a Python API as a stream, and I need the Python Azure Speech logic to convert the speech to text. I am not sure how to use a pull/push audio input stream for speech to text.

Answer 1: There is a sample for using the Cognitive Services Speech SDK. Specifically, for using it with a pull stream you may refer to speech_recognition_with_pull_stream(), and for using it with a push stream you may refer to speech_recognition_with_push_stream(). Hope it helps.

Answer 2: In my case
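With the Azure Speech SDK, the push-stream route mentioned in the answer boils down to creating a `speechsdk.audio.PushAudioInputStream()`, handing it to the recognizer via `AudioConfig(stream=...)`, calling `stream.write(chunk)` for each block of raw PCM bytes arriving from the UI, and `stream.close()` when the upload ends. A sketch of only the framing step, where the frame size is an assumption (3200 bytes is 100 ms of 16 kHz, 16-bit mono PCM):

```python
def frames(pcm_bytes, frame_bytes=3200):
    """Split raw PCM into fixed-size frames, the shape typically fed to
    PushAudioInputStream.write(); 3200 bytes = 100 ms at 16 kHz, 16-bit mono."""
    for i in range(0, len(pcm_bytes), frame_bytes):
        yield pcm_bytes[i:i + frame_bytes]
```

In the WebSocket handler, each incoming audio message would be run through `frames()` and written to the push stream while the recognizer runs continuously in the background.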