speech-recognition | 易学教程

Automate speech input recording in Chrome

I'm trying to automate speech recording with Google's speech input (which only works in Chrome). As it stands, the user has to click the mic to start recording, but I'm working on an installation where the user won't interact with the computer, so I have to trigger the recording some other way. As far as I can tell, you can't access the speech input functionality from code, i.e. you can't call a function to start recording. So now I'm looking at simulating a mouse click on the mic. I've tried using JavaScript, but it seems only events and event handlers are affected (e.g. a simulated click on an input

Parse speech output to a JSON to call Application API

Here is an idea: we have web applications with exposed RESTful APIs that accept JSON. How about using the Google speech APIs to take the user's voice input, convert it to text, then somehow translate that text into the JSON required by the APIs, and then call those application APIs with that JSON? Is there any library to translate text to a specified JSON format? Has anybody used this approach? This is called "intent analysis". There are libraries for it, for example RASA. For example, if your input is "show me chinese restaurants", the output would be { "text": "show me chinese restaurants", "intent": "restaurant_search
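
To make the "intent analysis" step concrete, here is a minimal Python sketch of turning recognized text into a JSON payload. It does not use RASA or any speech API: it is a toy keyword matcher, and the intent table, the cuisine list, the `parse_intent` function and the `entities` field are all hypothetical names chosen for illustration. A real intent library would replace the matching logic with a trained model.

```python
import json
import re

# Hypothetical intent table: intent name -> keywords that trigger it.
INTENT_KEYWORDS = {
    "restaurant_search": ("restaurant", "restaurants", "eat", "food"),
    "greet": ("hello", "hi", "hey"),
}

# Hypothetical vocabulary for a "cuisine" slot.
CUISINES = ("chinese", "italian", "mexican", "indian")


def parse_intent(text: str) -> dict:
    """Map recognized text to an intent payload using simple keyword matching."""
    tokens = re.findall(r"[a-z']+", text.lower())

    # Pick the first intent whose keyword list contains any token.
    intent = None
    for name, keywords in INTENT_KEYWORDS.items():
        if any(tok in keywords for tok in tokens):
            intent = name
            break

    # Fill the (assumed) cuisine slot with the first matching token, if any.
    entities = {}
    for tok in tokens:
        if tok in CUISINES:
            entities["cuisine"] = tok
            break

    return {"text": text, "intent": intent, "entities": entities}


if __name__ == "__main__":
    # The JSON string printed here is what would be sent to the application API.
    print(json.dumps(parse_intent("show me chinese restaurants")))
```

The dictionary returned by `parse_intent` can then be serialized with `json.dumps` and posted to whichever REST endpoint handles that intent; text that matches no keyword yields `"intent": null`, which the caller should treat as "not understood".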

Speech recognition (web) services?

Question: I have a buffer of audio and I'd like to perform speech recognition/transcription on it. I have limited CPU and RAM locally, so I want to perform recognition on a server. Are there any (web) services that allow me to do this? My searches so far have led nowhere... Answer 1: Google has just introduced browser-based access to its speech engine through HTML5. http://slides.html5rocks.com/#speech-input To get this page to work, I launched the Chromium browser as follows in Ubuntu: $ chromium-browser -

PlatformNotSupportedException Using .NET Speech Recognition

I'm trying out voice recognition in C# using System.Speech.Recognition. I searched around the internet and tried several pieces of code for basic speech recognition; the best one I could find was this: using System; using System.Text; using System.Windows.Forms; using System.Speech.Recognition; namespace SpeechRecognition { public partial class MainForm : Form { SpeechRecognitionEngine recognitionEngine; public MainForm() { InitializeComponent(); Initialize(); } private void Initialize() { recognitionEngine = new SpeechRecognitionEngine(); recognitionEngine

408 Request timed out Microsoft Speech to Text

My .wav file is just 4 seconds long. Even after multiple retries and running it in the cloud, I am constantly getting the following error: * upload completely sent off: 12 out of 12 bytes < HTTP/1.1 408 Request timed out (> 14000 ms) < Transfer-Encoding: chunked < Content-Type: text/plain < Server: Microsoft-IIS/8.5 < X-MSEdge-Ref: Has anybody faced this issue? This is my request: `curl -v "https://speech.platform.bing.com/recognize?scenarios=catsearch&appid=D4D52672-91D7-4C74-8AD8-42B1D98141A5&locale=en-US&device.os=wp7&version=3.0&format=json&requestid=1d4b6030-9099-12e0-91e4-0800200c9a67

Speech to Phoneme in .Net

The problem is that I want to get the phonemes of an audio speech sample in C#. Say you have an audio file like "x.wav" that says "hello dear Shamim". I want to extract all the phonemes of the speech and their relative timings, something like the picture below: I used the System.Speech library (both the recognition and synthesis namespaces) but I didn't find what I wanted. Don't be mistaken: I don't want the phonemes of the sentence "hello dear Shamim"; I want to extract the phonemes from an unknown audio input that speaks an English sentence. I tried System.Speech.Recognition but it tries to

Improve Speech Recognition, C#

I use the System.Speech library to recognize speech, but what it recognizes is usually very different from what was said. SpeechRecognizer _rec = new SpeechRecognizer(); DictationGrammar grammar = new DictationGrammar(); grammar.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(grammar_SpeechRecognized); _rec.LoadGrammar(grammar); How can I improve the recognition? Is it related to the Grammar class? If you can afford to ask users to go through the training process, that will certainly yield much better results. I have used it myself (and I have an accent) and it significantly improved the accuracy
