speech-recognition

Acoustic training using SAPI 5.3 Speech API

萝らか妹, submitted on 2019-11-27 13:20:58
Using Microsoft's SAPI 5.3 Speech API on Vista, how do you programmatically do acoustic model training of a RecoProfile? More concretely, if you have a text file and an audio file of a user speaking that text, what sequence of SAPI calls would you make to train the user's profile using that text and audio? Update: More information about this problem, which I still haven't solved: You call ISpRecognizer2.SetTrainingState( TRUE, TRUE ) at "the beginning" and ISpRecognizer2.SetTrainingState( FALSE, TRUE ) at "the end." But it is still unclear just when those actions have to happen relative to other

Mac OS X speech to text API. Howto?

自古美人都是妖i, submitted on 2019-11-27 12:32:11
Question: I have a program that receives a mono audio stream over TCP/IP. I am wondering whether the speech-recognition API in Mac OS X would be able to do a speech-to-text conversion for me. (I don't mind saving the audio into a .wav file first and reading it back, as opposed to doing the conversion on the fly.) I have read the official docs online, but they are a bit confusing, and I couldn't find any good example on this topic. Also, should I do it in Cocoa/Carbon/Java or Objective-C? Can someone please
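The "save to .wav first" step mentioned in the question can be sketched as follows. This is only an illustration in Python, not the Mac OS X speech API the question asks about, and it assumes the TCP stream carries uncompressed 16-bit PCM at 16 kHz; the question states neither the sample format nor the rate, so both defaults are assumptions to adjust.

```python
import io
import wave


def pcm_to_wav(pcm_bytes, sample_rate=16000, sample_width=2):
    """Wrap raw mono PCM bytes in a WAV container and return the file bytes.

    sample_rate and sample_width are assumed defaults (16 kHz, 16-bit);
    they must match whatever the sender actually transmits.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)             # mono, as stated in the question
        wav_file.setsampwidth(sample_width)  # bytes per sample
        wav_file.setframerate(sample_rate)   # samples per second
        wav_file.writeframes(pcm_bytes)      # header sizes are fixed up on close
    return buffer.getvalue()
```

The returned bytes can be written to disk with an ordinary binary file write and then handed to whichever recognizer is chosen.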

Google's voice search speech recognition service

徘徊边缘, submitted on 2019-11-27 11:57:37
Google has speech recognition services available for use from mobile phones (Android has it built in, iPhone users can use the Google application) - http://www.google.com/mobile/ . We've found one article where someone tried to reverse engineer the service at http://waxy.org/2008/11/deconstructing_google_mobiles_voice_search_on_the_iphone/ . We want to better understand what is happening over the network when we use Android's RecognizerIntent. Does anyone have any experience using this service over the web or know of other articles that may explain its workings? I read this presentation few

Android speech recognizing and audio recording in the same time

99封情书, submitted on 2019-11-27 11:57:11
My application records audio using the MediaRecorder class in an AsyncTask and also uses the Google API to transform speech to text (RecognizerIntent), using the code from this question: How can I use speech recognition without the annoying dialog in android phones. I have also tried recording audio in a Thread, but that is a worse solution and causes more problems. My problem is that my application works properly on the emulator, but the emulator doesn't support speech recognition because it lacks voice recognition services. And on my device my application crashes when I start recording audio and speech

How to mix Grammar (Rules) & Dictation (Free speech) with SpeechRecognizer in C#

好久不见., submitted on 2019-11-27 11:46:53
Question: I really like Microsoft's latest speech recognition (and SpeechSynthesis) offerings. http://msdn.microsoft.com/en-us/library/ms554855.aspx http://estellasays.blogspot.com/2009/04/speech-recognition-in-cnet.html However, I feel somewhat limited when using grammars. Don't get me wrong, grammars are great for telling the speech recognition exactly what words or phrases to look out for, but what if I want it to recognise something I've not given it a heads-up about? Or I want to parse a

Android App Integrated with OK Google

爷,独闯天下, submitted on 2019-11-27 11:42:30
Is there a way to issue a voice command, something like "OK GOOGLE ASK XXX Some App Specific Question or Command", and have it launch "APP" with the recognized text "Some App Specific Question or Command"? My app has speech recognition as a service, but when using my APP I can't ask questions that OK Google can handle. ianhanniballake: Through the Voice Actions API, your app can register for system actions, one of which is 'search' (so you could do 'search for Some Question or command on APP'). In the past, some developers were able to submit a custom voice action request. Upon approval,

Convert audio to text

自闭症网瘾萝莉.ら, submitted on 2019-11-27 11:21:00
Question: I just want to know if there are any built-in or external libraries in Java or C# that allow me to take an audio file, parse it, and extract the text from it. I need to make an application to do so, but I don't know where to start. Answer 1: Here are some of your options: Microsoft Speech, LumenVox, Dragon NaturallySpeaking, Sphinx4. Answer 2: Here is a complete example using C# and System.Speech. The code can be divided into 2 main parts: configuring the SpeechRecognitionEngine object

Google speech API [closed]

我的未来我决定, submitted on 2019-11-27 11:14:58
I'm now working on my project, and I'm about to build a Siri-like application for the desktop computer. Is the Google Speech API reliable and accurate for speech recognition? Can you suggest which speech API is the most accurate in terms of speech recognition? Preferably a free API. Thank you. Kevin Junghans: While the Google speech API is free, it is not an official public API. Some people have reverse engineered it, as is discussed in this blog. If you are planning on accessing the API directly for a commercial product I would not recommend it because they can drop it

How to use Speech Recognition inside the iOS SDK? [closed]

一世执手, submitted on 2019-11-27 10:31:54
I know that there is no public API for the Siri services, but is there an API for simple speech recognition? So if I have a text field and the user taps on it, a keyboard with the typical microphone button appears, and if they press it, the speech gets recognized and transformed into a string object? Or is this button maybe presented by default? Nishant Tyagi: There are many libraries available. You can use any of them: OpenEars (this is the best library), VocalKit (deprecated in favor of OpenEars), TTS, iSpeech (not free). Hope it helps you. NOTE: if you download OpenEars (which contains a

Speech Recognition API

送分小仙女□, submitted on 2019-11-27 10:11:27
Question: I need to automatically transcribe some short MP3s as part of a proof of concept I am working on. I am currently looking into cloud solutions or web API services that let me send the MP3 as a simple HTTP request and receive a transcription back. The only free/open source solution I have found is here, but the demos don't seem to work (at least not on the files I need to transcribe). I have found some enterprise solutions for call centers, but so far nothing I can simply integrate into a project. Are
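The "send the MP3 as a simple HTTP request and receive a transcription back" pattern described in the question might look roughly like the sketch below. It uses only Python's standard library. The endpoint URL is hypothetical and does not refer to any real service, and the assumption that the service accepts a raw `audio/mpeg` POST body and answers with UTF-8 text is for illustration only; a real provider's upload format and authentication will differ.

```python
import urllib.request

# Hypothetical endpoint, used only for illustration; substitute the URL of
# whichever transcription service is actually chosen.
TRANSCRIBE_URL = "https://speech.example.com/v1/transcribe"


def build_transcription_request(mp3_bytes, url=TRANSCRIBE_URL):
    """Build (but do not send) a POST request carrying the MP3 as its body."""
    return urllib.request.Request(
        url,
        data=mp3_bytes,
        headers={"Content-Type": "audio/mpeg"},
        method="POST",
    )


def transcribe(mp3_bytes, url=TRANSCRIBE_URL):
    """Send the MP3 and return the response body as text.

    Assumes the (hypothetical) service replies with a UTF-8 text body.
    """
    request = build_transcription_request(mp3_bytes, url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

Keeping request construction separate from sending makes it possible to inspect the request, or to swap in a different provider, without touching the network.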