speech-recognition

How to convert human voice into digital format?

Submitted by 北慕城南 on 2019-12-09 06:09:27
Question: I am working on a project where a biometric system is used to secure the system. We are planning to use the human voice to secure it. The idea is that the person says some words or sentences and the system stores that voice in digital format. The next time the person wants to enter the system, he/she has to speak some words, which may or may not differ from the words used earlier. We don't want to match the words but the voice frequency. I have read some research papers regarding this …
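What this describes is text-independent speaker verification: matching who is speaking rather than what is said. As a minimal sketch of the idea only (not the poster's design), assuming librosa and scipy are installed and the file names are placeholders, one could compare averaged MFCC feature vectors; production systems use far stronger models (GMM-UBM, i-vectors, neural speaker embeddings):

    # Toy speaker-similarity check via mean MFCC vectors (illustration only).
    # Assumes: pip install librosa scipy; file names are hypothetical.
    import librosa
    from scipy.spatial.distance import cosine

    def voice_embedding(path):
        """Load audio and summarize the voice as a mean MFCC vector."""
        y, sr = librosa.load(path, sr=16000)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)  # shape (20, frames)
        return mfcc.mean(axis=1)                            # average over time

    enrolled = voice_embedding("enrollment.wav")     # stored at registration
    attempt = voice_embedding("login_attempt.wav")   # spoken at login

    similarity = 1.0 - cosine(enrolled, attempt)
    print("similarity:", similarity)
    # The threshold must be tuned on real recordings; 0.9 is a placeholder.
    print("accepted" if similarity > 0.9 else "rejected")

Averaging the MFCCs discards word order entirely, which is exactly why this crude comparison is text-independent.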

Windows 10 Speech Recognition

Submitted by 好久不见. on 2019-12-09 04:55:26
Question: I want to create a WPF application in C# for Windows 10. The problem I had with previous Windows versions is that I'm Italian and there was no support for speech recognition in Italian. But now there is Cortana. So, how can I use Cortana's speech recognition engine in my application? If I simply use new SpeechRecognitionEngine(new CultureInfo("it-IT")); it gives me an error, because the simple recognition engine doesn't exist for Italian, so I have to use Cortana's. Hope you understood and …

Google Speech Recognition API: timestamp for each word?

Submitted by 为君一笑 on 2019-12-09 04:44:37
Question: It's possible to use Google's speech recognition API to get a transcription of an audio file (WAV, MP3, etc.) by making a request to http://www.google.com/speech-api/v2/recognize?... Example: I said "one two three four five" in a WAV file. The Google API gives me this: { u'alternative': [ {u'transcript': u'12345'}, {u'transcript': u'1 2 3 4 5'}, {u'transcript': u'one two three four five'} ], u'final': True } Question: is it possible to get the time (in seconds) at which each word has been …
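The unofficial speech-api/v2 endpoint quoted above returns no timings, but Google's current Cloud Speech-to-Text API can attach a start and end offset to every word. A minimal sketch with the official Python client (pip install google-cloud-speech); the file name, sample rate, and language code here are assumptions:

    # Per-word timestamps via Google Cloud Speech-to-Text (sketch).
    # Assumes Cloud credentials are configured and the WAV is 16 kHz LINEAR16.
    from google.cloud import speech

    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
        enable_word_time_offsets=True,  # request per-word timings
    )
    with open("count.wav", "rb") as f:
        audio = speech.RecognitionAudio(content=f.read())

    response = client.recognize(config=config, audio=audio)
    for result in response.results:
        for word in result.alternatives[0].words:
            print(word.word,
                  word.start_time.total_seconds(),
                  word.end_time.total_seconds())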

Speech recognition with Flash or Silverlight

Submitted by 泄露秘密 on 2019-12-09 02:19:24
Question: I'm developing a web user interface for entering some information that is not very complex but needs to be loaded in real time. I think the application could use speech recognition to make the task easier. The core of the interface is being built with JavaScript and jQuery, but it can easily include a Flash or Silverlight component; I believe that's probably the way to go... I don't need to recognize everything the user says, only a few prerecorded commands. Also, I don't want …

How to train a user who is using my code, which implements System.Speech and SpeechRecognitionEngine

Submitted by 爷,独闯天下 on 2019-12-09 00:50:14
Question: I have already written code using the System.Speech.Recognition namespace, with an XML SRGS file for the grammar and the SpeechRecognitionEngine. I want to be able to lead the user through training of the words or phrases that are important for the app I have written. I have just seen and read this: How to train SAPI. I understand that that example uses the unmanaged API (which exposes a little more), but it is exactly the same as far as the engine is concerned. So if I now set up a form and follow the …

Is there any way to send an audio file to speech-to-text recognition?

Submitted by 。_饼干妹妹 on 2019-12-09 00:28:59
Question: I want the Android speech recognition system to analyse an audio file instead of the default incoming voice from the microphone. Is there any way to do that? Thank you. Answer 1: I suppose it works in a similar way to the Chrome API - http://mikepultz.com/2011/03/accessing-google-speech-api-chrome-11/ As he mentions, you can convert the microphone recording into a .flac file and send it to the speech API, and you will get the same result. So you can use SOX and convert it yourself. Hope it helps. Dias Answer 2: …
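A sketch of the FLAC-and-POST approach from Answer 1 in Python; the endpoint, key parameter, and rate header follow the linked blog post's description of the unofficial Chrome API and are assumptions rather than a supported interface:

    # Convert a recording with SoX and post it to the unofficial endpoint.
    # Assumes sox is installed and `pip install requests`; the key is a placeholder.
    import subprocess
    import requests

    subprocess.check_call(
        ["sox", "recording.wav", "-r", "16000", "-c", "1", "recording.flac"])

    with open("recording.flac", "rb") as f:
        flac = f.read()

    resp = requests.post(
        "https://www.google.com/speech-api/v2/recognize",
        params={"output": "json", "lang": "en-us", "key": "YOUR_API_KEY"},
        headers={"Content-Type": "audio/x-flac; rate=16000"},
        data=flac)
    print(resp.text)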

Android Continuous speech recognition returns ERROR_NO_MATCH too quickly

Submitted by 微笑、不失礼 on 2019-12-08 23:45:08
Question: I've tried to implement a continuous SpeechRecognition mechanism. When I start speech recognition, I get the following messages in logcat:
06-05 12:22:32.892 11753-11753/com.aaa.bbb D/SpeechManager: startSpeechRecognition:
06-05 12:22:33.022 11753-11753/com.aaa.bbb D/SpeechManager: onError: Error 7
06-05 12:22:33.352 11753-11753/com.aaa.bbb D/SpeechManager: onReadyForSpeech:
06-05 12:22:33.792 11753-11753/com.aaa.bbb D/SpeechManager: onBeginningOfSpeech: Beginning
06-05 12:22:34.492 11753-11753/com …

Android - Speech Recognition Limiting Listening Time

Submitted by 梦想的初衷 on 2019-12-08 22:11:16
Question: I am using the Google API for speech recognition but want to limit the listening time, for example to two seconds. After two seconds, even if the user continues speaking, the recognizer should stop listening. I tried some EXTRAs like
EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS
EXTRA_SPEECH_INPUT_MINIMUM_LENGTH_MILLIS
EXTRA_SPEECH_INPUT_POSSIBLY_COMPLETE_SILENCE_LENGTH_MILLIS
but they did not help me. My full code is here; if anyone can help me, I would appreciate it. public void promptSpeechInput() { / …
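On Android those EXTRAs are only hints to the recognizer service, which is free to ignore them. For comparison, a hard two-second cap is easy to express in desktop Python's SpeechRecognition library; this is a different stack from the Android intent API, shown only to illustrate the stop-after-N-seconds behaviour (pip install SpeechRecognition pyaudio):

    # Stop capturing after 2 s even if the user keeps talking (sketch).
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        audio = recognizer.listen(source, phrase_time_limit=2)  # hard cap
    try:
        print(recognizer.recognize_google(audio))
    except sr.UnknownValueError:
        print("could not understand audio")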

Python having trouble accessing a USB microphone using GStreamer to perform speech recognition with Pocketsphinx on a Raspberry Pi

Submitted by 不打扰是莪最后的温柔 on 2019-12-08 17:11:32
Question: Python is acting like it can't hear ANYTHING from my microphone at all. Here's the problem: I have a Python (2.7) script that is supposed to use GStreamer to access my microphone and do speech recognition for me via Pocketsphinx. I'm using PulseAudio and my device is a Raspberry Pi. My microphone is a PlayStation 3 Eye. Right off the bat: I have already gotten pocketsphinx_continuous to run correctly and recognize the words I have defined in my .dict and .lm files. The …
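For reference, the classic Python 2 / GStreamer 0.10 pipeline for this setup looks roughly like the sketch below (model paths are placeholders; the pocketsphinx and vader elements come from the gst-pocketsphinx plugin). When such a script hears nothing even though pocketsphinx_continuous works, a usual suspect is that pulsesrc is not reading from the PS3 Eye, e.g. because it is not the PulseAudio default source:

    # Minimal gst-0.10 + pocketsphinx listener (Python 2 era, sketch only).
    import pygst
    pygst.require('0.10')
    import gst
    import gobject

    def on_result(asr, text, uttid):
        print 'heard:', text

    pipeline = gst.parse_launch(
        'pulsesrc ! audioconvert ! audioresample '
        '! vader name=vad auto-threshold=true '
        '! pocketsphinx name=asr ! fakesink')
    asr = pipeline.get_by_name('asr')
    asr.connect('result', on_result)
    asr.set_property('lm', '/path/to/model.lm')      # placeholder paths
    asr.set_property('dict', '/path/to/model.dict')
    asr.set_property('configured', True)
    pipeline.set_state(gst.STATE_PLAYING)
    gobject.MainLoop().run()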

TargetInvocationException when using SemanticResultKey

Submitted by 99封情书 on 2019-12-08 16:55:25
Question: I want to build my grammar to accept multiple numbers. It has a bug when I repeat a number, for example saying 'twenty-one'. So I kept reducing my code to find the problem, and reached the following piece of code for the grammar builder:
string[] numberString = { "one" };
Choices numberChoices = new Choices();
for (int i = 0; i < numberString.Length; i++)
{
    numberChoices.Add(new SemanticResultValue(numberString[i], numberString[i]));
}
gb[1].Append(new SemanticResultKey("op1", (GrammarBuilder …