speech-recognition

Getting Different Results via Bing Speech Recognition API (beta) for the Same Audio File (.wav)

走远了吗 · submitted on 2019-12-06 10:39:24
We are transcribing a batch of audio files (i.e. .wav files) and are getting different results on separate systems. The only difference is how numbers are rendered: one system returns digits while the other spells them out as words, and we only need digits in the transcribed text. For example, for the wave file A-Hydrocort_50_mg-ml.wav: Transcribed text on System 1: "A hydra court 50 milligrams per milliliter." Transcribed text on System 2: "A hydra court fifty milligrams per milliliter." We are using the same API call; these are just two different machines, and the API itself gives us different responses even though the request is exactly the same.
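Since the excerpt says the request is identical on both machines, one pragmatic workaround is to normalize the transcript on the client so that spelled-out numbers always become digits. Below is a minimal Python sketch of that idea, assuming the third-party word2number package (not part of the Bing Speech API); it only handles simple single-token cardinals such as "fifty", so multi-word numbers like "twenty five" would need extra grouping logic.

```python
# Hypothetical post-processing step: convert spelled-out numbers in a
# transcript to digits so both systems end up with the same text.
# Assumes: pip install word2number (third-party package, not part of the API).
from word2number import w2n

def normalize_numbers(transcript: str) -> str:
    out = []
    for token in transcript.split():
        try:
            # Single number words ("fifty", "ten") parse to an int.
            out.append(str(w2n.word_to_num(token)))
        except ValueError:
            # Anything that is not a number word is kept unchanged.
            out.append(token)
    return " ".join(out)

print(normalize_numbers("A hydra court fifty milligrams per milliliter"))
# -> A hydra court 50 milligrams per milliliter
```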

SpeechRecognizer does not work, COMException: Class not registered / UWP App on Windows IoT (10.0.10586) and Visual Studio 2015 Update 1

筅森魡賤 · submitted on 2019-12-06 10:37:24
After installing Windows IoT (10.0.10586) and Visual Studio 2015 with Update 1, I get a COMException when I use the SpeechRecognizer in a Universal App on my Raspberry Pi 2 (running Windows IoT 10.0.10586). If I run the SpeechRecognizer UWP App on Windows 10, it works without any problems; the COMException occurs only on Windows IoT (10.0.10586). With an older version of Windows IoT and Visual Studio 2015 without Update 1 it works, too. Does anyone have a solution for this problem? var speechRecognizer = new SpeechRecognizer(); var constraint = new SpeechRecognitionTopicConstraint

How to split speech into words

吃可爱长大的小学妹 · submitted on 2019-12-06 10:06:28
Question: I'm playing with speech recognition. Is it possible to split speech into multiple words? If it is possible, please recommend a library that supports splitting speech into words. Thanks. Answer 1: If you know what the speaker has said, you can perform forced alignment to generate the word (or phoneme) time alignments. Toolkits such as CMU Sphinx, HTK and Kaldi can perform this. If you don't know what the speaker has said, you can just perform standard speech recognition and use the time information to obtain the word
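For the "standard recognition plus time information" route the answer mentions, here is a minimal Python sketch using the pocketsphinx package, assuming a recent 5.x release with its bundled US-English model; the file name speech.wav is a placeholder for a 16 kHz, mono, 16-bit WAV.

```python
# Sketch: decode a WAV file with pocketsphinx and print per-word timings.
# Assumes: pip install pocketsphinx (5.x) and a 16 kHz, mono, 16-bit WAV.
import wave
from pocketsphinx import Decoder

decoder = Decoder(samprate=16000)           # default acoustic and language model

with wave.open("speech.wav", "rb") as wav:  # placeholder file name
    audio = wav.readframes(wav.getnframes())

decoder.start_utt()
decoder.process_raw(audio, full_utt=True)
decoder.end_utt()

# Segments carry start/end frames; the default frame rate is 100 frames/s.
for seg in decoder.seg():
    print(f"{seg.word}\t{seg.start_frame / 100:.2f}s - {seg.end_frame / 100:.2f}s")
```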

How to determine the position of recognized words with SpeechRecognitionEngine?

佐手、 · submitted on 2019-12-06 09:10:46
Question: I am exploring the SpeechRecognitionEngine's capabilities. My end goal is to input a WAV file and a transcription of that WAV file, and to output the positions in the WAV file of the beginning (and ideally the end) of each word. I can get the engine to recognize the phrase successfully, but I cannot work out how to retrieve the audio position at which each word starts, as opposed to when the recognition was hypothesized or recognized, etc. If you're curious what the point of this is, it is in
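The goal described (a WAV file plus its transcription in, word start/end positions out) is exactly forced alignment. As a hedged illustration of the idea only, and using a different engine than the System.Speech API in the question, here is a sketch with the pocketsphinx Python package (5.x), whose decoder can be given a known transcript to align; speech.wav and the transcript are placeholders, and every word in the transcript must exist in the pronouncing dictionary.

```python
# Sketch of forced alignment (not the System.Speech API from the question):
# give pocketsphinx the known transcript, then read back per-word frame ranges.
# Assumes: pip install pocketsphinx (5.x), a 16 kHz mono WAV, in-dictionary words.
import wave
from pocketsphinx import Decoder

decoder = Decoder(samprate=16000)
decoder.set_align_text("hello world")       # placeholder transcript

with wave.open("speech.wav", "rb") as wav:  # placeholder file name
    audio = wav.readframes(wav.getnframes())

decoder.start_utt()
decoder.process_raw(audio, full_utt=True)
decoder.end_utt()

for seg in decoder.seg():                   # one segment per aligned word
    print(seg.word, seg.start_frame / 100, seg.end_frame / 100)  # seconds
```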

How to pass a language to speech recognition in Android apps?

拈花ヽ惹草 · submitted on 2019-12-06 09:01:58
I've been working with the speech recognition API on Android and found that the speech results vary a lot when the language settings are changed. Is there a way to set the language programmatically? Or is there an intent to launch the speech language settings screen? Or something else? Note: I tried this intent extra: intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_PREFERENCE, "en-US"); and Intent detailsIntent = new Intent(RecognizerIntent.ACTION_GET_LANGUAGE_DETAILS); sendOrderedBroadcast(detailsIntent, null, new LanguageDetailsChecker(), null, Activity.RESULT_OK, null, null); Yes hanifs, that

NIST SPHERE format files

感情迁移 · submitted on 2019-12-06 08:37:31
In order to read NIST SPHERE format files, I'm trying to install the NIST SPHERE software downloaded from here, but I encountered some errors: make[2]: Entering directory `/home/ibtissem/tools/nist/src/bin' gcc -I/home/ibtissem/tools/nist/include -L/home/ibtissem/tools/nist/lib -g -g -DNARCH_linux h_add.c -lm -o h_add h_add.c:31: error: undefined reference to 'sp_verbose' h_add.c:31: error: undefined reference to 'sp_verbose' h_add.c:28: error: undefined reference to 'hs_getopt' h_add.c:42: error: undefined reference to 'sp_verbose' h_add.c:42: error: undefined reference to 'sp_get_version' h_add

How can we convert a .wav file to text using pocketsphinx?

此生再无相见时 · submitted on 2019-12-06 07:49:45
Question: I installed pocketsphinx on my Linux machine correctly, and now I want to convert an audio file (.wav) to text using pocketsphinx. How can I do that? Is there a clear, short command for it? Something like this command: ./src/programs/pocketsphinx_continuous -samprate 8000 -nfft 2048 -adcdev hw:1,0 -lm 2530.lm -dict 2530.dic myvoice.wav And how can I do the same from Python? Thanks in advance. Answer 1: I found my answer: pocketsphinx version 0.8 has an option that can
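For the "from Python" part of the question, here is a minimal sketch using the pocketsphinx Python package's AudioFile helper with its bundled US-English model; myvoice.wav is the file from the question, and the custom 8 kHz models from the command line (2530.lm, 2530.dic) are omitted here, though they can be supplied through the decoder configuration (the exact keyword names vary between package versions).

```python
# Sketch: transcribe a WAV file to text with pocketsphinx from Python.
# Assumes: pip install pocketsphinx, and a 16 kHz mono WAV when using the
# bundled default model (the question's own models are 8 kHz and custom).
from pocketsphinx import AudioFile

for utterance in AudioFile(audio_file="myvoice.wav"):
    print(utterance)  # str(utterance) is the recognized text for one utterance
```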

Raspberry Pi Asynchronous/Continuous Speech Recognition in Python

梦想的初衷 · submitted on 2019-12-06 07:25:34
I want to create a speech recognition script for the Raspberry Pi in Python and need an asynchronous/continuous speech recognition library. Asynchronous here means that recognition runs endlessly, without any keyboard input, until the spoken phrase matches one of an array of words; it should then print what was spoken to the terminal and restart recognition. I already had a look at PocketSphinx, but after a few hours of Googling I didn't find anything about asynchronous recognition with it. Do you know of any library that is capable of that? You can use Pocketsphinx on the Raspberry Pi. You need to
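A sketch of one way to get the "loop forever until a word from a list is heard" behaviour with the pocketsphinx Python package's LiveSpeech helper in keyword-spotting mode; the keyphrase and threshold below are example values, a working default microphone is assumed, and a keyword-list file can be used instead if several phrases must be watched at once.

```python
# Sketch: continuous, keyboard-free keyword spotting with pocketsphinx.
# Assumes: pip install pocketsphinx, and a working default microphone/ALSA setup.
from pocketsphinx import LiveSpeech

speech = LiveSpeech(
    lm=False,               # no full language model: keyword search only
    keyphrase="hello pi",   # example phrase; a keyword-list file can hold several
    kws_threshold=1e-20,    # lower value = more sensitive, more false alarms
)

# Blocks on the microphone and yields one result per detection, forever.
for phrase in speech:
    print("heard:", phrase)
```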

Start speech recognizer on Android using Phonegap

半世苍凉 · submitted on 2019-12-06 07:12:26
Question: I'm currently making a Phonegap application. I want to combine augmented reality and speech input. There is a plugin for Phonegap called SpeechRecognizer, but I can't get it to work. My header: <script type="text/javascript" src="cordova-2.6.0.js"></script> <script type="text/javascript" src="SpeechRecognizer.js"></script> <script type="text/javascript" charset="utf-8"> document.addEventListener("deviceready", onDeviceReady, false); function speechOk() { alert('speech works'); } function

Microsoft Speech Recognition Custom Training

笑着哭i · submitted on 2019-12-06 05:50:22
Question: I have been wanting to create an application using Microsoft Speech Recognition. My application's users are expected to often say abbreviations, such as 'LHC' for 'Large Hadron Collider', or 'CERN'. Given that exact order, my application returns: You said: At age C. You said: Cern. While it did work for 'CERN', it failed very badly for 'LHC'. However, if I could make my own custom training files, I could easily place the term 'LHC' somewhere in there. Then, I could make the user
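The excerpt cuts off before a solution, but one related approach, shown here only as a sketch and using a different product than the desktop recognizer in the question, is the Azure Speech SDK's phrase list, which biases recognition toward custom terms such as 'LHC'; the subscription key, region, and file name below are placeholders.

```python
# Sketch: bias cloud speech recognition toward custom terms with a phrase list.
# This uses the Azure Speech SDK (pip install azure-cognitiveservices-speech),
# not the desktop Microsoft Speech Recognition engine from the question.
import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(subscription="YOUR_KEY", region="YOUR_REGION")
audio_config = speechsdk.audio.AudioConfig(filename="lhc_question.wav")  # placeholder
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config,
                                        audio_config=audio_config)

# Add the abbreviations the users are expected to say.
phrase_list = speechsdk.PhraseListGrammar.from_recognizer(recognizer)
phrase_list.addPhrase("LHC")
phrase_list.addPhrase("CERN")

result = recognizer.recognize_once()
print("You said:", result.text)
```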