speech-recognition

How to add a custom lexicon to my C# project

Submitted by ぐ巨炮叔叔 on 2019-12-08 05:46:09
Question: I'm developing a C# project based on voice recognition. I want to recognize words spoken with an Indian English accent, so I thought of using a lexicon and adding pronunciations to it, but I don't understand how to create a lexicon or how to add one to my project.

Answer 1: Lexicons aren't exposed via System.Speech.Recognition, unfortunately. You can access lexicons using the SpeechLib automation interface to SAPI, though; the object you want to create is SpLexicon. Note that System.Speech

Got an error during UBM speaker-adaptation with sidekit

Submitted by 不想你离开。 on 2019-12-08 05:15:46
Question: I have already trained a UBM model, and now I'm trying to implement speaker adaptation when I get the following error: Exception: show enroll/something.wav is not in the HDF5 file. I have two directories, "enroll" and "test", under the directory "feat", which contain the features (.h5) for training and testing respectively, and my enroll_idmap is generated from the audio files (.wav) used only for training. My wav files and feature files are kept separately. I think I have a problem with my idmap. "enroll/something.wav" is the rightid of
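An error of this shape usually means the ids stored in the idmap don't literally match the keys under which the features were saved. A minimal sketch in plain Python (no sidekit; the names "enroll/alice" etc. are hypothetical examples) of checking the rightids against the feature keys, and of one common fix, stripping a ".wav" extension that the feature writer did not keep:

```python
# Minimal sketch: compare idmap rightids against the keys actually present
# in the feature store. All ids below are hypothetical examples.

def missing_ids(rightids, feature_keys):
    """Return the rightids that have no matching key in the feature file."""
    keys = set(feature_keys)
    return [rid for rid in rightids if rid not in keys]

def strip_wav(rightids):
    """Drop a trailing '.wav' so ids match features saved without it."""
    return [rid[:-4] if rid.endswith(".wav") else rid for rid in rightids]

# Simulated contents: the idmap kept '.wav', the feature file did not.
rightids = ["enroll/alice.wav", "enroll/bob.wav"]
feature_keys = ["enroll/alice", "enroll/bob"]

print(missing_ids(rightids, feature_keys))             # every id reported missing
print(missing_ids(strip_wav(rightids), feature_keys))  # [] after normalizing
```

Whether the extension (or an "enroll/" prefix) is the actual mismatch depends on how the features were extracted; the point of the sketch is to diff the two id sets before blaming the adaptation code.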

I can't get the Kinect SDK to do speech recognition and track skeletal data at the same time

Submitted by 让人想犯罪 __ on 2019-12-08 05:13:19
Question: I have a program in which I enabled speech recognition with:

RecognizerInfo ri = GetKinectRecognizer();
speechRecognitionEngine = new SpeechRecognitionEngine(ri.Id);

// Create a grammar from a grammar definition XML file.
using (var memoryStream = new MemoryStream(Encoding.ASCII.GetBytes(fileContent)))
{
    var g = new Grammar(memoryStream);
    speechRecognitionEngine.LoadGrammar(g);
}

speechRecognitionEngine.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(speechEngine

How do you efficiently create a grammar file for speech recognition given a large list of words?

Submitted by 社会主义新天地 on 2019-12-08 05:07:48
Question: It's easy to write a grammar file for speech recognition from only 50 words, because you can just do it manually. What is the easiest, most efficient way to do it when you have 10,000 or 100,000 words? Example: say we have "RC cola" and "Pepsi cola". We would have a grammar file consisting of two rules:

DRINK: (COLANAME ?[coke cola soda])
COLANAME: [rc pepsi]

It will recognize "RC", "RC Coke", "RC Cola", "RC Soda", "Pepsi", "Pepsi Coke", "Pepsi Cola" and "Pepsi Soda". Edit: I'm talking about grammar
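For word lists this large, the usual approach is to generate the grammar file from the list rather than write it by hand. A minimal sketch that emits a JSGF grammar (one common grammar format; the grammar and rule names here are made up, and the template would need adapting to whatever format the target recognizer actually consumes):

```python
# Sketch: generate a JSGF grammar whose public rule matches any word
# from a (potentially huge) word list. Names are hypothetical.

def build_jsgf(name, rule_words, public_rule="words"):
    """Build a JSGF grammar string from a word list."""
    # De-duplicate and sort so the output is deterministic.
    alternatives = " | ".join(sorted(set(rule_words)))
    return (
        f"#JSGF V1.0;\n"
        f"grammar {name};\n"
        f"public <{public_rule}> = {alternatives};\n"
    )

colas = ["rc", "pepsi"]
print(build_jsgf("drinks", colas))
```

The same template scales to 100,000 words: read the list from a file, join it once, and write the result out. Multi-rule grammars like the DRINK/COLANAME example are just more join-and-format steps over structured input.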

How to get alternate single words during dictation in SAPI 5.4 using C#?

Submitted by 浪子不回头ぞ on 2019-12-08 04:59:01
Question: I am running a user study on speech recognition and new technologies. During the laboratory tests, I need to display all the dictated text using an interface that I programmed. Currently I can get alternates for whole sentences in C#, but I need to get alternates for single words. For example, if someone says "Hello, my name is Andrew", I want an alternate word for each of "Hello", "my", "name", "is" and "Andrew", instead of an alternate for the complete sentence. Here is a code snippet of the handler I
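One engine-agnostic fallback (this is not the SAPI API itself, just a post-processing idea) is to take the sentence-level alternates the engine already returns, split each one into words, and collect the distinct candidates at each word position. A sketch, which only works cleanly when the alternates line up word for word:

```python
# Sketch: derive per-word alternates from whole-sentence alternates by
# aligning them position by position. The example phrases are made up.
from itertools import zip_longest

def word_alternates(phrases):
    """Per-position candidate words across a list of alternate phrases."""
    split = [p.split() for p in phrases]
    columns = zip_longest(*split, fillvalue=None)
    return [sorted({w for w in col if w is not None}) for col in columns]

alternates = ["hello my name is andrew", "hollow my name is andre"]
print(word_alternates(alternates))
# [['hello', 'hollow'], ['my'], ['name'], ['is'], ['andre', 'andrew']]
```

When alternates differ in length, positional alignment breaks down and a proper edit-distance alignment would be needed; within SAPI itself, per-word information would come from the recognition result's word-level data rather than from a trick like this.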

Sphinx4 ConfidenceResult and SpeechResult

Submitted by 余生颓废 on 2019-12-08 04:54:58
Question: I'm trying to get the confidence score of a SpeechResult by doing ConfidenceResult cr = scorer.score(result); where result is a SpeechResult and scorer is a ConfidenceScorer. As it turns out, this isn't allowed. Is there some way around this that I'm not seeing, besides using a Result type?

Answer 1: Yes, you can do this, although it's a little bit roundabout. A confidence result is actually a Sausage (no, not kidding, that's what it's called: SphinxDocs:Sausage). Although it's also known as a

Android: what speech recognition technologies are available?

Submitted by ぃ、小莉子 on 2019-12-08 04:26:01
Question: I am new to the area of voice recognition on Android. I have a requirement in my app for speech recognition, so I am doing my homework. I found that: 1. The Android SDK has support for this, and it uses Google voice recognition. So, from what I understand, whether we invoke the recognizer with an intent or use the SpeechRecognizer class, the actual recognition is done on Google's cloud servers. I tried sample apps using both methods, and the matching rate in both cases is very low (

Continuously recognize everything being said on Android?

Submitted by 老子叫甜甜 on 2019-12-08 04:15:52
Question: I'm working on a project that involves speech recognition on Android, and I have some questions without clear answers on this site (or any, actually). I need to do something like speech-to-text; the problem is that I need it to work continuously. Imagine an app running in the background and writing everything it hears to a txt file. I know I will need to correct a lot of misheard noise, but that will come later. I am using pocketsphinx-android, and tried to follow this tutorial: http://cmusphinx

SAPI speech recognition in Delphi

Submitted by 大城市里の小女人 on 2019-12-08 04:13:11
Question: I need to create a programmatic equivalent using the Delphi language... or could someone post a link on how to do grammars in speech recognition using Delphi, or any examples of an XML grammar that has a programmatic equivalent in Delphi. Sorry for my English.

Programmatic equivalent (ref: http://msdn.microsoft.com/en-us/library/ms723634(v=VS.85).aspx):

SPSTATEHANDLE hsHelloWorld;
hr = cpRecoGrammar->GetRule(L"HelloWorld", NULL,
                            SPRAF_TopLevel | SPRAF_Active,
                            TRUE, &hsHelloWorld);
hr =

How to use tf.nn.ctc_loss in a CNN+CTC network

Submitted by ↘锁芯ラ on 2019-12-08 03:58:28
Question: Recently I tried to use TensorFlow to implement a CNN+CTC network based on the article "Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks". I feed batches of spectrogram data (shape (10,120,155,3); batch_size is 10) into 10 convolution layers and 3 fully connected layers, so the output before the CTC layer is 2-D data (shape (10,1024)). Here is my problem: I want to use the tf.nn.ctc_loss function from the TensorFlow library, but it generates the ValueError: Dimension must
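A dimension error here is consistent with the logits lacking a time axis: tf.nn.ctc_loss expects logits shaped [max_time, batch_size, num_classes] (in its default time-major form), while a flat fully connected output of shape (10, 1024) has no time dimension at all. A minimal NumPy sketch of the reshape that restores one; the split 1024 = 64 steps x 16 classes is purely hypothetical, since the real factors must come from the network architecture:

```python
# Sketch: a flat (batch, features) tensor cannot feed CTC loss, which
# needs a per-timestep class distribution. Restore a time axis, then
# move it first for a time-major [max_time, batch, num_classes] layout.
# The 64 x 16 factorization of 1024 is a made-up example.
import numpy as np

batch_size, flat = 10, 1024
time_steps, num_classes = 64, 16  # hypothetical: 64 * 16 == 1024

logits = np.zeros((batch_size, flat), dtype=np.float32)

# (batch, flat) -> (batch, time, classes) -> (time, batch, classes)
logits = logits.reshape(batch_size, time_steps, num_classes)
logits = np.transpose(logits, (1, 0, 2))

print(logits.shape)  # (64, 10, 16)
```

In practice the cleaner fix for this architecture is to keep the time axis all the way through, i.e. apply the fully connected layers per frame instead of flattening the whole spectrogram, so each timestep gets its own num_classes logits.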