speech-recognition

Chrome iOS webkit speech-recognition

Submitted by 旧巷老猫 on 2019-12-08 16:03:39
Question: I'm trying to implement speech recognition in Chrome on the iPad, without any luck. To cut to the chase and remove any dependency on my own implementation of the webkitSpeechRecognition API: Glenn Shire's excellent sample code does not run on Chrome v27 on an iPad 1 running iOS 5.1.1, or on Chrome v31 on an iPad 3 running iOS 7.0.4, at least as far as I can tell. It fails at this line: if (!('webkitSpeechRecognition' in window)) { r.onState('upgrade'); return; } I can't figure out a workaround,
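The failing check above can be made a bit more defensive by probing for both the prefixed and unprefixed constructors before falling back to an "upgrade" message. A minimal sketch, assuming nothing beyond the question's feature test (the `detectSpeechRecognition` helper and the mock objects are illustrative, not part of the original code):

```javascript
// Return the SpeechRecognition constructor if the environment provides
// one (prefixed or unprefixed), or null so callers can degrade gracefully
// instead of crashing. `win` would normally be the browser's `window`.
function detectSpeechRecognition(win) {
  if (!win) return null;
  return win.SpeechRecognition || win.webkitSpeechRecognition || null;
}

// Chrome on iOS (at the time of the question) exposes neither constructor,
// so detection returns null and the app should show its fallback UI.
const iosChromeLike = {};                                      // no speech API
const desktopChromeLike = { webkitSpeechRecognition: function () {} };

console.log(detectSpeechRecognition(iosChromeLike));           // null
console.log(typeof detectSpeechRecognition(desktopChromeLike)); // "function"
```

This matters on iOS because every browser there (including Chrome) wraps Apple's WebKit, so the constructor simply isn't present regardless of the Chrome version.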

Sphinx4 ConfidenceResult and SpeechResult

Submitted by 浪子不回头ぞ on 2019-12-08 13:14:24
I'm trying to get the confidence score of a SpeechResult by doing ConfidenceResult cr = scorer.score(result); where result is a SpeechResult and scorer is a ConfidenceScorer. As it turns out this isn't allowed. Is there some way around this that I'm not seeing, besides using a Result type? Yes, you can do this, although it's a little bit roundabout. A confidence result is actually a Sausage (no, not kidding, that's what it's called: SphinxDocs:Sausage). Although it's also known as a Word Confusion Network, it's sometimes referred to as a sausage because of what the graph looks like. See Fig

How to register .dlm and .ngr files generated in Dictation Resource Kit in Windows 7?

Submitted by 三世轮回 on 2019-12-08 12:26:47
Question: I am using the Windows Dictation Resource Kit and I have generated .dlm and .ngr files for a medical model. How do I register these dictation topics in Windows 7? I would also like to know whether there is a way to load them directly in the program. Answer 1: You need to register the topics under the engine GUID key. For US English, the key is HKLM\SOFTWARE\Microsoft\Speech\Recognizers\Tokens\MS-1033-80-DESK\Models\1033\L1033\LMs\AddOn Create a REG_SZ key whose name is the dictation topic name, and
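The answer is cut off above, so exactly what the REG_SZ value should contain is not stated here. Assuming the value points at the generated language-model file, a hypothetical .reg sketch might look like the following (the topic name MedicalTopic and the file path are made up for illustration, not taken from the original answer):

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Recognizers\Tokens\MS-1033-80-DESK\Models\1033\L1033\LMs\AddOn]
"MedicalTopic"="C:\\Models\\Medical.dlm"
```

Note that backslashes in .reg string data must be doubled, and editing HKLM requires administrator rights.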

mp3 recognition using Sphinx 4

Submitted by 前提是你 on 2019-12-08 12:13:23
Question: Can we use mp3 files for the voice recognition process without using wav files? Or can we generate a wav file from an mp3 and then do the voice recognition without a serious impact on the accuracy? The problem is that I need to minimize the load transferred through the network in my application. Will the information lost in the conversion be a huge factor for accuracy? Answer 1: Can we use mp3 files for the voice recognition process without using wav files? Not directly. To be able to recognize

Voice Recognition Commands Android

Submitted by 三世轮回 on 2019-12-08 09:18:09
Question: I've searched far and wide for a solution to removing Google's voice recognition UI dialog when a user wants to perform a voice command, but have been unable to find one. I am trying to implement an app that displays a menu to the user; the user can either tap the options or say them out loud, which opens the new pages. So far I've been unable to implement this unless I use Google's RecognizerIntent, but I don't want the dialog box to

Audio descriptor MFCC in C#

Submitted by 99封情书 on 2019-12-08 09:07:15
Question: I'm doing primitive speech recognition and need a simple descriptor for my audio signals. So far I only have the FFT of my audio signal, but I don't know what to do after that. When I tried using Hidden Markov Models with only the FFT of my training signals, it gave me wrong answers. Could you tell me about any C# libraries that would help me convert my FFT output to MFCCs (Mel Frequency Cepstral Coefficients)? Answer 1: I don't know of such libraries for C#, but I can show you my implementation of extracting 20
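The question's project is in C#, but the step missing between the FFT and the MFCCs, mapping bin frequencies onto the mel scale before applying the triangular filterbank and the DCT, is language-independent. A sketch in JavaScript of the common HTK-style Hz/mel conversion (the function names and the 20-filter, 0–8000 Hz setup are illustrative choices, not the answer's actual code):

```javascript
// HTK-style mel scale conversion used when building the triangular
// filterbank for MFCC extraction.
function hzToMel(hz) {
  return 2595 * Math.log10(1 + hz / 700);
}

function melToHz(mel) {
  return 700 * (Math.pow(10, mel / 2595) - 1);
}

// Filterbank center frequencies are evenly spaced on the mel scale,
// then mapped back to Hz; e.g. 20 filters spanning 0-8000 Hz, which
// suits 16 kHz audio.
function filterbankCenters(numFilters, lowHz, highHz) {
  const lowMel = hzToMel(lowHz);
  const highMel = hzToMel(highHz);
  const centers = [];
  for (let i = 1; i <= numFilters; i++) {
    const mel = lowMel + (i * (highMel - lowMel)) / (numFilters + 1);
    centers.push(melToHz(mel));
  }
  return centers;
}
```

Centers computed this way are dense at low frequencies and sparse at high ones, which is why feeding raw FFT magnitudes straight into an HMM performs so much worse than MFCCs.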

Android Speech to Text API Google - notification

Submitted by 爷,独闯天下 on 2019-12-08 08:48:54
Question: I followed this tutorial: https://jbinformatique.com/2018/02/16/android-speech-to-text-api-google-tutoriel/ It works nicely! It uses the android.speech.RecognizerIntent package; it's free and it works without Internet, as mentioned here: Difference between Android Speech to Text API (Recognizer Intent) and Google Cloud Speech API? However, when I start the speech recognition, I get the following notification. If I translate it (as best I can), it says: "Your audio records will be sent to Google and used for

Microsoft Speech Recognition - numbers only

Submitted by 和自甴很熟 on 2019-12-08 08:35:29
Question: Is there a way to limit the grammar to numbers only, either in dictation mode or by constructing a custom grammar XML file? Obviously I can't enter all the numbers into the XML, but there has to be an easy way. Answer 1: I know you asked this a long time ago, but I have a solution in case you still need it. Here is the file I came up with. This requires the user to speak single digits only, such as one five seven (not one fifty-seven, which will not work). You can play around with this to suit your
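The answer's actual grammar file is cut off above. As an illustration of the general shape such a file takes, a minimal SRGS XML grammar restricted to spoken single digits might look like this (the rule id and the repeat count are assumptions for the sketch, not the original file):

```xml
<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="en-US" root="digits"
         xmlns="http://www.w3.org/2001/06/grammar">
  <rule id="digits" scope="public">
    <!-- One to ten single digits spoken in sequence, e.g. "one five seven" -->
    <item repeat="1-10">
      <one-of>
        <item>zero</item>
        <item>one</item>
        <item>two</item>
        <item>three</item>
        <item>four</item>
        <item>five</item>
        <item>six</item>
        <item>seven</item>
        <item>eight</item>
        <item>nine</item>
      </one-of>
    </item>
  </rule>
</grammar>
```

Because the recognizer only ever matches tokens listed in the grammar, "fifty-seven" is simply out of vocabulary here, which is exactly the single-digit restriction the answer describes.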

Using getUserMedia AND webkitSpeechRecognition allow access x 2

Submitted by 懵懂的女人 on 2019-12-08 08:28:25
Question: I am creating a site that uses two types of audio input: getUserMedia and webkitSpeechRecognition. Both functions are working fine, but Chrome is popping up its access security pop-up twice - which makes sense. Does anyone know how to have one access permission handle both functions? Cheers SO! navigator.getUserMedia({audio:true}, gotStream, function(e) { alert('Error getting audio'); console.log(e); }); and... var recognition = new webkitSpeechRecognition(); Answer 1: The only way to avoid

Error with my annyang program

Submitted by ∥☆過路亽.° on 2019-12-08 08:26:56
Question: I am trying to implement this annyang program (the original snippet was cut off mid-comment; the closing lines below follow annyang's standard example):

<script src="//cdnjs.cloudflare.com/ajax/libs/annyang/1.1.0/annyang.min.js"></script>
<script>
if (annyang) {
  // Let's define our first command. First the text we expect, and then the function it should call
  var commands = {
    'show tps report': function() {
      $('#tpsreport').animate({bottom: '-100px'});
    }
  };
  // Add our commands to annyang
  annyang.addCommands(commands);
  // Start listening. You can call this here, or attach this call to an event, button, etc.
  annyang.start();
}
</script>