speech-recognition

Chrome iOS webkit speech-recognition

Submitted by 旧巷老猫 on 2019-12-08 16:03:39
Question: I'm trying to implement speech recognition in Chrome on the iPad, without any luck. To cut to the chase and remove any dependency on my own implementation of the webkitSpeechRecognition API: Glenn Shire's excellent sample code does not run on Chrome v27 on an iPad 1 running iOS 5.1.1, or on Chrome v31 on an iPad 3 running iOS 7.0.4, at least as far as I can tell. It fails at this line: if (!('webkitSpeechRecognition' in window)) { r.onState('upgrade'); return; } I can't figure out a workaround,
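The failing check above can be made a bit more defensive by probing for both the prefixed and unprefixed constructors before falling back to an "upgrade" message. A minimal sketch, assuming nothing beyond the question's feature test (the `detectSpeechRecognition` helper and the mock objects are illustrative, not part of the original code):

```javascript
// Return the SpeechRecognition constructor if the environment provides
// one (prefixed or unprefixed), or null so callers can degrade gracefully
// instead of crashing. `win` would normally be the browser's `window`.
function detectSpeechRecognition(win) {
  if (!win) return null;
  return win.SpeechRecognition || win.webkitSpeechRecognition || null;
}

// Chrome on iOS (at the time of the question) exposes neither constructor,
// so detection returns null and the app should show its fallback UI.
const iosChromeLike = {};                                      // no speech API
const desktopChromeLike = { webkitSpeechRecognition: function () {} };

console.log(detectSpeechRecognition(iosChromeLike));           // null
console.log(typeof detectSpeechRecognition(desktopChromeLike)); // "function"
```

This matters on iOS because every browser there (including Chrome) wraps Apple's WebKit, so the constructor simply isn't present regardless of the Chrome version.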

Sphinx4 ConfidenceResult and SpeechResult

Submitted by 浪子不回头ぞ on 2019-12-08 13:14:24
I'm trying to get the confidence score of a SpeechResult by doing ConfidenceResult cr = scorer.score(result); where result is a SpeechResult and scorer is a ConfidenceScorer. As it turns out this isn't allowed. Is there some way around this that I'm not seeing, besides using a Result type? Yes, you can do this, although it's a little bit roundabout. A confidence result is actually a Sausage (no, not kidding, that's what it's called: SphinxDocs:Sausage). Although it's also known as a Word Confusion Network, it's sometimes referred to as a sausage because of what the graph looks like. See Fig

How to register .dlm and .ngr files generated in Dictation Resource Kit in Windows 7?

Submitted by 三世轮回 on 2019-12-08 12:26:47
Question: I am using the Windows Dictation Resource Kit and I have generated .dlm and .ngr files for a medical model. How do I register these dictation topics in Windows 7? I would also like to know whether there is a way to load them directly in the program. Answer 1: You need to register the topics under the engine GUID key. For US English, the key is HKLM\SOFTWARE\Microsoft\Speech\Recognizers\Tokens\MS-1033-80-DESK\Models\1033\L1033\LMs\AddOn Create a REG_SZ key whose name is the dictation topic name, and
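The answer is cut off above, so exactly what the REG_SZ value should contain is not stated here. Assuming the value points at the generated language-model file, a hypothetical .reg sketch might look like the following (the topic name MedicalTopic and the file path are made up for illustration, not taken from the original answer):

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Recognizers\Tokens\MS-1033-80-DESK\Models\1033\L1033\LMs\AddOn]
"MedicalTopic"="C:\\Models\\Medical.dlm"
```

Note that backslashes in .reg string data must be doubled, and editing HKLM requires administrator rights.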

mp3 recognition using Sphinx 4

Submitted by 前提是你 on 2019-12-08 12:13:23
Question: Can we use mp3 files for the voice recognition process without using wav files? Or can we generate a wav file from an mp3 and then do the voice recognition without a serious impact on the accuracy? The problem is that I need to minimize the load transferred through the network in my application. Will the information lost in the conversion be a huge factor for accuracy? Answer 1: Can we use mp3 files for the voice recognition process without using wav files? Not directly. To be able to recognize

Voice Recognition Commands Android

Submitted by 三世轮回 on 2019-12-08 09:18:09
Question: I've searched far and wide for a solution to removing Google's voice recognition UI dialog when a user wants to perform a voice command, but have been unable to find one. I am trying to implement an app that displays a menu to the user; the user can either tap the options or say them out loud, which opens the new pages. So far I've been unable to implement this unless I use Google's RecognizerIntent, but I don't want the dialog box to

Audio descriptor MFCC in C#

Submitted by 99封情书 on 2019-12-08 09:07:15
Question: I'm doing primitive speech recognition and need a simple descriptor for my audio signals. So far I only have the FFT of my audio signal, but I don't know what to do after that. When I tried using Hidden Markov Models with only the FFT of my training signals, it gave me wrong answers. Could you tell me about any C# libraries that would help me convert my FFT output to MFCCs (Mel Frequency Cepstral Coefficients)? Answer 1: I don't know of such libraries for C#, but I can show you my implementation of extracting 20
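The question's project is in C#, but the step missing between the FFT and the MFCCs, mapping bin frequencies onto the mel scale before applying the triangular filterbank and the DCT, is language-independent. A sketch in JavaScript of the common HTK-style Hz/mel conversion (the function names and the 20-filter, 0–8000 Hz setup are illustrative choices, not the answer's actual code):

```javascript
// HTK-style mel scale conversion used when building the triangular
// filterbank for MFCC extraction.
function hzToMel(hz) {
  return 2595 * Math.log10(1 + hz / 700);
}

function melToHz(mel) {
  return 700 * (Math.pow(10, mel / 2595) - 1);
}

// Filterbank center frequencies are evenly spaced on the mel scale,
// then mapped back to Hz; e.g. 20 filters spanning 0-8000 Hz, which
// suits 16 kHz audio.
function filterbankCenters(numFilters, lowHz, highHz) {
  const lowMel = hzToMel(lowHz);
  const highMel = hzToMel(highHz);
  const centers = [];
  for (let i = 1; i <= numFilters; i++) {
    const mel = lowMel + (i * (highMel - lowMel)) / (numFilters + 1);
    centers.push(melToHz(mel));
  }
  return centers;
}
```

Centers computed this way are dense at low frequencies and sparse at high ones, which is why feeding raw FFT magnitudes straight into an HMM performs so much worse than MFCCs.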

Android Speech to Text API Google - notification

Submitted by 爷,独闯天下 on 2019-12-08 08:48:54
Question: I followed this tutorial: https://jbinformatique.com/2018/02/16/android-speech-to-text-api-google-tutoriel/ It works nicely! It uses the android.speech.RecognizerIntent package; it's free and it works without Internet, as mentioned here: Difference between Android Speech to Text API (Recognizer Intent) and Google Cloud Speech API? However, when I start the speech recognition, I get the following notification. If I translate it (as best I can), it says: "Your audio records will be sent to Google and used for

Microsoft Speech Recognition - numbers only

Submitted by 和自甴很熟 on 2019-12-08 08:35:29
Question: Is there a way to limit the grammar to numbers only, either in dictation mode or by constructing a custom grammar XML file? Obviously I can't enter all the numbers into the XML, but there has to be an easy way. Answer 1: I know you asked this a long time ago, but I have a solution in case you still need it. Here is the file I came up with. This requires the user to speak single digits only, such as one five seven (not one fifty-seven, which will not work). You can play around with this to suit your
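The answer's actual grammar file is cut off above. As an illustration of the general shape such a file takes, a minimal SRGS XML grammar restricted to spoken single digits might look like this (the rule id and the repeat count are assumptions for the sketch, not the original file):

```xml
<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="en-US" root="digits"
         xmlns="http://www.w3.org/2001/06/grammar">
  <rule id="digits" scope="public">
    <!-- One to ten single digits spoken in sequence, e.g. "one five seven" -->
    <item repeat="1-10">
      <one-of>
        <item>zero</item>
        <item>one</item>
        <item>two</item>
        <item>three</item>
        <item>four</item>
        <item>five</item>
        <item>six</item>
        <item>seven</item>
        <item>eight</item>
        <item>nine</item>
      </one-of>
    </item>
  </rule>
</grammar>
```

Because the recognizer only ever matches tokens listed in the grammar, "fifty-seven" is simply out of vocabulary here, which is exactly the single-digit restriction the answer describes.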

Using getUserMedia AND webkitSpeechRecognition allow access x 2

Submitted by 懵懂的女人 on 2019-12-08 08:28:25
Question: I am creating a site that uses two types of audio input: getUserMedia and webkitSpeechRecognition. Both functions are working fine, but Chrome is popping up its access security pop-up twice - which makes sense. Does anyone know how to have one access permission handle both functions? Cheers SO! navigator.getUserMedia({audio:true}, gotStream, function(e) { alert('Error getting audio'); console.log(e); }); and... var recognition = new webkitSpeechRecognition(); Answer 1: The only way to avoid

Error with my annyang program

Submitted by ∥☆過路亽.° on 2019-12-08 08:26:56
Question: I am trying to implement this annyang program (the original snippet was cut off mid-comment; the closing lines below follow annyang's standard example):

<script src="//cdnjs.cloudflare.com/ajax/libs/annyang/1.1.0/annyang.min.js"></script>
<script>
if (annyang) {
  // Let's define our first command. First the text we expect, and then the function it should call
  var commands = {
    'show tps report': function() {
      $('#tpsreport').animate({bottom: '-100px'});
    }
  };
  // Add our commands to annyang
  annyang.addCommands(commands);
  // Start listening. You can call this here, or attach this call to an event, button, etc.
  annyang.start();
}
</script>