speech-recognition

Speech to Text: Play MP3 message by itself and retrieve words

房东的猫 submitted on 2019-12-01 01:57:46
I have a few MP3 files that contain speech. I have used Android Speech to Text before, so I know it can capture spoken words. Is there any way to get the spoken words from an MP3 and display them in an EditText? I am thinking about playing the MP3 silently and identifying the words, but I have no idea how to do that. I am using the Google speech engine.

There is no native way on Android to convert an audio file that contains spoken words to text. You'll need to use a third-party API to do this, such as AT&T, Nuance, or iSpeech, and perhaps Pocket Sphinx, although you may have to write the file

C# Speech Recognition

▼魔方 西西 submitted on 2019-12-01 01:20:39
Question: I am making a Smart House Control System right now, and I have a small problem. I was thinking of using Cosmos as the base system and adding the needed namespace libraries to it, but since the usual System.Speech.Recognition namespace depends too heavily on the Windows Speech API, I have to forget about using it. So my question is: is there any (free, if possible) voice recognition and/or speech synthesizer library for C# that has the following: support for multi-language speaking extracting

Get user input from Speech?

南楼画角 submitted on 2019-12-01 01:20:11
Question: I have just started trying out the Windows speech-to-text capabilities in C# .NET. I currently have the basics working (i.e., say something, and it produces output based on what you say). However, I am struggling to figure out how to actually receive user input as a variable. For example, if the user says "Call me John", I want to be able to take the word John as a variable and store it as, say, the person's username. My current SpeechRecognized event
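One way to treat part of a recognized phrase as a variable is to pattern-match the recognized text. The sketch below is in Java rather than C# and is not tied to System.Speech: `NameExtractor` and `extractName` are hypothetical names, and it assumes the recognizer hands back the phrase as a plain string such as "Call me John". The same regular-expression idea should carry over to the text of a recognition result in C#.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls the name out of a recognized "call me <name>" phrase.
class NameExtractor {
    // Matches "call me <name>" ignoring case; group 1 is everything after "call me".
    private static final Pattern CALL_ME =
            Pattern.compile("^\\s*call me\\s+(.+?)\\s*$", Pattern.CASE_INSENSITIVE);

    // Returns the captured name, or empty if the phrase has a different shape.
    static Optional<String> extractName(String recognizedText) {
        Matcher m = CALL_ME.matcher(recognizedText);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Store the captured word as the username, with a fallback if nothing matched.
        String username = extractName("Call me John").orElse("unknown");
        System.out.println(username);
    }
}
```

A grammar with a fixed "call me" prefix followed by a free-form or list-based name slot would let the recognizer itself do this split; the string match above is only the simplest fallback.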

Speech to text sdk freezes after video playback

女生的网名这么多〃 submitted on 2019-12-01 01:17:31
I'm using the iPhone-Speech-To-Text SDK (https://github.com/todoroo/iPhone-Speech-To-Text). The recognizer works fine until I play back a video using MPMoviePlayerController. Here is the code I'm using to call the recognizer:

    - (IBAction)actionBtRecognition:(id)sender {
        // Create the recognizer lazily on first use.
        if (recognizer == nil) {
            recognizer = [[SpeechToTextModule alloc] init];
        }
        [recognizer beginRecording];
    }

To play back the movie I used this tutorial. Once I play back a movie and then call the recognizer, it just freezes. When I debugged the SDK source code, I found that my voice is not being recorded. Variable meterStateDB.mAveragePower is

How can I improve Watson Speech to Text accuracy?

心已入冬 submitted on 2019-12-01 00:36:30
I understand that Watson Speech to Text is somewhat calibrated for colloquial conversation and for 1 or 2 speakers. I also know that it handles FLAC better than WAV and OGG. I would like to know how I could improve recognition, acoustically speaking. Does increasing the volume help? Maybe using some compression filter? Noise reduction? What kind of preprocessing could help for this service?

The best way to improve the accuracy of the base models (which are very accurate but also very general) is by using the Watson STT customization service: https://www.ibm.com/watson

UWP suitable project solution

半世苍凉 submitted on 2019-12-01 00:22:08
I want to rewrite a C# WinForms desktop application for the Universal Windows Platform, but first I'm trying to figure out what would be suitable for my goal. The reason I want to use UWP is the quality of its speech recognition (running on other devices might also be useful, but that is currently very secondary). Here is my previous question, Speech recognition for windows desktop application, where I was advised to use UWP with speech recognition: Quickstart: Recognize speech with the Speech SDK for .NET Framework. I am developing a non-commercial, personal, and actively used application, which requires better quality

SpeechRecognizer - time limit

元气小坏坏 submitted on 2019-11-30 23:53:37
I am using SpeechRecognizer for a voice recognition application. It's working fine. My requirement is that I want to stop listening after 1 or 2 seconds. How can I achieve that?

1 or 2 seconds doesn't seem like a lot of time, but if you want to set a time limit, you'd probably have to thread it. Android has some default extras to set the minimum length of speech input and the maximum amount of time after a user has stopped speaking, but none to set the maximum length of time for speech input. Your best bet would be to thread some sort of timer, something like a CountDownTimer: yourSpeechListener
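The timer idea in the answer can be sketched in plain Java. This is not Android code: `ListeningSession` is a hypothetical stand-in for whatever wraps the recognizer, and the 2000 ms limit is just the figure from the question. On Android, the `Runnable` passed to `start` would be the place to call `SpeechRecognizer.stopListening()`, and a `CountDownTimer` whose `onFinish()` does the same would play an equivalent role.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical wrapper that enforces a maximum listening time.
class ListeningSession {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final AtomicBoolean listening = new AtomicBoolean(false);

    // Marks the session as listening and schedules stopAction after limitMillis.
    void start(long limitMillis, Runnable stopAction) {
        listening.set(true);
        scheduler.schedule(() -> {
            // Fire only if the session is still listening when the timer expires.
            if (listening.compareAndSet(true, false)) {
                stopAction.run();
            }
            scheduler.shutdown();
        }, limitMillis, TimeUnit.MILLISECONDS);
    }

    boolean isListening() {
        return listening.get();
    }

    // Waits up to waitMillis for the timer task to finish; true if it did.
    boolean awaitStop(long waitMillis) {
        try {
            return scheduler.awaitTermination(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ListeningSession session = new ListeningSession();
        // Assumed limit of 2 seconds, matching the requirement in the question.
        session.start(2000, () -> System.out.println("time limit reached, stop listening"));
        session.awaitStop(5000);
    }
}
```

On Android the stop action has to reach the recognizer on the main thread, which is one reason a `CountDownTimer` created on that thread is a convenient choice there.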

Keyword or keyphrase spotting with Sphinx4

廉价感情. submitted on 2019-11-30 23:53:29
I am currently trying to make my Java code (using Eclipse) perform some function if a certain thing is said. I am using the Sphinx4 libraries and this is what I currently have: What I would like it to do, at the line that says IF (TRUE) someFunction();, is run the function if my speech is Hello Computer, Hello Jarvis, Good Morning Computer, or Good Morning Jarvis. In other words, run the function if the speech matches the "public <greet>" line in the .gram file. Even more specifically, return "greet" if my speech corresponds to that grammar rule. I am sorry if this doesn't
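The check the question describes can be sketched without any Sphinx4 calls. The code below assumes the recognizer's best hypothesis is already available as a plain string, and that the four phrases from the question mirror the `<greet>` rule in the .gram file. `GreetMatcher` and `matchRule` are hypothetical names, not part of Sphinx4.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper: maps a recognized phrase back to the "greet" rule name.
class GreetMatcher {
    // Phrases listed in the question; assumed to match the <greet> grammar rule.
    private static final Set<String> GREET_PHRASES = new HashSet<>(Arrays.asList(
            "hello computer", "hello jarvis",
            "good morning computer", "good morning jarvis"));

    // Returns "greet" when the hypothesis is one of the greet phrases, else empty.
    static Optional<String> matchRule(String hypothesis) {
        // Normalize case and whitespace before comparing.
        String normalized = hypothesis.trim()
                .toLowerCase(Locale.ROOT)
                .replaceAll("\\s+", " ");
        return GREET_PHRASES.contains(normalized)
                ? Optional.of("greet")
                : Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in for the "IF (TRUE) someFunction();" line in the question.
        if (matchRule("Good Morning Jarvis").isPresent()) {
            System.out.println("greet matched, calling someFunction()");
        }
    }
}
```

Comparing strings duplicates what the grammar already encodes, so this is a workaround rather than a way to ask the recognizer which rule fired.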

'SAPI does not implement phonetic alphabet selection' exception

你离开我真会死。 submitted on 2019-11-30 23:40:29
Whenever I attempt to code any speech recognition program on my laptop, I always get the messages shown below. I can always compile my code and get the Windows Forms application running, but the program does not detect my voice, so it doesn't work. I am fairly sure my code is fine, as I usually take it from YouTube videos such as https://www.youtube.com/watch?v=KR0-UYUGYgA and many more. I am using the .NET Framework 4 Client Profile for my projects, and I reference only "System.Speech". What might be the problem? Debug messages that I get: speaker

How to train a user who is using my code which implements system.speech and SpeechRecognitionEngine

巧了我就是萌 submitted on 2019-11-30 22:00:17
I have already written code using the System.Speech.Recognition namespace, with an XML SRGS file for the grammar and the SpeechRecognitionEngine. I want to be able to lead the user through training on the words or phrases that are important for the app I have written. I have just read How to train SAPI. I understand that this example uses the unmanaged API (which exposes a little more) but that it is exactly the same as far as the engine is concerned. So suppose I now set up a form and follow the instructions from the link to initiate training. Can I have my own text on the form and ask the user to