speech-recognition

Continuous Speech Recognition Android - Without Gaps

十年热恋 submitted on 2019-12-03 05:57:33
I have an activity that implements RecognitionListener. To make recognition continuous, I restart the listener every time onEndOfSpeech() fires: speech.startListening(recognizerIntent); But it takes some time (around half a second) until it starts again, so there is a half-second gap where nothing is listening, and I miss words spoken during that interval. On the other hand, when I use Google's Voice input to dictate messages instead of the keyboard, this gap does not exist. Meaning: there is a solution. What is it? Thanks. Answer: try looking at a couple of other APIs... speech demo :

Integrate Google Voice Recognition in Android app

假如想象 submitted on 2019-12-03 05:56:51
Question: I want to introduce a new feature into my app: permanent voice recognition. First of all I followed these posts: Voice recognition, Speech recognition in Android, Offline Speech Recognition In Android (JellyBean), and many others from different websites. Problem: what I'm actually trying to do is run permanent voice recognition without displaying Google's voice activity. For example: when I start the application, the voice recognition should start and listen. When the

Synchronizing text and audio. Is there a NLP/speech-to-text library to do this?

安稳与你 submitted on 2019-12-03 05:09:47
Question: I would like to synchronize a spoken recording against a known text. Is there a speech-to-text / natural-language-processing library that would facilitate this? I imagine I'd want to detect word boundaries and compute candidate matches from a dictionary. Most of the questions I've found on SO concern written language. Desired, but not required: open source; compatible with American English out of the box; cross-platform; thoroughly documented. Edit: I realize this is a very broad, even naive,
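Once a recognizer has produced timestamped word hypotheses, the synchronization step described above reduces to aligning two word sequences. A minimal sketch using only the Python standard library; the word lists and times below are made-up illustrative data, not the output of any particular recognizer:

```python
from difflib import SequenceMatcher

def align_transcript(reference_words, hyp_words, hyp_times):
    """Align recognizer output (words plus start times) to a known reference text.

    reference_words: the known text, tokenized into words
    hyp_words:       words the recognizer actually emitted
    hyp_times:       start time (seconds) of each hypothesis word
    Returns a list of (reference_word, time_or_None).
    """
    ref = [w.lower() for w in reference_words]
    hyp = [w.lower() for w in hyp_words]
    timed = [None] * len(reference_words)
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    # Each matching block pairs a run of reference words with a run of
    # hypothesis words; copy the hypothesis timestamps onto those words.
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            timed[block.a + k] = hyp_times[block.b + k]
    return list(zip(reference_words, timed))

aligned = align_transcript(
    ["the", "quick", "brown", "fox"],
    ["the", "quick", "fox"],          # recognizer missed "brown"
    [0.00, 0.35, 0.90],
)
```

Reference words that fall inside unmatched gaps (here "brown") get no timestamp; in practice their times can be interpolated from the aligned neighbors.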

Why isn't speech recognition advancing? [closed]

旧巷老猫 submitted on 2019-12-03 04:28:50
Question: What's so difficult about the subject that algorithm designers are having a hard time tackling it? Is it really that complex? I'm having a hard time grasping why this topic is so problematic. Can anyone give me an example of why this is the case? Answer 1: Because if people

Open source code for voice detection and discrimination

走远了吗. submitted on 2019-12-03 04:16:42
Question: I have 15 audio tapes, one of which I believe contains an old recording of my grandmother and myself talking. A quick attempt to find the right place didn't turn it up, and I don't want to listen to 20 hours of tape to find it; the recording may not be at the start of one of the tapes. Most of the content seems to fall into three categories, in order of total length, longest first: silence, speech radio, and music. I plan to convert all of the tapes to digital format and then look again for the
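The search over digitized tapes lends itself to a coarse first pass: flag silent stretches by short-time energy, then inspect only what remains. A minimal pure-Python sketch with toy numbers; real speech/music discrimination would add features such as zero-crossing rate or spectral shape:

```python
def frame_rms(samples, frame_len):
    """Split a mono signal into consecutive frames and return each frame's RMS level."""
    rms = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms.append((sum(s * s for s in frame) / frame_len) ** 0.5)
    return rms

def silent_regions(samples, frame_len, threshold):
    """Label each frame True (silence) when its RMS falls below threshold."""
    return [r < threshold for r in frame_rms(samples, frame_len)]

# Toy signal: 4 loud samples followed by 4 near-silent samples
signal = [0.5, -0.5, 0.5, -0.5, 0.01, -0.01, 0.01, -0.01]
flags = silent_regions(signal, frame_len=4, threshold=0.1)  # [False, True]
```

With real audio, frame_len would be on the order of 10-30 ms of samples and the threshold tuned against a known silent stretch of tape hiss.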

Windows 10 Speech Recognition

吃可爱长大的小学妹 submitted on 2019-12-03 04:02:38
I want to create a WPF application in C# for Windows 10. The problem I had with previous Windows versions is that I'm Italian and there was no Italian support for speech recognition. But now there is Cortana. So, how can I use Cortana's speech recognition engine in my application? If I simply use new SpeechRecognitionEngine(new CultureInfo("it-IT")) it gives me an error, because the plain recognition engine isn't available, so I have to use Cortana's. Thank you for your answer. Answer: In order to use the new SpeechRecognition WinRT

Using System.Speech to convert mp3 file to text

◇◆丶佛笑我妖孽 submitted on 2019-12-03 03:58:20
Question: I'm trying to use the speech recognition in .NET to recognize the speech of a podcast in an mp3 file and get the result as a string. All the examples I've seen use a microphone, but I don't want to use a microphone; I want to provide a sample mp3 file as my audio source. Can anyone point me to a resource or post an example? Edit: I converted the audio file to a WAV file and tried this code on it, but it only extracts the first 68 words. public class MyRecognizer { public string
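Two things commonly go wrong in this setup. First, with a wave-file input, System.Speech's synchronous Recognize() typically returns one phrase per call, so it needs to be called in a loop until the engine signals end of stream; stopping after the first result matches the "first 68 words" symptom. Second, the converted file must be plain PCM WAV. A quick format check, sketched here in Python's standard library rather than C# (the filename is illustrative):

```python
import struct
import wave

def describe_wav(path):
    """Return (channels, sample_width_bytes, frame_rate) of a PCM WAV file.

    wave.open raises wave.Error for compressed/non-PCM data, which is a
    cheap way to catch a bad mp3-to-wav conversion before recognition.
    """
    with wave.open(path, "rb") as w:
        return w.getnchannels(), w.getsampwidth(), w.getframerate()

# Write a tiny mono, 16-bit, 16 kHz file to demonstrate the check
with wave.open("sample.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16000)
    w.writeframes(struct.pack("<4h", 0, 1000, 0, -1000))

info = describe_wav("sample.wav")
```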

How To Use the Alexa Skills Kit Without an Amazon Echo Device

笑着哭i submitted on 2019-12-03 03:56:41
Question: I am trying to integrate the Alexa Skills Kit into my website without an Amazon Echo unit. I want to implement voice commands on my website using the laptop/PC microphone instead of an Echo unit. I used this tutorial but didn't find anything about how to implement it on my side. I also tried the samples available on GitHub, but I think these also require an Amazon Echo device: https://github.com/amzn/alexa-skills-kit-js I am using Windows with the development environment given below. My

Creating ARPA language model file with 50,000 words

强颜欢笑 submitted on 2019-12-03 03:54:52
Question: I want to create an ARPA language model file with nearly 50,000 words. I can't generate the language model by passing my text file to the CMU Language Tool. Is any other link available where I can get a language model for this many words? Answer 1: I thought I'd answer this one since it has a few votes, although based on Christina's other questions I don't think this will be a usable answer for her, since a 50,000-word language model almost certainly won't have an acceptable word error rate or
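For reference, the ARPA format itself is plain text: a \data\ header listing n-gram counts, then one section per order holding log10 probabilities, closed by \end\. A toy generator for a unigram-only model; it omits the smoothing, backoff weights, and higher-order n-grams a real model needs, and only illustrates the file layout:

```python
import math

def unigram_arpa(counts):
    """Emit a minimal unigram-only ARPA model from raw word counts.

    Probabilities are plain relative frequencies in log10, as the
    ARPA format expects; no discounting is applied.
    """
    total = sum(counts.values())
    lines = ["\\data\\", f"ngram 1={len(counts)}", "", "\\1-grams:"]
    for word, c in sorted(counts.items()):
        lines.append(f"{math.log10(c / total):.4f}\t{word}")
    lines += ["", "\\end\\"]
    return "\n".join(lines)

arpa = unigram_arpa({"hello": 2, "world": 2})  # each word has probability 0.5
```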

How to plot MFCC in Python?

孤街浪徒 submitted on 2019-12-03 03:54:33
I'm just a beginner in signal processing. Here is my code so far for extracting MFCC features from an audio file (.WAV):

    from python_speech_features import mfcc
    import scipy.io.wavfile as wav

    (rate, sig) = wav.read("AudioFile.wav")
    mfcc_feat = mfcc(sig, rate)
    print(mfcc_feat)

I just wanted to plot the MFCC features to see what they look like:

    from python_speech_features import mfcc
    import scipy.io.wavfile as wav
    import matplotlib.pyplot as plt

    (rate, sig) = wav.read("AudioFile.wav")
    mfcc_feat = mfcc(sig, rate)
    print(mfcc_feat)
    plt.plot(mfcc_feat)
    plt.show()

This will plot the MFCC as colors,