speech-recognition

Webkit Speech - Javascript trigger mic listen

喜夏-厌秋 submitted on 2019-12-13 05:09:23
Question: In WebKit browsers, <input x-webkit-speech /> shows a microphone icon. Is there a way to trigger the mic click with JavaScript?
Answer 1: Not at this time. You can visit http://www.webkit.org/ and leave a suggestion for them to add this in the future.
Answer 2: Old question, but this is now possible, more or less: http://updates.html5rocks.com/2013/01/Voice-Driven-Web-Apps-Introduction-to-the-Web-Speech-API
Source: https://stackoverflow.com/questions/9360822/webkit-speech-javascript-trigger-mic-listen

Use of SpeechRecognizer produces ERROR_NETWORK (Value 2)

戏子无情 submitted on 2019-12-13 04:56:59
Question: I am using the SpeechRecognizer class and call startListening when a button is pressed, but I get an error. First the callback methods onReadyForSpeech, onBeginningOfSpeech and onEndOfSpeech are called (immediately), and then onError is called with error code 2, "ERROR_NETWORK". When I launch the intent directly with startActivityForResult, it works, but I want to get rid of the time-consuming popup dialog. I have the RECORD_AUDIO permission set. I am not sure, but perhaps

Google Speech Cloud error on Android: OUT_OF_RANGE: Exceeded maximum allowed stream duration of 65 seconds

╄→尐↘猪︶ㄣ submitted on 2019-12-13 04:19:21
Question: First: I already know there is a 65-second limit on continuous speech recognition streaming with this API. My goal is NOT to extend those 65 seconds. My app: It uses Google's streaming Speech Recognition. I based my code on this example: https://github.com/GoogleCloudPlatform/android-docs-samples/tree/master/speech The app works fairly well: I get ASR results and show them on screen as the user speaks, Siri style. The problem: My problem comes after tapping the ASR button on my app several,

Please build and install the PortAudio Python bindings first

心不动则不痛 submitted on 2019-12-13 02:58:32
Question: I have already installed pyaudio, but I run into a problem when I use the microphone functions: import speech_recognition as sr r = sr.Recognizer() mic = sr.Microphone() The problem is in the third line, mic = sr.Microphone(). The terminal gives me this message: Please build and install the PortAudio Python bindings first. If I try pip install PortAudio, it gives me the following message: Could not find a version that satisfies the requirement PortAudio (from versions: )No matching
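A quick way to narrow this down is to check what the interpreter running the script can actually see. The sketch below is only a diagnostic, not a fix. It assumes the message comes from an older PyAudio release whose pure-Python module `pyaudio` loads a compiled extension named `_portaudio`; both module names are assumptions about that layout, and newer releases may be organised differently. It uses `importlib.util.find_spec`, which locates a top-level module without importing it.

```python
import importlib.util
import sys


def find_module_origin(name):
    """Return the file a top-level module would load from, or None if absent."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def audio_binding_report(names=("pyaudio", "_portaudio")):
    """Map each assumed module name to its location (None when not found)."""
    return {name: find_module_origin(name) for name in names}


if __name__ == "__main__":
    # The interpreter path matters: pip may have installed into another one.
    print("interpreter:", sys.executable)
    for name, origin in audio_binding_report().items():
        print(name, "->", origin or "NOT FOUND")
```

If `pyaudio` is reported as missing, the package was probably installed for a different interpreter than the one printed on the first line. Note also that PortAudio itself is a C library rather than a package on PyPI, which would be consistent with `pip install PortAudio` finding no matching distribution.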

Converting from one MFCC type to another - HTK

北战南征 submitted on 2019-12-13 02:10:21
Question: I am working with the HTK toolkit on a word-spotting task and have a classic mismatch between training and testing data. The training data consisted only of "clean" data (recorded over a mic). The data was converted to MFCC_E_D_A parameters, which were then modelled by phone-level HMMs. My test data was recorded over landline and mobile phone channels (introducing distortions and the like). Using the MFCC_E_D_A parameters with HVite results in incorrect output. I want to make use of cepstral mean

Muting SpeechRecognizer's beep sound

回眸只為那壹抹淺笑 submitted on 2019-12-13 01:25:33
Question: I'm using the SpeechRecognizer API in my app, and every time it starts, it plays a "beep" sound. I'd like to know how to mute it so that I can implement one of my own. Thanks.
Answer 1: If you are using a button to activate and deactivate the recognizer, you can mute the sound on click. This doesn't work very well if the recognizer is listening constantly, but for button clicks it should be fine :) private AudioManager manager; manager = (AudioManager) getSystemService(Context.AUDIO_SERVICE); if (isChecked) {

Speech recognition and intonation detection

好久不见. submitted on 2019-12-13 00:57:35
Question: I want to make an iOS app that counts interrogative sentences. I will look for WH questions and also questions of the form "will I...?" and "am I...?". I am not very experienced in speech or audio technology, but I searched on Google and found that there are a few speech recognition SDKs. I still have no idea how to detect and graph intonation. Are there any SDKs that support intonation or emotional speech recognition?
Source: https://stackoverflow.com/questions/15527107/speech-recogition-and-intonation-detection
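The text-based half of the idea (looking for WH questions and "will I" / "am I" openings) can be sketched independently of any speech SDK. The Python sketch below is illustrative only. It assumes a recognizer has already produced a punctuated English transcript, which many recognizers do not guarantee, and the word lists and function names are hypothetical choices rather than part of any SDK. It does not address intonation at all.

```python
import re

# Hypothetical word lists; extend them to suit the application.
WH_WORDS = {"who", "whom", "whose", "what", "when", "where", "why", "which", "how"}
AUX_OPENINGS = {("will", "i"), ("am", "i")}


def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def is_question(sentence):
    """True if the sentence opens with a WH word or with 'will I' / 'am I'."""
    words = re.findall(r"[a-z']+", sentence.lower())
    if not words:
        return False
    if words[0] in WH_WORDS:
        return True
    return tuple(words[:2]) in AUX_OPENINGS


def count_questions(text):
    """Count sentences in the transcript that look interrogative."""
    return sum(is_question(s) for s in split_sentences(text))
```

For example, `count_questions("Where are we going? I am fine. Will I pass? Am I late?")` counts three questions. Matching on the first words misses questions signalled only by rising pitch, which is where intonation detection would be needed.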

sphinx-4 aligner skips plain words like `you`, `in` and words with dashes - why?

安稳与你 submitted on 2019-12-13 00:52:42
Question: I'm trying to align simple text. Here are the links to the text and audio files: http://s000.tinyupload.com/?file_id=48044768133759453374 http://s000.tinyupload.com/?file_id=99891199139563396901 Here are the configuration settings: private static final String ACOUSTIC_MODEL_PATH = "resource:/edu/cmu/sphinx/models/en-us/en-us"; private static final String DICTIONARY_PATH = "resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict"; The output I get is the following (ellipses are added by me): - ï -

measuring rate of speech in realtime

落花浮王杯 submitted on 2019-12-12 22:35:47
Question: I'm looking for a quick and simple way to measure the rate at which I am speaking in real time. Coarse-grained approaches or approximations are sufficient. The idea is to write a simple app/widget that at least tells you to speed up or slow down while speaking. Measuring things like pitch and volume might also be nice. I assume this can be done simply with a variety of speech recognition libraries, but I am familiar with none of them and quick glances at the documentation do not give a simple
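One coarse approach, sketched below in Python, is to count the words heard in a sliding time window and compare the resulting words-per-minute figure with two thresholds. This is a minimal sketch under stated assumptions: some recognizer is assumed to report a timestamp (in seconds) for each recognized word, the class name `SpeechRateMeter` is hypothetical, and the default thresholds of 110 and 170 words per minute are arbitrary placeholders.

```python
from collections import deque


class SpeechRateMeter:
    """Sliding-window words-per-minute estimate from word timestamps."""

    def __init__(self, window_seconds=10.0, slow_wpm=110.0, fast_wpm=170.0):
        self.window_seconds = window_seconds
        self.slow_wpm = slow_wpm  # assumed threshold, tune per speaker
        self.fast_wpm = fast_wpm  # assumed threshold, tune per speaker
        self._times = deque()

    def _evict(self, now):
        # Drop words that fell out of the window ending at `now`.
        while self._times and now - self._times[0] > self.window_seconds:
            self._times.popleft()

    def add_word(self, timestamp):
        """Record one recognized word at `timestamp` (seconds)."""
        self._times.append(timestamp)
        self._evict(timestamp)

    def words_per_minute(self, now):
        """Words inside the window, scaled to a per-minute rate."""
        self._evict(now)
        return len(self._times) * 60.0 / self.window_seconds

    def advice(self, now):
        wpm = self.words_per_minute(now)
        if wpm > self.fast_wpm:
            return "slow down"
        if wpm < self.slow_wpm:
            return "speed up"
        return "ok"
```

Feeding `add_word` from a recognizer's partial results and polling `advice` about once a second would be enough for a simple widget. Until the first window has filled, the estimate reads low, so early "speed up" advice should be ignored.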