speech-recognition

Android: what speech recognition technologies are available?

五迷三道 submitted on 2019-12-06 21:41:21
I am new to the area of "voice recognition" on Android. I have a requirement in my app for "speech recognition", so I am doing my homework. I found that: 1. the Android SDK has support for this, and it uses Google's voice recognition service. From what I understand, whether we invoke the recognizer via an intent or use the SpeechRecognizer class, the actual recognition is done on Google's cloud servers. I tried sample apps using both methods, and the match rate in both cases was very low. (First of all, is my finding right? I didn't get the right match for most of the words/sentences I tried.)
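For reference, the intent-based route the question mentions looks roughly like the sketch below. The `RecognizerIntent` constants are real SDK fields; `REQUEST_SPEECH` is an assumed request code of my own.

```java
// Launch the platform recognizer UI; results arrive in onActivityResult().
Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
        RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
// Ask for several candidates; the top hypothesis is often not the best one.
intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 5);
startActivityForResult(intent, REQUEST_SPEECH);
```

On the low match rate: requesting `EXTRA_MAX_RESULTS` > 1 and scanning the whole candidate list in `onActivityResult()` often improves the effective hit rate, since the correct word is frequently a lower-ranked candidate.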

TensorFlow: "transpose expects a vector of size 1. But input(1) is a vector of size 2"

回眸只為那壹抹淺笑 submitted on 2019-12-06 17:56:24
I want to use a trained RNN language model for inference. So I loaded the trained model graph in C++:

```cpp
tensorflow::MetaGraphDef graph_def;
TF_CHECK_OK(ReadBinaryProto(Env::Default(), path_to_graph, &graph_def));
TF_CHECK_OK(session->Create(graph_def.graph_def()));
```

and loaded the model parameters with:

```cpp
Tensor checkpointPathTensor(tensorflow::DT_STRING, tensorflow::TensorShape());
checkpointPathTensor.scalar<std::string>()() = path_to_ckpt;
TF_CHECK_OK(session->Run({{graph_def.saver_def().filename_tensor_name(), checkpointPathTensor}},
                         {}, {graph_def.saver_def().restore_op_name()}, nullptr));
```

Continuously recognize everything being said on Android?

孤者浪人 submitted on 2019-12-06 17:45:28
I'm working on a project that involves speech recognition on Android, and I have some questions without clear answers on this site (or anywhere, actually). I need to do something like speech-to-text; the problem is that I need it working continuously. Imagine an app running in the background and writing everything it hears to a txt file. I know I will need to correct a lot of misheard noise, but that will come later. I am using pocketsphinx-android and tried to follow this tutorial: http://cmusphinx.sourceforge.net/wiki/tutorialandroid The problem comes when I try to do continuous recognition,
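A common pattern for continuous recognition with pocketsphinx-android, following the setup API shown in the CMUSphinx tutorial linked above, is to restart listening after each utterance. The model file names and the `assetsDir` variable below are assumptions; treat this as a sketch, not a drop-in implementation.

```java
// Build a recognizer from local model files (paths are placeholders).
SpeechRecognizer recognizer = SpeechRecognizerSetup.defaultSetup()
        .setAcousticModel(new File(assetsDir, "en-us-ptm"))
        .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
        .getRecognizer();
recognizer.addListener(this);
// Assumes a search named "dictation" was registered, e.g. via addGrammarSearch().
recognizer.startListening("dictation");

// In the RecognitionListener, restart to keep listening continuously:
@Override
public void onEndOfSpeech() {
    recognizer.stop();                   // triggers onResult() with the hypothesis
    recognizer.startListening("dictation");
}
```

The key point is that pocketsphinx finishes one utterance at a time; "continuous" behavior comes from calling `startListening()` again in the listener callbacks, appending each hypothesis to your output file as it arrives.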

How to detect homophones

我们两清 submitted on 2019-12-06 16:15:56
I am fairly new to speech processing, but am wondering how homophones are detected. I am searching for an API which gives the similarity between two words on the basis of how they are pronounced. For example, "to" and "two" are highly similar in terms of how they sound, compared with, say, "to" and "from". You might want to try calculating the edit distance not on the original strings, but on their pronunciations, like those available in the CMU Pronouncing Dictionary at http://www.speech.cs.cmu.edu/cgi-bin/cmudict The following are used for indexing words by their English pronunciation: Soundex or
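The suggested approach can be sketched as plain Levenshtein distance over ARPAbet phone sequences instead of characters. The class and method names below are my own, not an existing API; the pronunciations in the usage note are taken from the CMU dictionary.

```java
// Sketch: compare two words by edit distance over their ARPAbet
// pronunciations (space-separated phone strings, e.g. "T UW" for "to").
public class PhoneDistance {

    // Standard Levenshtein distance, computed over phones instead of letters.
    public static int editDistance(String[] a, String[] b) {
        int[][] d = new int[a.length + 1][b.length + 1];
        for (int i = 0; i <= a.length; i++) d[i][0] = i;
        for (int j = 0; j <= b.length; j++) d[0][j] = j;
        for (int i = 1; i <= a.length; i++) {
            for (int j = 1; j <= b.length; j++) {
                int cost = a[i - 1].equals(b[j - 1]) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1,   // deletion
                                            d[i][j - 1] + 1),  // insertion
                                   d[i - 1][j - 1] + cost);    // substitution
            }
        }
        return d[a.length][b.length];
    }

    public static int comparePronunciations(String pronA, String pronB) {
        return editDistance(pronA.split(" "), pronB.split(" "));
    }
}
```

With CMU-dictionary pronunciations, "to" and "two" are both "T UW" (distance 0, i.e. homophones), while "from" is "F R AH M" (distance 4 from "T UW").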

Running Android Speech Recognition as Service: will not start

别来无恙 submitted on 2019-12-06 16:00:12
I'm using the solution here: Android Speech Recognition as a service on Android 4.1 & 4.2. The code below gets to the onStartCommand() method, but speech recognition never seems to kick off, as evidenced by the fact that onReadyForSpeech() is never called. UPDATE: So I added and that allowed onReadyForSpeech() to be called, BUT onError() is called with error code 6 after onReadyForSpeech() completes (this goes into a continuous loop, because the start-listening code runs again after onError() is called). As Hoan Nguyen states below, error code 6 is
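For context, error code 6 is `SpeechRecognizer.ERROR_SPEECH_TIMEOUT`: no speech was detected before the recognizer's timeout. Restarting immediately from `onError()` is what produces the loop described above; a short delay before restarting is one way to break it. In the sketch below, `mSpeechRecognizer`, `mSpeechIntent`, and `mHandler` are assumed fields of the service, not names from the question.

```java
// Error 6 = SpeechRecognizer.ERROR_SPEECH_TIMEOUT (no speech heard in time).
// Restarting immediately loops forever; back off briefly before retrying.
@Override
public void onError(int error) {
    if (error == SpeechRecognizer.ERROR_SPEECH_TIMEOUT) {
        mSpeechRecognizer.cancel();
        mHandler.postDelayed(
                () -> mSpeechRecognizer.startListening(mSpeechIntent), 1000);
    }
}
```

If nothing is ever spoken, this simply cycles listen/timeout/wait, which is usually the intended behavior for an always-on service.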

Android pending intent not being called within widget

萝らか妹 submitted on 2019-12-06 14:59:36
Question: Like in this question (accepted answer), I'm trying to launch voice recognition from one of my app's widgets. I successfully managed to open the dialog that requests voice input with this code inside the onUpdate() method of the widget:

```java
// this intent points to the activity that should handle results; doesn't work
Intent activityIntent = new Intent(SoulissApp.getAppContext(), WrapperActivity.class);
// doesn't work either
// activityIntent.setComponent(new ComponentName("it.angelic.soulissclient", "it
```
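A widget cannot start an activity directly (and cannot call `startActivityForResult()` at all); clicks must be routed through a `PendingIntent` set on the `RemoteViews`, and the wrapper activity then launches the recognizer itself and handles the result. A sketch, where `R.layout.widget_layout` and `R.id.widget_mic_button` are assumed resource ids:

```java
// In onUpdate(): route widget clicks to WrapperActivity via a PendingIntent.
Intent activityIntent = new Intent(context, WrapperActivity.class);
PendingIntent pending = PendingIntent.getActivity(
        context, 0, activityIntent, PendingIntent.FLAG_UPDATE_CURRENT);

RemoteViews views = new RemoteViews(context.getPackageName(), R.layout.widget_layout);
views.setOnClickPendingIntent(R.id.widget_mic_button, pending);
appWidgetManager.updateAppWidget(appWidgetId, views);
```

`WrapperActivity` then fires `RecognizerIntent.ACTION_RECOGNIZE_SPEECH` from its own `onCreate()` and receives the transcription in `onActivityResult()`.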

A problem with forced alignment in speech recognition - HTK

删除回忆录丶 submitted on 2019-12-06 14:15:57
Question: I have a system where a user is asked to repeat a sentence after a prompt. It uses HTK to force-align the user's spoken sentence to the pre-defined word-level label file (of the sentence) to get a time-aligned phone-level file. The HMMs have been trained on a large amount of data and give very accurate time-aligned files with HVite. My problem arises when the user does not speak the exact sentence that is required to be spoken. Let me illustrate with an example: word-level label file of the

Speech recognition and sound comparison with musicg

瘦欲@ submitted on 2019-12-06 13:41:54
I'm trying to make an Android application with speech recognition, but unfortunately Google doesn't support my language (Macedonian), so I'm trying to compare two recorded sounds instead. I'm using http://code.google.com/p/musicg/ to record and compare speech, and I'm stuck on initializing the settings for detecting speech. Can someone tell me how to rewrite this init function for speech detection? It's very important to me; or suggest some other idea for how to do it. This is the initialization for whistle detection:

```java
// settings for detecting a whistle
minFrequency = 600.0f;
maxFrequency = Double.MAX
```
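One direction, purely as a starting point: widen the frequency band from whistle-like values to a speech-like range. The field names below mirror the style of the whistle init in the question, but the numeric ranges are rough assumptions about voiced speech (fundamentals roughly 80-400 Hz, formant energy up to a few kHz), not tuned musicg parameters.

```java
// Illustrative only: speech-oriented detection bounds. These are assumed
// starting values for experimentation, not values from the musicg library.
public class SpeechDetectionSettings {
    // Speech energy spans a much wider band than a whistle's narrow tone.
    public static final double MIN_FREQUENCY = 80.0;
    public static final double MAX_FREQUENCY = 4000.0;
    // Speech is far less tonal than a whistle, so detectors typically need a
    // looser spread criterion across frequency bins than a whistle detector.
    public static final double MAX_STANDARD_DEVIATION = 1.0;
}
```

Whatever values you start from, the thresholds need empirical tuning against your own Macedonian recordings, since whistle detection relies on narrow-band tonality that speech simply does not have.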

Analyzing commands in Android speech recognition results

邮差的信 submitted on 2019-12-06 13:17:41
I have a speech recognition app on Android and I want to compare the results I get with my own strings. This is my code:

```java
if (requestCode == REQUEST_CODE && resultCode == RESULT_OK) {
    ArrayList<String> matches = data.getStringArrayListExtra(
            RecognizerIntent.EXTRA_RESULTS);
    for (String resultString : matches) {
        if (resultString.equalsIgnoreCase("go"))
            // .show() added: makeText() alone never displays the toast
            Toast.makeText(getBaseContext(), "go", Toast.LENGTH_SHORT).show();
        else if (resultString.equalsIgnoreCase("stop"))
            Toast.makeText(getBaseContext(), "stop", Toast.LENGTH_SHORT).show();
        else if (resultString.equalsIgnoreCase("back"))
            Toast.makeText(getBaseContext()
```
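The if/else chain above can be factored into a small lookup helper so the activity only acts on one normalized command string. This class is my own sketch, not part of the Android SDK; it is plain Java, so the matching logic can be tested off-device.

```java
import java.util.List;
import java.util.Locale;

// Scan the recognizer's candidate list and return the first known command.
public class CommandMatcher {
    private static final List<String> COMMANDS = List.of("go", "stop", "back");

    public static String firstCommand(List<String> matches) {
        for (String candidate : matches) {
            String normalized = candidate.toLowerCase(Locale.ROOT);
            if (COMMANDS.contains(normalized)) {
                return normalized;
            }
        }
        return null; // no recognized command among the candidates
    }
}
```

In `onActivityResult()` you would call `CommandMatcher.firstCommand(matches)` once and switch on the returned value, instead of toasting inside the loop.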

Spectrograms generated using Librosa don't look consistent with Kaldi?

点点圈 submitted on 2019-12-06 12:59:27
I generated a spectrogram of a "seven" utterance using the "egs/tidigits" code from Kaldi, with 23 bins, a 20 kHz sampling rate, a 25 ms window, and a 10 ms shift. The spectrogram, visualized via MATLAB's imagesc function, appears as below. I am experimenting with Librosa as an alternative to Kaldi. I set up my code as below using the same number of bins, sampling rate, and window length / shift as above:

```python
time_series, sample_rate = librosa.core.load("7a.wav", sr=20000)
spectrogram = librosa.feature.melspectrogram(time_series, sr=20000, n_mels=23,
                                             n_fft=500, hop_length=200)
log_S = librosa.core.logamplitude
```
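One likely source of the visual mismatch (an assumption worth checking, not a confirmed diagnosis of this case): Kaldi's feature pipeline takes the natural log of power, while librosa's `logamplitude` (later renamed `power_to_db`) returns decibels, i.e. 10·log10, and additionally applies `ref` normalization and `top_db` clipping by default. The two log scales differ by a constant factor of 10/ln 10 ≈ 4.343, which changes the dynamic range a colormap displays. The scaling difference in isolation:

```java
public class LogScaleCompare {
    // Decibel scaling in the style of librosa's power_to_db (ref = 1, no clipping).
    public static double powerToDb(double power) {
        return 10.0 * Math.log10(power);
    }

    // Natural-log energy, as in Kaldi's default feature pipeline.
    public static double naturalLog(double power) {
        return Math.log(power);
    }

    // The two scales differ only by the constant 10 / ln(10) ~= 4.3429.
    public static double scaleRatio(double power) {
        return powerToDb(power) / naturalLog(power);
    }
}
```

So even with identical bins, window, and shift, the two pictures will not match pixel-for-pixel unless the log convention, reference level, and clipping are aligned as well.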