speech-recognition

Android: what speech recognition technologies are available?

五迷三道 submitted on 2019-12-06 21:41:21
I am new to the area of "voice recognition" on Android. I have a requirement in my app for "speech recognition", so I am doing my homework. I found that: 1. the Android SDK has support for this, and it uses Google's voice recognition service. From what I understand, whether we invoke the recognizer via an intent or use the SpeechRecognizer class, the actual recognition is done on Google's cloud servers. I tried sample apps using both methods, and the match rate in both cases was very low. (First of all, is my finding right? I didn't get the right match for most of the words/sentences I tried.)
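For reference, the intent-based route the question mentions looks roughly like the sketch below. The `RecognizerIntent` constants are real SDK fields; `REQUEST_SPEECH` is an assumed request code of my own.

```java
// Launch the platform recognizer UI; results arrive in onActivityResult().
Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
        RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
// Ask for several candidates; the top hypothesis is often not the best one.
intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 5);
startActivityForResult(intent, REQUEST_SPEECH);
```

On the low match rate: requesting `EXTRA_MAX_RESULTS` > 1 and scanning the whole candidate list in `onActivityResult()` often improves the effective hit rate, since the correct word is frequently a lower-ranked candidate.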

TensorFlow: "transpose expects a vector of size 1. But input(1) is a vector of size 2"

回眸只為那壹抹淺笑 submitted on 2019-12-06 17:56:24
I want to use a trained RNN language model for inference. So I loaded the trained model graph in C++:

```cpp
tensorflow::MetaGraphDef graph_def;
TF_CHECK_OK(ReadBinaryProto(Env::Default(), path_to_graph, &graph_def));
TF_CHECK_OK(session->Create(graph_def.graph_def()));
```

and loaded the model parameters with:

```cpp
Tensor checkpointPathTensor(tensorflow::DT_STRING, tensorflow::TensorShape());
checkpointPathTensor.scalar<std::string>()() = path_to_ckpt;
TF_CHECK_OK(session->Run({{graph_def.saver_def().filename_tensor_name(), checkpointPathTensor}},
                         {}, {graph_def.saver_def().restore_op_name()}, nullptr));
```

Continuously recognize everything being said on Android?

孤者浪人 submitted on 2019-12-06 17:45:28
I'm working on a project that involves speech recognition on Android, and I have some questions without clear answers on this site (or anywhere, actually). I need to do something like speech-to-text; the problem is that I need it working continuously. Imagine an app running in the background and writing everything it hears to a txt file. I know I will need to correct a lot of misheard noise, but that will come later. I am using pocketsphinx-android and tried to follow this tutorial: http://cmusphinx.sourceforge.net/wiki/tutorialandroid The problem comes when I try to do continuous recognition,
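A common pattern for continuous recognition with pocketsphinx-android, following the setup API shown in the CMUSphinx tutorial linked above, is to restart listening after each utterance. The model file names and the `assetsDir` variable below are assumptions; treat this as a sketch, not a drop-in implementation.

```java
// Build a recognizer from local model files (paths are placeholders).
SpeechRecognizer recognizer = SpeechRecognizerSetup.defaultSetup()
        .setAcousticModel(new File(assetsDir, "en-us-ptm"))
        .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
        .getRecognizer();
recognizer.addListener(this);
// Assumes a search named "dictation" was registered, e.g. via addGrammarSearch().
recognizer.startListening("dictation");

// In the RecognitionListener, restart to keep listening continuously:
@Override
public void onEndOfSpeech() {
    recognizer.stop();                   // triggers onResult() with the hypothesis
    recognizer.startListening("dictation");
}
```

The key point is that pocketsphinx finishes one utterance at a time; "continuous" behavior comes from calling `startListening()` again in the listener callbacks, appending each hypothesis to your output file as it arrives.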

How to detect homophones

我们两清 submitted on 2019-12-06 16:15:56
I am fairly new to speech processing, but am wondering how homophones are detected. I am searching for an API which gives the similarity between two words on the basis of how they are pronounced. For example, "to" and "two" are highly similar in terms of how they sound, compared with, say, "to" and "from". You might want to try calculating the edit distance not on the original strings, but on their pronunciations, like those available in the CMU Pronouncing Dictionary at http://www.speech.cs.cmu.edu/cgi-bin/cmudict The following are used for indexing words by their English pronunciation: Soundex or
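The suggested approach can be sketched as plain Levenshtein distance over ARPAbet phone sequences instead of characters. The class and method names below are my own, not an existing API; the pronunciations in the usage note are taken from the CMU dictionary.

```java
// Sketch: compare two words by edit distance over their ARPAbet
// pronunciations (space-separated phone strings, e.g. "T UW" for "to").
public class PhoneDistance {

    // Standard Levenshtein distance, computed over phones instead of letters.
    public static int editDistance(String[] a, String[] b) {
        int[][] d = new int[a.length + 1][b.length + 1];
        for (int i = 0; i <= a.length; i++) d[i][0] = i;
        for (int j = 0; j <= b.length; j++) d[0][j] = j;
        for (int i = 1; i <= a.length; i++) {
            for (int j = 1; j <= b.length; j++) {
                int cost = a[i - 1].equals(b[j - 1]) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1,   // deletion
                                            d[i][j - 1] + 1),  // insertion
                                   d[i - 1][j - 1] + cost);    // substitution
            }
        }
        return d[a.length][b.length];
    }

    public static int comparePronunciations(String pronA, String pronB) {
        return editDistance(pronA.split(" "), pronB.split(" "));
    }
}
```

With CMU-dictionary pronunciations, "to" and "two" are both "T UW" (distance 0, i.e. homophones), while "from" is "F R AH M" (distance 4 from "T UW").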

Running Android Speech Recognition as Service: will not start

别来无恙 submitted on 2019-12-06 16:00:12
I'm using the solution here: Android Speech Recognition as a service on Android 4.1 & 4.2. The code below gets to the onStartCommand() method, but speech recognition never seems to kick off, as evidenced by the fact that onReadyForSpeech() is never called. UPDATE: So I added and that allowed onReadyForSpeech() to be called, BUT onError() is called with error code 6 after onReadyForSpeech() completes (this goes into a continuous loop, because the start-listening code runs again after onError() is called). As Hoan Nguyen states below, error code 6 is
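For context, error code 6 is `SpeechRecognizer.ERROR_SPEECH_TIMEOUT`: no speech was detected before the recognizer's timeout. Restarting immediately from `onError()` is what produces the loop described above; a short delay before restarting is one way to break it. In the sketch below, `mSpeechRecognizer`, `mSpeechIntent`, and `mHandler` are assumed fields of the service, not names from the question.

```java
// Error 6 = SpeechRecognizer.ERROR_SPEECH_TIMEOUT (no speech heard in time).
// Restarting immediately loops forever; back off briefly before retrying.
@Override
public void onError(int error) {
    if (error == SpeechRecognizer.ERROR_SPEECH_TIMEOUT) {
        mSpeechRecognizer.cancel();
        mHandler.postDelayed(
                () -> mSpeechRecognizer.startListening(mSpeechIntent), 1000);
    }
}
```

If nothing is ever spoken, this simply cycles listen/timeout/wait, which is usually the intended behavior for an always-on service.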

Android pending intent not being called within widget

萝らか妹 submitted on 2019-12-06 14:59:36
Question: Like in this question (accepted answer), I'm trying to launch voice recognition from one of my app's widgets. I successfully managed to open the dialog that requests voice input with this code inside the onUpdate() method of the widget:

```java
// this intent points to the activity that should handle results; doesn't work
Intent activityIntent = new Intent(SoulissApp.getAppContext(), WrapperActivity.class);
// doesn't work either
// activityIntent.setComponent(new ComponentName("it.angelic.soulissclient", "it
```
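A widget cannot start an activity directly (and cannot call `startActivityForResult()` at all); clicks must be routed through a `PendingIntent` set on the `RemoteViews`, and the wrapper activity then launches the recognizer itself and handles the result. A sketch, where `R.layout.widget_layout` and `R.id.widget_mic_button` are assumed resource ids:

```java
// In onUpdate(): route widget clicks to WrapperActivity via a PendingIntent.
Intent activityIntent = new Intent(context, WrapperActivity.class);
PendingIntent pending = PendingIntent.getActivity(
        context, 0, activityIntent, PendingIntent.FLAG_UPDATE_CURRENT);

RemoteViews views = new RemoteViews(context.getPackageName(), R.layout.widget_layout);
views.setOnClickPendingIntent(R.id.widget_mic_button, pending);
appWidgetManager.updateAppWidget(appWidgetId, views);
```

`WrapperActivity` then fires `RecognizerIntent.ACTION_RECOGNIZE_SPEECH` from its own `onCreate()` and receives the transcription in `onActivityResult()`.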

A problem with forced alignment in speech recognition - HTK

删除回忆录丶 submitted on 2019-12-06 14:15:57
Question: I have a system where a user is asked to repeat a sentence after a prompt. It uses HTK to force-align the user's spoken sentence to the pre-defined word-level label file (of the sentence) to get a time-aligned phone-level file. The HMMs have been trained on a large amount of data and give very accurate time-aligned files with HVite. My problem arises when the user does not speak the exact sentence that is required to be spoken. Let me illustrate with an example: word-level label file of the

Speech recognition and sound comparison with musicg

瘦欲@ submitted on 2019-12-06 13:41:54
I'm trying to make an Android application with speech recognition, but unfortunately Google doesn't support my language (Macedonian), so I'm trying to compare two recorded sounds instead. I'm using http://code.google.com/p/musicg/ to record and compare speech, and I'm stuck on initializing the settings for detecting speech. Can someone tell me how to rewrite this init function for speech detection? It's very important to me; or suggest some other idea for how to do it. This is the initialization for whistle detection:

```java
// settings for detecting a whistle
minFrequency = 600.0f;
maxFrequency = Double.MAX
```
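One direction, purely as a starting point: widen the frequency band from whistle-like values to a speech-like range. The field names below mirror the style of the whistle init in the question, but the numeric ranges are rough assumptions about voiced speech (fundamentals roughly 80-400 Hz, formant energy up to a few kHz), not tuned musicg parameters.

```java
// Illustrative only: speech-oriented detection bounds. These are assumed
// starting values for experimentation, not values from the musicg library.
public class SpeechDetectionSettings {
    // Speech energy spans a much wider band than a whistle's narrow tone.
    public static final double MIN_FREQUENCY = 80.0;
    public static final double MAX_FREQUENCY = 4000.0;
    // Speech is far less tonal than a whistle, so detectors typically need a
    // looser spread criterion across frequency bins than a whistle detector.
    public static final double MAX_STANDARD_DEVIATION = 1.0;
}
```

Whatever values you start from, the thresholds need empirical tuning against your own Macedonian recordings, since whistle detection relies on narrow-band tonality that speech simply does not have.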

Analyzing commands in Android speech recognition results

邮差的信 submitted on 2019-12-06 13:17:41
I have a speech recognition app on Android and I want to compare the results I get with my own strings. This is my code:

```java
if (requestCode == REQUEST_CODE && resultCode == RESULT_OK) {
    ArrayList<String> matches = data.getStringArrayListExtra(
            RecognizerIntent.EXTRA_RESULTS);
    for (String resultString : matches) {
        if (resultString.equalsIgnoreCase("go"))
            // .show() added: makeText() alone never displays the toast
            Toast.makeText(getBaseContext(), "go", Toast.LENGTH_SHORT).show();
        else if (resultString.equalsIgnoreCase("stop"))
            Toast.makeText(getBaseContext(), "stop", Toast.LENGTH_SHORT).show();
        else if (resultString.equalsIgnoreCase("back"))
            Toast.makeText(getBaseContext()
```
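The if/else chain above can be factored into a small lookup helper so the activity only acts on one normalized command string. This class is my own sketch, not part of the Android SDK; it is plain Java, so the matching logic can be tested off-device.

```java
import java.util.List;
import java.util.Locale;

// Scan the recognizer's candidate list and return the first known command.
public class CommandMatcher {
    private static final List<String> COMMANDS = List.of("go", "stop", "back");

    public static String firstCommand(List<String> matches) {
        for (String candidate : matches) {
            String normalized = candidate.toLowerCase(Locale.ROOT);
            if (COMMANDS.contains(normalized)) {
                return normalized;
            }
        }
        return null; // no recognized command among the candidates
    }
}
```

In `onActivityResult()` you would call `CommandMatcher.firstCommand(matches)` once and switch on the returned value, instead of toasting inside the loop.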

Spectrograms generated using Librosa don't look consistent with Kaldi?

点点圈 submitted on 2019-12-06 12:59:27
I generated a spectrogram of a "seven" utterance using the "egs/tidigits" code from Kaldi, with 23 bins, a 20 kHz sampling rate, a 25 ms window, and a 10 ms shift. The spectrogram, visualized via MATLAB's imagesc function, appears as below. I am experimenting with Librosa as an alternative to Kaldi. I set up my code as below using the same number of bins, sampling rate, and window length / shift as above:

```python
time_series, sample_rate = librosa.core.load("7a.wav", sr=20000)
spectrogram = librosa.feature.melspectrogram(time_series, sr=20000, n_mels=23,
                                             n_fft=500, hop_length=200)
log_S = librosa.core.logamplitude
```
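One likely source of the visual mismatch (an assumption worth checking, not a confirmed diagnosis of this case): Kaldi's feature pipeline takes the natural log of power, while librosa's `logamplitude` (later renamed `power_to_db`) returns decibels, i.e. 10·log10, and additionally applies `ref` normalization and `top_db` clipping by default. The two log scales differ by a constant factor of 10/ln 10 ≈ 4.343, which changes the dynamic range a colormap displays. The scaling difference in isolation:

```java
public class LogScaleCompare {
    // Decibel scaling in the style of librosa's power_to_db (ref = 1, no clipping).
    public static double powerToDb(double power) {
        return 10.0 * Math.log10(power);
    }

    // Natural-log energy, as in Kaldi's default feature pipeline.
    public static double naturalLog(double power) {
        return Math.log(power);
    }

    // The two scales differ only by the constant 10 / ln(10) ~= 4.3429.
    public static double scaleRatio(double power) {
        return powerToDb(power) / naturalLog(power);
    }
}
```

So even with identical bins, window, and shift, the two pictures will not match pixel-for-pixel unless the log convention, reference level, and clipping are aligned as well.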