voice-recognition | 易学教程

How to compare two audio data?

Question: I will record my own voice and save the recordings as WAV files on my computer. Later I will speak, and the computer should match my voice command against the pre-recorded WAV files. How can I check whether two audio clips are equal, or whether there is an 80% match between them?

    if (audio1 == audio2) do task A
    else if (audio1 is a bit similar to audio2) do task B
    else if (audio1 is an 80% match for audio2) do task C
    end if

What is the best way to compare two audio clips?

Answer 1: Unfortunately you won't get anywhere
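One common way to approach this kind of comparison is to avoid testing raw samples for equality (two recordings of the same phrase will almost never be sample-identical) and instead compare short feature sequences with dynamic time warping (DTW), which tolerates differences in speaking speed. The sketch below is not taken from the answer. It assumes each recording has already been reduced to a list of per-frame numbers (for example frame energies), and `dtw_distance` is a hypothetical helper name.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # d[i][j] holds the cost of the best alignment of a[:i] with b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: skip in a, skip in b, or step both.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


# Hypothetical per-frame features: the second clip is a slower rendition.
template = [0.1, 0.5, 0.9, 0.5, 0.1]
slower = [0.1, 0.5, 0.5, 0.9, 0.9, 0.5, 0.1]
different = [0.9, 0.1, 0.9, 0.1, 0.9]

print(dtw_distance(template, slower))
print(dtw_distance(template, different))
```

A smaller distance means a closer match; any thresholds used to separate "equal", "similar", and "no match" would have to be tuned on real recordings.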

ALSA lib pcm_hw.c:1667:(_snd_pcm_hw_open) Invalid value for card arecord: main:722: audio open error: No such file or directory

Question: I am working on speech recognition using alsa-utils, but when I try to run this script:

    #!/bin/bash
    echo "Recording... Press Ctrl+C to Stop."
    arecord -D plughw:1,0 -q -f cd -t wav | ffmpeg -loglevel panic -y -i - -ar 16000 -acodec flac file.flac > /dev/null 2>&1
    echo "Processing..."
    wget -q -U "Mozilla/5.0" --post-file file.flac --header "Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=en-us&client=chromium" | cut -d\" -f12 >stt.txt
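The error in the title suggests that the card named in `-D plughw:1,0` (card 1, device 0) could not be opened, so a reasonable first check is which capture cards ALSA actually reports via `arecord -l`. The sketch below parses that listing into `plughw:` device strings. The sample text is illustrative rather than output from the asker's machine, and `capture_devices` is a hypothetical helper name.

```python
import re

# Matches lines such as "card 0: PCH [HDA Intel PCH], device 0: ..."
CARD_LINE = re.compile(r"^card (\d+): .*?, device (\d+): ", re.MULTILINE)


def capture_devices(arecord_listing):
    """Return a plughw string for each card/device pair in `arecord -l` text."""
    return [
        f"plughw:{card},{dev}"
        for card, dev in CARD_LINE.findall(arecord_listing)
    ]


# Illustrative listing; real text would come from running `arecord -l`.
sample = """**** List of CAPTURE Hardware Devices ****
card 0: PCH [HDA Intel PCH], device 0: ALC269VC Analog [ALC269VC Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
"""

print(capture_devices(sample))
```

If `plughw:1,0` is not in the resulting list, the `-D` argument in the script would need to name a card that is.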

Human face, emotion and voice recognition

Question: I am looking for a good face, emotion, and voice recognition method in C#. For face recognition I previously used Emgu CV, which is not accurate and performs poorly in low light. I also need to detect the user's emotion, such as sad or happy, but I found that is not easy with Emgu CV. For voice recognition I have not found any solution yet; I found speech recognition, but that is not what I need. I don't want to use any online APIs. Can anybody suggest any SDKs

How to build BufferReceived() to capture voice using RecognizerIntent?

Question: I am working on an Android application using RecognizerIntent.ACTION_RECOGNIZE_SPEECH. My problem is that I don't know how to create the buffer that will capture the voice the user inputs. I have read a lot on Stack Overflow, but I don't understand how to include the buffer and the recognition service callback in my code, or how to play back the contents saved in the buffer. This is my code: public class Voice extends Activity implements OnClickListener

Convert GMM-UBM scores to equivalent accuracy percent

Question: I have built a GMM-UBM model for speaker recognition. The models adapted for each speaker output scores computed as log-likelihood ratios. Now I want to convert these scores to an equivalent number between 0 and 100. Can anybody guide me, please? Answer 1: There is no straightforward formula. You can do simple things like prob = exp(logratio_score), but those might not reflect the true distribution of your data. The computed probability percentage of your samples
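To make the mapping concrete: the exponential of a log-likelihood ratio is a likelihood ratio, which can exceed 1, so one standard way to land in the 0 to 100 range is to combine it with a prior through Bayes' rule, which reduces to a logistic function of the score. The sketch below assumes the scores are well-calibrated natural-log likelihood ratios and that the prior probability of the target speaker is known. Both are assumptions, and `llr_to_percent` and `prior` are illustrative names. Raw GMM-UBM scores are often not calibrated, so the result is only as meaningful as that assumption.

```python
import math


def llr_to_percent(llr, prior=0.5):
    """Posterior probability of the target speaker, as a percentage.

    Assumes `llr` is a calibrated natural-log likelihood ratio and
    `prior` is the assumed prior probability of the target speaker.
    """
    # Posterior log-odds = log-likelihood ratio + prior log-odds (Bayes' rule).
    log_odds = llr + math.log(prior / (1.0 - prior))
    # Numerically stable logistic function.
    if log_odds >= 0:
        p = 1.0 / (1.0 + math.exp(-log_odds))
    else:
        e = math.exp(log_odds)
        p = e / (1.0 + e)
    return 100.0 * p


for score in (-2.0, 0.0, 2.0):
    print(score, round(llr_to_percent(score), 1))
```

With an even prior, a score of 0 maps to 50 and the output rises smoothly toward 100 as the score grows; a held-out calibration set would be needed to check that these numbers match observed accuracy.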

Error in MSR Identity toolkit (fopen)

Question: I am trying to run a speaker verification demo using the MSR Identity Toolkit, but it fails with the error below after the UBM training step. It looks like fopen returns -1, which then causes fread to fail. I don't understand why it can't read the filenames. I can't attach the code since it involves many functions; I hope someone familiar with this toolkit can help.

    Error using fread
    Invalid file identifier. Use fopen to generate a valid file identifier.
    Error in htkread (line 7)

Audio Signal when Voice Search Dialog is Ready to Accept Input?

Question: Google Voice Search has a significant delay between the moment you call it via startActivityForResult() and the moment its dialog box is displayed, ready to take your speech. This forces the user to keep looking at the screen, waiting for the dialog box to appear, before speaking. It would be nice to add a 'ding' sound or some other non-visual cue for when Voice Search is ready to accept speech input. Is this possible at all? If so, how do I go about doing that? Answer 1: Ok this will