voice-recognition

How can a Chrome extension get the user's permission to use the computer's microphone?

浪尽此生 submitted on 2019-12-25 12:51:48
Question: If we run the HTML5 Web Speech API JavaScript below on a website in Chrome, Chrome asks for the user's permission to use the computer's microphone: var recognition = new webkitSpeechRecognition(); recognition.start(); But if I run the same code on a Chrome extension's page, Chrome does not ask the user for permission. How can a Chrome extension get the user's permission to use the computer's microphone? Thank you.

Answer 1: I think you have to implement it yourself. In Chrome

“Speaker Recognition and Segmentation” [closed]

人盡茶涼 submitted on 2019-12-25 07:37:30
Question (closed as off-topic 5 years ago; not accepting answers): I am looking for code that would process a media file into "who said what and when", in other words a speaker-by-speaker segmentation with the timing of each turn. Answers that require doing any manual work on the media file won't do. Thanks!

Answer 1: You can use speaker diarization from Kaldi. It is not easy to set up, but the results

Add generic placeholders to SRGS Grammar

╄→尐↘猪︶ㄣ submitted on 2019-12-25 04:55:11
Question: I am trying to add speech recognition to my app, but unfortunately I can't find a way to add generic placeholders. For example, I want to say "I am looking for stackoverflow", so I use this grammar:

<grammar root="start" {...}>
  <rule id="start" scope="public">
    I am
    <one-of>
      <item> looking for</item>
      <item> listening to</item>
    </one-of>
  </rule>
  <!-- here could be any word -->
</grammar>

My problem is how to implement the search for "stackoverflow". If I use this grammar, the recognizer always

SFSpeechRecognizer multiple languages

北城以北 submitted on 2019-12-24 12:35:04
Question: I am building a search feature that supports voice recognition and transforms speech to text, so I am using SFSpeechRecognizer. The problem is that I need to support multiple languages at the same time, such as "en_US", "fr", "vi", and "ar". The idea is that the user can speak, for example, one word in English and the next in French, and I want the engine to detect this. Currently, I set my main language to English like this: private let speechRecognizer = SFSpeechRecognizer(locale: Locale

Call recording using VOICE_UPLINK and VOICE_DOWNLINK

北慕城南 submitted on 2019-12-24 11:53:13
Question: Has anyone successfully recorded a call using the AudioSource values VOICE_UPLINK and VOICE_DOWNLINK? I am able to record a call with those sources, but the recorded voice is not clear; it sounds stretched. One more question: what is the audio source VOICE_CALL for, when it never works? Thank you.

Answer 1: Actually, VOICE_CALL works, at least for me on a Sony Xperia Z1. I use AudioRecord to save the bytes to a temporary file and then convert the temporary file to a WAV file. I had the same problem
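The answer is cut off above, but the approach it describes (capture raw PCM with AudioRecord, then wrap it in a WAV header afterwards) can be sketched roughly as below. This is a minimal illustration, not the answerer's code: the class name and the mono 16-bit format are assumptions, and VOICE_CALL/VOICE_UPLINK are blocked or return silence on many devices and Android versions.

import android.media.AudioFormat;
import android.media.AudioRecord;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class CallPcmRecorder {

    private volatile boolean recording = true;

    /** Captures raw 16-bit mono PCM from the given AudioSource into pcmFile until stop() is called. */
    public void capture(int audioSource, int sampleRate, File pcmFile) throws IOException {
        int bufferSize = AudioRecord.getMinBufferSize(sampleRate,
                AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
        AudioRecord recorder = new AudioRecord(audioSource, sampleRate,
                AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, bufferSize);
        byte[] buffer = new byte[bufferSize];
        try (FileOutputStream out = new FileOutputStream(pcmFile)) {
            recorder.startRecording();
            while (recording) {
                int read = recorder.read(buffer, 0, buffer.length);
                if (read > 0) out.write(buffer, 0, read);
            }
        } finally {
            if (recorder.getRecordingState() == AudioRecord.RECORDSTATE_RECORDING) {
                recorder.stop();
            }
            recorder.release();
        }
        // Turning the raw PCM into a playable WAV file then only requires prepending
        // a standard 44-byte RIFF/WAVE header (PCM, mono, 16-bit, sampleRate).
    }

    public void stop() { recording = false; }
}

Called with, say, MediaRecorder.AudioSource.VOICE_CALL (or VOICE_UPLINK) and an 8000 Hz sample rate from a background thread, with the RECORD_AUDIO permission granted; whether the captured audio is actually usable still depends on the handset, as the answer notes.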

Best way to do voice authentication in C#

家住魔仙堡 submitted on 2019-12-24 07:15:39
Question: I am building a voice authentication system, and for that I am using C# speech recognition, which lets me save the audio and store it as a WAV file. I have another WAV file in which I have stored my voice. I then use an FFT, as mentioned here, to compare the two WAV files, and I use cross-correlation code from here. My openWav code is as below: public static void openWav(string filename, out double[] left, out double[] right) { var numArray = File.ReadAllBytes(filename); int num1 =
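The openWav listing is cut off, but the comparison step the question leans on is simply a cross-correlation of the two decoded sample arrays. As a language-neutral illustration (written in Java here rather than the poster's C#), a normalized correlation at zero lag might look like the sketch below; note that raw cross-correlation is very sensitive to timing offsets and loudness, so on its own it is a weak basis for voice authentication.

public final class CrossCorrelation {

    private CrossCorrelation() { }

    /**
     * Normalized cross-correlation of two equal-length sample arrays at lag 0.
     * Returns a value in [-1, 1]; values close to 1 mean the signals line up closely.
     */
    public static double normalizedCorrelation(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("sample arrays must have equal length");
        }
        double dot = 0.0, energyA = 0.0, energyB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            energyA += a[i] * a[i];
            energyB += b[i] * b[i];
        }
        // Guard against silent (all-zero) signals to avoid dividing by zero.
        if (energyA == 0.0 || energyB == 0.0) {
            return 0.0;
        }
        return dot / Math.sqrt(energyA * energyB);
    }
}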

Speech to text and text to speech at the same time

旧巷老猫 submitted on 2019-12-24 05:28:11
Question: INTRODUCTION I'm developing an app where I need to use both SpeechRecognizer and TTS, but I'm facing some problems. The main one is that if I initialize TTS, SpeechRecognizer seems not to work, while if I disable TTS, SpeechRecognizer works fine. Next is a snippet with the relevant code: CODE public class GameActivity extends Activity implements OnInitListener { private static TextToSpeech tts; @Override public void onCreate(Bundle savedInstanceState) { super
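The poster's GameActivity is cut off above, so the following is only a rough sketch under an assumption, not their code: one way to keep both engines alive in a single Activity is to create them in onCreate and never start recognition while TTS is still speaking, which is the usual reason the two appear to conflict. The class name and intent extras are illustrative.

import android.app.Activity;
import android.content.Intent;
import android.os.Bundle;
import android.speech.RecognizerIntent;
import android.speech.SpeechRecognizer;
import android.speech.tts.TextToSpeech;

import java.util.Locale;

public class VoiceGameActivity extends Activity implements TextToSpeech.OnInitListener {

    private TextToSpeech tts;
    private SpeechRecognizer recognizer;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        // Creating both engines up front is fine; one does not prevent the other from initializing.
        tts = new TextToSpeech(this, this);
        recognizer = SpeechRecognizer.createSpeechRecognizer(this);
    }

    @Override
    public void onInit(int status) {
        if (status == TextToSpeech.SUCCESS) {
            tts.setLanguage(Locale.US);
        }
    }

    private void startListening() {
        // Don't start the recognizer while TTS is talking; otherwise the synthesized
        // audio is picked up by (or suppresses) the recognition engine.
        if (tts != null && tts.isSpeaking()) {
            return;
        }
        Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        recognizer.startListening(intent);
    }

    @Override
    protected void onDestroy() {
        if (tts != null) tts.shutdown();
        if (recognizer != null) recognizer.destroy();
        super.onDestroy();
    }
}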

TTS *and* Speech Input simultaneously?

倾然丶 夕夏残阳落幕 submitted on 2019-12-24 03:25:23
Question: I noticed that as soon as a voice recognition activity starts, text-to-speech output stops. I understand the rationale: TTS output could be "heard" by the voice recognition engine and interfere with its proper operation. My question: is this behavior hard-coded into the system, or can it be modified by a setting or parameter (in the API)?

Answer 1: Must the activity simultaneously use recognition and TTS? If the recognition can wait (functionally speaking), force the event to spawn the
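The answer is truncated, but the pattern it starts to describe (let recognition wait until speech output has finished) is commonly implemented with an UtteranceProgressListener: speak with a known utterance ID and launch the recognizer intent from onDone. The sketch below assumes API 21+ for the four-argument speak() overload; the class, method, and utterance ID names are illustrative.

import android.app.Activity;
import android.content.Intent;
import android.speech.RecognizerIntent;
import android.speech.tts.TextToSpeech;
import android.speech.tts.UtteranceProgressListener;

public final class SpeakThenListen {

    private static final String UTTERANCE_ID = "prompt-1";
    private static final int REQUEST_SPEECH = 1;

    private SpeakThenListen() { }

    /** Speaks the prompt, then fires the recognizer intent once TTS reports the utterance is done. */
    public static void speakThenListen(final Activity activity, TextToSpeech tts, String prompt) {
        tts.setOnUtteranceProgressListener(new UtteranceProgressListener() {
            @Override public void onStart(String utteranceId) { }
            @Override public void onError(String utteranceId) { }

            @Override
            public void onDone(String utteranceId) {
                if (!UTTERANCE_ID.equals(utteranceId)) return;
                final Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
                intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                        RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
                // onDone is delivered on a TTS worker thread, so hop back to the UI thread.
                activity.runOnUiThread(() -> activity.startActivityForResult(intent, REQUEST_SPEECH));
            }
        });
        tts.speak(prompt, TextToSpeech.QUEUE_FLUSH, null, UTTERANCE_ID);
    }
}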

How to execute both Speech Recognition and Audio recording at the same time?

扶醉桌前 submitted on 2019-12-24 00:58:40
Question: THIS IS NOT A DUPLICATE; HERE I AM TALKING ABOUT STOPPING AND STARTING THE RECORDING PROCESS WHENEVER I WANT. BEFORE MARKING THIS AS A DUPLICATE, PLEASE READ THE OTHER ANSWER PROPERLY. I am developing a PhoneGap plugin for Android. This plugin will basically support Android speech recognition and recording the speech. I am able to start, stop, etc. the speech recognition, but I have serious issues with the recording. First, the code is posted below. This is how I start the speech recognition, end it, etc.: public
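The plugin code is cut off above, but for reference, the start/stop side of Android speech recognition typically looks like the sketch below (the class and method names are illustrative, not the poster's plugin API). Recording the same audio at the same time is the harder part, because the recognition service normally owns the microphone while it is listening.

import android.content.Context;
import android.content.Intent;
import android.speech.RecognitionListener;
import android.speech.RecognizerIntent;
import android.speech.SpeechRecognizer;

public class RecognitionController {

    private final SpeechRecognizer recognizer;
    private final Intent intent;

    public RecognitionController(Context context, RecognitionListener listener) {
        recognizer = SpeechRecognizer.createSpeechRecognizer(context);
        recognizer.setRecognitionListener(listener);
        intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        intent.putExtra(RecognizerIntent.EXTRA_PARTIAL_RESULTS, true);
    }

    // All three calls must be made on the main (UI) thread.
    public void start()   { recognizer.startListening(intent); }
    public void stop()    { recognizer.stopListening(); }
    public void destroy() { recognizer.destroy(); }
}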