speech-recognition

Voice control API - high accuracy on specific phrases [closed]

泪湿孤枕 posted on 2019-12-06 05:35:24
Question: Closed 7 years ago. I have several ideas for voice-controlled apps. Unfortunately, based on what I've seen from Siri and Google Voice Actions, the

Swift - How can I convert Saved Audio file conversations to Text?

人走茶凉 posted on 2019-12-06 05:11:03
Question: I work on speech recognition. I have handled text-to-speech and speech-to-text with the iOS frameworks, but now I want to convert saved audio files of conversations to text. How can I do this? Thank you for all replies. Answer 1: I have worked on the same thing, and the following works for me. I have an audio file in my project bundle, so I wrote the following code to convert the audio to text. let audioURL = Bundle.main.url(forResource: "Song", withExtension: "mov") let recognizer = SFSpeechRecognizer(locale:

HTML5 Web Speech API not working locally

半腔热情 posted on 2019-12-06 03:58:18
Question: I am trying to make this code work and don't know why it is not working locally. I tried the same code on CodePen.io and it works there. <html> <head> <title>Voice API</title> </head> <body> <button onClick="func()">Click Me</button> <script> function func() { alert('Hello'); var recognition = new webkitSpeechRecognition(); recognition.continuous = true; recognition.interimResults = true; recognition.onresult = function(event) { alert(event.results[0][0].transcript); } recognition.start(); } </script> <
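The snippet in the question reads only event.results[0][0], even though continuous and interimResults are both enabled. As a minimal sketch (not the asker's or any answerer's code), the results could be split into final and interim text as below. collectTranscripts is a hypothetical helper name. The onerror handler is added on the assumption that logging the error code helps when the page works on a hosted site but not locally; a commonly reported cause is opening the page from a file:// URL rather than serving it over HTTP(S).

```javascript
// Sketch: split Web Speech API results into final and interim text.
// `results` and `resultIndex` mirror the fields of a SpeechRecognitionEvent;
// collectTranscripts is a hypothetical helper name, not part of the API.
function collectTranscripts(results, resultIndex) {
  let finalText = "";
  let interimText = "";
  // Only entries from resultIndex onward changed in this event.
  for (let i = resultIndex; i < results.length; i++) {
    const best = results[i][0]; // first (highest-ranked) alternative
    if (results[i].isFinal) {
      finalText += best.transcript;
    } else {
      interimText += best.transcript;
    }
  }
  return { finalText, interimText };
}

// Browser-only wiring; skipped where the prefixed API is unavailable (e.g. Node).
if (typeof window !== "undefined" && "webkitSpeechRecognition" in window) {
  const recognition = new window.webkitSpeechRecognition();
  recognition.continuous = true;
  recognition.interimResults = true;
  recognition.onresult = (event) => {
    const { finalText, interimText } = collectTranscripts(event.results, event.resultIndex);
    console.log("final:", finalText, "interim:", interimText);
  };
  // Log the error code (for example "not-allowed") instead of failing silently.
  recognition.onerror = (event) => console.error("recognition error:", event.error);
  // Call recognition.start() from the button's click handler, as in the question.
}
```

Keeping the transcript logic in a plain function means it can be exercised without a microphone or a browser.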

How to convert speech to text during call with different text colors for caller and call receiver?

笑着哭i posted on 2019-12-06 01:51:42
Question: I want to convert speech to text during a call. I also want the text displayed in different colors: the call initiator's in red and the call receiver's in green. During my tests, I converted speech to text during a call but was unable to distinguish between the voice of the call initiator and that of the call receiver. Thanks in advance; please help me out. Source: https://stackoverflow.com/questions/20964359/how-to-convert-speech-to-text-during-call-with-different-text-colors-for-caller

Google voice recognizer doesn't start on Android 4.x

大憨熊 posted on 2019-12-06 00:53:39
I stumbled on this random issue. Here is my code: mSpeechRecognizer = SpeechRecognizer.createSpeechRecognizer(mContext); initializeRecognitionListener(); mSpeechRecognizer.setRecognitionListener(mRecognitionListener); Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH); intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM); intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, getClass().getPackage().getName()); intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "en-US"); intent.putExtra(RecognizerIntent.EXTRA_SPEECH_INPUT

How to input and process audio files to convert to text via pyspeech or dragonfly

醉酒当歌 posted on 2019-12-05 22:52:28
I have read the documentation for pyspeech and dragonfly, but I don't know how to input an audio file to be converted into text. I have tried speaking into the microphone, and the speech is converted into text, but I want to input a previously recorded audio file instead. Can anyone help with an example? Both PySpeech and Dragonfly are relatively thin wrappers over SAPI. Unfortunately, both of them use the shared recognizer, which doesn't support input selection. While I'm familiar with SAPI, I'm not that familiar with Python, so I haven't been able to assist anyone with moving PySpeech

Using c++ to call and use Windows Speech Recognition [closed]

假装没事ソ posted on 2019-12-05 22:05:31
I am making an application that uses Windows Speech Recognition. I am thinking of using C++ for this since I have some experience with the language. I want the speech recognition to work internally: if I upload an audio file into my program, I want speech recognition to write the audio out as a text file, with everything done inside the program. Please provide some help with this, and if I have not explained my question properly, please let me know and I will try to explain again. Thanks in advance, Divs. Michael Levy: Windows provides speech recognition

How to connect SpeechRecognizer to RecognizerIntent with Extras

你。 posted on 2019-12-05 21:53:40
I am trying to wrap my mind around the SpeechRecognizer. I have a SpeechRecognizer with my own RecognitionListener: rec = SpeechRecognizer.createSpeechRecognizer(this); rec.setRecognitionListener(new RecognitionListener() { //Lots of overrides that work perfectly fine }); This works fine when I launch it using rec.startListening(intent); But my intent happens to have some Extras: intent.putExtra(RecognizerIntent.EXTRA_PARTIAL_RESULTS, true); intent.putExtra(RecognizerIntent.EXTRA_RESULTS_PENDINGINTENT, true); intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "de-DE"); intent.putExtra

Web Speech API Custom Words

点点圈 posted on 2019-12-05 21:04:00
Question: I read through the W3C docs on this and I think custom words come from a custom grammar, so I went to this demo and entered the following JavaScript in the console: recognition.grammars.addFromString('foo'); This ran fine, and recognition.grammars[0].src returns: "data:application/xml,foo" Note: 'foo' is not the word I'm interested in; the word I'm interested in isn't an English word, so I'm using 'foo' for the example. When I speak my custom word normally, it thinks I'm saying
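In the question, addFromString receives the bare word 'foo' rather than a grammar document. The MDN examples for SpeechGrammarList pass a JSGF grammar string instead, so a hedged sketch of that approach follows. buildJsgfGrammar is a hypothetical helper, "custom" is an assumed rule name, and "foo" and "bar" are placeholder words. Whether a given browser actually honors the grammar is not guaranteed, so treat this as an experiment rather than a confirmed fix.

```javascript
// Sketch: build a JSGF grammar string listing custom words.
// buildJsgfGrammar is a hypothetical helper; the rule name and words are placeholders.
function buildJsgfGrammar(ruleName, words) {
  return `#JSGF V1.0; grammar ${ruleName}; public <${ruleName}> = ${words.join(" | ")} ;`;
}

const grammar = buildJsgfGrammar("custom", ["foo", "bar"]);

// Browser-only wiring; skipped where the API is unavailable (e.g. Node).
if (typeof window !== "undefined") {
  const Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  const GrammarList = window.SpeechGrammarList || window.webkitSpeechGrammarList;
  if (Recognition && GrammarList) {
    const recognition = new Recognition();
    const list = new GrammarList();
    list.addFromString(grammar, 1); // second argument is the grammar weight (0.0 to 1.0)
    recognition.grammars = list;
  }
}
```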

Free-form text with custom SRGS based Grammar

末鹿安然 posted on 2019-12-05 20:53:22
I am trying to develop a voice-based application that accepts user input as speech and performs actions based on that input. This is my first venture into this technology, and I am learning while developing it. I am using the Microsoft SAPI shipped with .NET 4 to recognize speech. So far, I have learned about the two modes it supports. Speech recognition (SR) has two modes of operation: Dictation mode: an unconstrained, free-form speech interpretation mode that uses a built-in grammar provided by the recognizer for a specific language. This is the default recognizer.