speech

Can C# SAPI speak SSML string?

吃可爱长大的小学妹 submitted on 2020-01-13 18:01:15
Question: I implemented TTS in my C# WPF project. Previously, I used the TTS in the System.Speech.Synthesis namespace to speak. The spoken content is in SSML format (Speech Synthesis Markup Language, which supports customizing the speaking rate, voice, and emphasis), like the following: <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US"><prosody rate="x-fast">hello world. This is a long sentence speaking very fast!</prosody></speak> But unfortunately the System.Speech.Synthesis TTS has a
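
For readers who want to see the shape of the SSML string quoted in the question, here is a minimal sketch in Python (not C#, and it does not call System.Speech or SAPI). The function name build_ssml and its parameters are hypothetical. It only assembles the same markup with the text and attribute values escaped, then checks that the result is well-formed XML.

```python
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET


def build_ssml(text, rate="x-fast", lang="en-US"):
    """Wrap `text` in the <speak>/<prosody> structure shown in the question.

    Hypothetical helper: it only builds a string and does not speak anything.
    """
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<prosody rate={quoteattr(rate)}>{escape(text)}</prosody>"
        "</speak>"
    )


ssml = build_ssml("hello world. This is a long sentence speaking very fast!")
# Parsing raises xml.etree.ElementTree.ParseError if the markup is malformed.
ET.fromstring(ssml)
print(ssml)
```

Escaping matters because characters such as & or < in the spoken text would otherwise make the SSML invalid before any synthesizer sees it.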

Microsoft Speech Recognition: Alternate results with confidence score?

倾然丶 夕夏残阳落幕 submitted on 2020-01-13 11:35:09
Question: I'm new to working with the Microsoft.Speech recognizer (using Microsoft Speech Platform SDK Version 11) and I'm trying to have it output the n-best recognition matches from a simple grammar, along with the confidence score for each. According to the documentation (and as mentioned in the answer to this question), one should be able to use e.Result.Alternates to access the recognized words other than the top-scoring one. However, even after resetting the confidence rejection threshold to 0

Continuous Speech Recognition Android - Without Gaps

生来就可爱ヽ(ⅴ<●) submitted on 2020-01-12 05:00:09
Question: I have an activity that implements RecognitionListener. To make it continuous, every time onEndOfSpeech() is called I start the listener again: speech.startListening(recognizerIntent); But it takes some time (around half a second) to start, so there is a half-second gap in which nothing is listening. As a result, I miss words spoken during that gap. On the other hand, when I use Google's Voice input to dictate messages instead of the keyboard, this time gap does not exist.

Google Speech Recognition API

你离开我真会死。 submitted on 2020-01-10 09:39:50
Question: I'm trying to use the Google Speech API v2 (at address https://www.google.com/speech-api/v2/recognize?... ). I need to use my API key, but when I use it I get a 403 Forbidden error. When I use an API key from the example project I downloaded, it works fine. I saw that in the Google Developers Console I can enable many API options, but I didn't find any Speech API option. Is there anything else I need to enable to get access to this API using my key? Thank you! Answer 1: Instructions are

Google Speech Recognition API

我是研究僧i submitted on 2020-01-10 09:39:31
Question: I'm trying to use the Google Speech API v2 (at address https://www.google.com/speech-api/v2/recognize?... ). I need to use my API key, but when I use it I get a 403 Forbidden error. When I use an API key from the example project I downloaded, it works fine. I saw that in the Google Developers Console I can enable many API options, but I didn't find any Speech API option. Is there anything else I need to enable to get access to this API using my key? Thank you! Answer 1: Instructions are

pyspeech (python) - Transcribe mp3 files?

点点圈 submitted on 2020-01-09 11:14:10
Question: I'd like to transcribe mp3 files (speech-to-text) using the pyspeech API. I don't know if this is possible, though. Is it? How? Answer 1: pyspeech seems to be merely a Python interface to the regular Windows speech APIs. Most likely you'd create some method of treating mp3 playback as an audio source for that speech API to listen to. Answer 2: I don't know about pyspeech, but if it is a Python wrapper around the Microsoft speech APIs, then some other posts may be helpful. Microsoft Speech engines do not

pyspeech (python) - Transcribe mp3 files?

。_饼干妹妹 submitted on 2020-01-09 11:13:27
Question: I'd like to transcribe mp3 files (speech-to-text) using the pyspeech API. I don't know if this is possible, though. Is it? How? Answer 1: pyspeech seems to be merely a Python interface to the regular Windows speech APIs. Most likely you'd create some method of treating mp3 playback as an audio source for that speech API to listen to. Answer 2: I don't know about pyspeech, but if it is a Python wrapper around the Microsoft speech APIs, then some other posts may be helpful. Microsoft Speech engines do not

Control other applications using Java?

天大地大妈咪最大 submitted on 2020-01-07 02:38:08
Question: How can I control other applications using Java? I'm using the Mary Speech Synthesizer (open source, Java). It synthesizes speech well, but it requires the text to be in a textbox in the application window itself, and then a button must be clicked. For my project, the text to be synthesized will come from another Java application. I need to know how I can place the text in the textbox and send a click to one of the buttons in the application. I'm hoping to

How does a shell script read the data in a batch test folder

最后都变了- submitted on 2020-01-05 14:03:13
Question: I recently replicated a SEGAN experiment based on TensorFlow 0.12.1. The author provides a shell script for testing (clean_wav.sh), as shown in the figure below: This is the original version provided by the author. I modified it to match the path of my test data, as follows: Noisy_testset_wav_16k is my test data folder, but when I run the script the system reports an error: This folder is a directory, but when I change the path to: NOISY_WAVNAME='/home/zyf/SEGAN/ SEGAN/segan-master1
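
The truncated question above suggests the script was pointed at a folder where it expected a single file path. As a hedged illustration only, written in Python rather than shell and without invoking SEGAN or clean_wav.sh, the sketch below lists the .wav files inside a folder so that each could be handed to a per-file command. The name list_wavs and the demo folder are hypothetical, and the real cause of the error cannot be confirmed from the snippet.

```python
from pathlib import Path
import tempfile


def list_wavs(folder):
    """Return the .wav files directly inside `folder`, sorted by name.

    Hypothetical helper: subdirectories and non-.wav files are skipped.
    """
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".wav"
    )


# Demo on a throwaway folder so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("p232_001.wav", "p232_002.wav", "readme.txt"):
        (Path(demo_dir) / name).write_bytes(b"")
    for wav_path in list_wavs(demo_dir):
        # A per-file command would be run here with str(wav_path).
        print(wav_path.name)
```

Iterating over files this way keeps the per-file tool unchanged, and only the caller needs to know about the folder.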

Speech recognition using Openears framework?

拈花ヽ惹草 submitted on 2020-01-03 02:48:26
Question: OpenEars is a speech recognition (speech-to-text) framework for iPhone (iOS devices). I have installed the OpenEars demo app on my iPhone. It works well, but only for a list of words like GO, CHANGE, MODEL. Can we make the speech recognition more generic, for real-time speech recognition that is not limited to a few words? OpenEars: http://www.politepix.com/openears/ Answer 1: You have to use a new language model instead of their default one. The language model is the vocabulary