speech | 易学教程

Speech-enabled ASP.NET application

Question: We are working on an ASP.NET web application that requires some data to be entered by speech. The user can enter data through the normal user interface; however, we want an additional feature that lets them enter data by speaking. We can fix the voice commands: for example, to enter "value1" into "data1", the user speaks "data1" followed by "value1" (or some other pattern that can be fixed later). I searched the internet and found that the Microsoft Speech SDK is one solution. We started with some initial
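The fixed command pattern described in the question (a field name followed by its value) could be parsed from a recognizer's transcript roughly as follows. This is a minimal sketch in Python rather than ASP.NET, and it assumes the speech engine already returns plain text; `parse_command` and the `FIELDS` set are hypothetical names introduced for illustration.

```python
# Hypothetical field names; the question only gives "data1" as an example.
FIELDS = {"data1", "data2"}


def parse_command(transcript, fields=FIELDS):
    """Split a transcript of the form '<field> <value>' into (field, value).

    Returns None when the first word is not a known field
    or when no value follows the field name.
    """
    words = transcript.strip().split()
    if len(words) < 2:
        return None
    field = words[0].lower()
    if field not in fields:
        return None
    return field, " ".join(words[1:])


print(parse_command("data1 value1"))
```

Returning `None` for anything that does not match lets the page fall back to the normal input fields instead of storing a misheard command.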

Speech to text sdk freezes after video playback

Question: I'm using the speech-to-text SDK provided by https://github.com/todoroo/iPhone-Speech-To-Text The recognizer works fine until I play back a video using MPMoviePlayerController. Here is the code I'm using to call the recognizer: - (IBAction)actionBtRecognition:(id)sender { if (recognizer == nil) { recognizer = [[SpeechToTextModule alloc] init]; } [recognizer beginRecording]; } To play back the movie I used this tutorial. Once I play back a movie and then call the recognizer, it just freezes. When I

React Native Speech to Text

I am making a language app that records any new vocabulary a user is trying to learn. It would be great if users could add their words via a speech-to-text program instead of having to enter them manually. I am having trouble achieving this. I am aware that there is an API for Apple but not for Android. Is there any possible way of doing this using an API, for instance the Google speech-to-text API? I guess I would first have to be able to access the device's microphone. I am a beginner, and this would be very easy on the web. Is React Native still too young for this task? You might

Google Speech Recognition API

I'm trying to use the Google Speech API v2 (at https://www.google.com/speech-api/v2/recognize?... ). I need to use my API key, but when I use it I get error 403 Forbidden. When I use the API key from the example project I downloaded, it works fine. I saw that in the Google Developers Console I can enable many API options, but I didn't find any Speech API option. Is there anything else I need to enable to get access to this API with my key? Thank you! Instructions are here: http://www.chromium.org/developers/how-tos/api-keys Do not forget to activate the API "Speech API"

text to phonemes converter

I'm searching for a tool that converts text to phonemes (like text-to-speech software). I could program one, but it would not be free of errors and would take a lot of time. So my question is: is there a simple tool for converting e.g. "hello" to "HH AH0 L OW1", maybe a command-line tool so I can capture the stdout? I'm looking for phonemes in Arpabet style (see the "hello" example). espeak does something like that, but the output is not in Arpabet style and the phonemes are not separated by a delimiter. If you had searched for Arpabet on wiki you would have found your answer. The CMU guys
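The answer above is cut off, so the following is only a sketch under the assumption that it points to a pronunciation dictionary in the plain-text format used by the CMU Pronouncing Dictionary (a word, whitespace, then space-separated Arpabet phonemes). The function names and the two embedded sample entries are illustrative; a real run would read the full dictionary file instead.

```python
# Sample entries in the assumed dictionary format:
# WORD, whitespace, then space-separated Arpabet phonemes.
SAMPLE_DICT = """\
;;; comment lines start with three semicolons
HELLO  HH AH0 L OW1
WORLD  W ER1 L D
"""


def load_pronunciations(text):
    """Parse dictionary text into {lowercase word: [phonemes]}."""
    entries = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith(";;;"):
            continue  # skip blank lines and comments
        word, *phonemes = line.split()
        # Keep the first pronunciation seen for each word.
        entries.setdefault(word.lower(), phonemes)
    return entries


def to_arpabet(sentence, entries):
    """Return one phoneme list per word, or None for unknown words."""
    return [entries.get(word.lower()) for word in sentence.split()]


entries = load_pronunciations(SAMPLE_DICT)
print(to_arpabet("hello world", entries))
```

A plain dictionary lookup cannot pronounce words that are missing from the file, which is why unknown words come back as `None` rather than a guess.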

Java using Google speech recognition API

I'm trying to use the Google speech recognition API. Here's the code I've written: http://pastebin.com/zJEhnJ74 It works, and I get an answer from the server: {"status":5,"id":"8803471b14a2310dfcf917754e8bd4a7-1","hypotheses":[]} The problem is "status":5. These are the status codes: status 0: correct; status 4: missing audio file; status 5: incorrect audio file. So my problem is "incorrect audio file". I don't understand whether it is a .flac file error (you can download my test .flac file here: http://www21.zippyshare.com/v/61888405/file.html ) or a problem with how I read the file (in a byte array then
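To make the response above easier to act on, the status field can be decoded before the hypotheses are looked at. The sketch below is in Python rather than Java and uses only the three status meanings listed in the question; `describe_response` is a hypothetical helper, and any other status value is reported as unknown rather than guessed.

```python
import json

# Meanings exactly as listed in the question; other codes are not covered there.
STATUS_MEANINGS = {
    0: "correct",
    4: "missing audio file",
    5: "incorrect audio file",
}


def describe_response(body):
    """Return (status, meaning, hypotheses) for a JSON response body."""
    data = json.loads(body)
    status = data.get("status")
    meaning = STATUS_MEANINGS.get(status, "unknown status")
    return status, meaning, data.get("hypotheses", [])


response = '{"status":5,"id":"8803471b14a2310dfcf917754e8bd4a7-1","hypotheses":[]}'
print(describe_response(response))
```

Checking the status first separates a rejected request from a valid request that simply produced no hypotheses.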

Audio analysis to detect human voice, gender, age and emotion — any prior open-source work done?

Is there prior open-source work in the field of audio analysis to detect a human voice (say, in spite of some background noise), determine the speaker's gender, possibly determine the number of speakers, the age of the speaker(s), and the emotion of the speakers? My hunch is that speech recognition software like CMU Sphinx could be a good place to start, but if there's something better, that would be great. I'm a graduate student doing speech recognition research. These are open research problems and, unfortunately, I'm not aware of open-source packages that can do these things out of the box. If you have some

Android Speech Recognition not working

I'm using this example from newboston. It prompts me to record, but after it recognizes what I said, it doesn't update the list view. Here is the code. public class MainActivity extends Activity { private static final int RECOGNIZER_RESULT = 1234; ListView list; @Override public void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); setContentView(R.layout.activity_main); list = (ListView) findViewById(R.id.list); Button btn_speach = (Button)findViewById(R.id.btn_speak); btn_speach.setOnClickListener(new OnClickListener() { public void onClick(View v) { Intent intent

API or SDK to make speech recognition only for numbers (between 1 and 10000)?

I need a specialized solution optimized to detect numbers between 1 and 1000, to be used on a smartphone. The best solution would be an SDK that works offline. Any ideas? I can't find any configuration in Google Speech or Amazon Transcribe that allows "numbers only". It is not quite right to strictly expect numbers from people: they often say things like "I don't know" or "wait a bit" even when you ask them for a number. Restricting the input will harm the experience significantly. You have to analyze the recognition result intelligently, and even if a non-number is recognized you have to act accordingly. To
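The advice above, to accept free-form recognition results and then analyze them, could look roughly like this. It is a minimal sketch in Python, assuming the recognizer returns an English transcript as text; `parse_number` is a hypothetical helper, the default range of 1 to 1000 follows the question body, and digit-by-digit dictation such as "one two three" is not handled.

```python
import re

UNITS = {word: value for value, word in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
TENS = {word: (index + 2) * 10 for index, word in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split())}


def parse_number(transcript, low=1, high=1000):
    """Return the number found in a transcript, or None if there is none in range."""
    text = transcript.lower().replace("-", " ").replace(",", "")
    digits = re.search(r"\d+", text)
    if digits:
        value = int(digits.group())
    else:
        total, current, seen = 0, 0, False
        for word in re.findall(r"[a-z]+", text):
            if word in UNITS:
                current += UNITS[word]
            elif word in TENS:
                current += TENS[word]
            elif word == "hundred":
                current = max(current, 1) * 100
            elif word == "thousand":
                total += max(current, 1) * 1000
                current = 0
            else:
                continue  # ignore words that are not part of a number
            seen = True
        if not seen:
            return None  # e.g. "i don't know" or "wait a bit"
        value = total + current
    return value if low <= value <= high else None


print(parse_number("three hundred and forty-two"))
print(parse_number("i don't know"))
```

When the function returns `None`, the app can re-prompt or offer manual entry instead of forcing the utterance into a number.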