speech-recognition

API or SDK for speech to text (speech recognition) on iPhone

浪尽此生 submitted on 2019-11-29 00:37:07
Hi, I want a speech recognition API or SDK that recognizes the speech spoken by the user and returns its text form. The detailed description is as follows: in my application I need to play an audio file whose text I already have. When the audio starts playing, the word being spoken (from the audio file) should be highlighted. So if I can get the word from an API or SDK, it is possible to highlight it. Apart from that, I googled a lot for an API and came across ceedvocalsdk, but it's not available as a free trial. If someone can provide any idea other than this suiting to my
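The highlighting step described above can be sketched independently of any particular recognizer. This is a minimal Python sketch, assuming per-word start times are already available from some source that reports word timings. The `WORD_TIMINGS` data and the helpers `word_index_at` and `highlight` are hypothetical names for illustration, not part of any SDK mentioned in the question.

```python
from bisect import bisect_right

# Hypothetical timing data: (start time in seconds, word), sorted by start time.
WORD_TIMINGS = [(0.0, "When"), (0.4, "audio"), (0.9, "starts"), (1.3, "playing")]


def word_index_at(timings, position):
    """Return the index of the word being spoken at `position` seconds,
    or None if playback has not reached the first word yet."""
    starts = [start for start, _ in timings]
    # bisect_right counts the words whose start time is <= position.
    index = bisect_right(starts, position) - 1
    return index if index >= 0 else None


def highlight(timings, position):
    """Render the transcript with the current word wrapped in brackets."""
    current = word_index_at(timings, position)
    return " ".join(
        f"[{word}]" if i == current else word
        for i, (_, word) in enumerate(timings)
    )


if __name__ == "__main__":
    print(highlight(WORD_TIMINGS, 1.0))
```

In an app, the lookup would be called from a playback timer with the player's current position. The binary search keeps each lookup cheap even for long transcripts.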

Capturing audio sent to Google's speech recognition server

你离开我真会死。 submitted on 2019-11-29 00:30:26
To recognize speech with the Google server, I use the SpeechRecognizer class in combination with RecognitionListener, as suggested in Stephan's answer to this question. In addition, I try to capture the audio signal being recognized by using the onBufferReceived() callback from RecognitionListener, like: byte[] sig = new byte[500000]; int sigPos = 0; ... public void onBufferReceived(byte[] buffer) { System.arraycopy(buffer, 0, sig, sigPos, buffer.length); sigPos += buffer.length; } ... This seems to work fine, except when SpeechRecognizer fails to connect to the Google server, when a chunk of audio is

Python pocketsphinx RequestError: missing PocketSphinx module: ensure that PocketSphinx is set up correctly

折月煮酒 submitted on 2019-11-28 23:43:48
I am trying to make a Python app that can record audio and translate it into English text using PyAudio, SpeechRecognition and PocketSphinx. I'm running Mac OS X El Capitan, version 10.11.2. Following a tutorial like this one and others, I've downloaded PyAudio version 0.2.9, SpeechRecognition, and PocketSphinx. I've installed them into a Conda environment. I have followed the instructions from this site to run brew install swig git python on my OS X, hoping it would help. This is my code: # Load packages import speech_recognition as sr import sphinxbase import pocketsphinx # obtain
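An error like the one in the title usually suggests that the `pocketsphinx` import failed in the interpreter actually running the script. That is easy to hit when packages are spread across Conda and Homebrew Pythons. This is a minimal diagnostic sketch, assuming the import names below match what the question's script uses. The `missing_modules` helper is a hypothetical name, not part of SpeechRecognition.

```python
import importlib.util
import sys

# Import names (not pip package names) that the question's script relies on.
REQUIRED_MODULES = ["speech_recognition", "pyaudio", "sphinxbase", "pocketsphinx"]


def missing_modules(names):
    """Return the top-level module names the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Show which Python is running, since Conda and Homebrew may each install one.
    print("Interpreter:", sys.executable)
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Not importable from this interpreter:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Running this inside the activated Conda environment shows whether the modules are visible to that environment's interpreter or were installed elsewhere.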

Server-side Voice Recognition [closed]

安稳与你 submitted on 2019-11-28 21:40:00
Question: Does anyone know of any good server-side voice recognition engines that are already hosted? I.e., I want to be able to call a simple web API, posting some sound data, and get text back. It doesn't have to be free, but hopefully it is free to experiment with. Answer 1: There are several IVR services which host an entire VOIP session (telephone call) as a complete application, rather than offering individual service transactions "à la carte". If you were to make your program look like a VOIP call, you might be able to

Mac OS X speech to text API. Howto?

我怕爱的太早我们不能终老 submitted on 2019-11-28 19:47:53
I have a program that receives a mono audio stream over TCP/IP. I am wondering whether the speech recognition API in Mac OS X would be able to do a speech-to-text transform for me. (I don't mind saving the audio into a .wav file first and reading it, as opposed to doing the transform on the fly.) I have read the official docs online, and they are a bit confusing. I couldn't find any good example on this topic. Also, should I do it in Cocoa/Carbon/Java or Objective-C? Can someone please shed some light? Thanks. There are a number of examples that get copied under /Developer/Examples/Speech

Convert audio to text

做~自己de王妃 submitted on 2019-11-28 18:27:26
I just want to know if there are any built-in or external libraries in Java or C# that allow me to take an audio file, parse it, and extract the text from it. I need to make an application to do so, but I don't know where to start. Here are some of your options: Microsoft Speech, LumenVox, Dragon NaturallySpeaking, Sphinx4. Here is a complete example using C# and System.Speech. The code can be divided into 2 main parts: configuring the SpeechRecognitionEngine object (and its required elements), and handling the SpeechRecognized and SpeechHypothesized events. Step 1: Configuring the

iPhone: Is speech recognition available in the iOS SDK?

帅比萌擦擦* submitted on 2019-11-28 17:31:23
Question: Does anyone know whether the "speech to text" and "text to speech" APIs used in Siri are accessible in the iOS 5 or iOS 6 SDK? I researched but couldn't find anything about it in the documentation, so if that's not included in the SDK, are there any "Siri"-quality libraries on the market? Answer 1: Siri is not available in API form yet; however, any UITextField or UITextView can be dictated to using the built-in option for speech-to-text. Answer 2: Check out OpenEars at: http://www.politepix.com/openears I've used it

Is there an API for Google's speech recognition technology? [closed]

被刻印的时光 ゝ submitted on 2019-11-28 17:17:08
I want to try creating a jQuery slideshow using simple voice commands like "next" or "previous". Is there a way to use Google's voice recognition? I know about Chrome's x-webkit-speech, but I have to click a button to use it. I tried MIT's WAMI, but I found it slower and less accurate than Google's speech recognition. As of today this now exists for Chrome: http://chrome.blogspot.co.uk/2013/01/hello-browser.html (API doc). For an easy way to do this with JavaScript, check out annyang, which is a library that makes dealing with speech recognition super easy. The issue is what will capture your

Speech Recognition API

萝らか妹 submitted on 2019-11-28 17:06:59
I need to automatically transcribe some short MP3s as part of a proof of concept I am working on. I am currently looking into cloud solutions or web API services to send the MP3 as a simple HTTP request and receive a transcription back. The only free/open source solution I have found is here, but the demos don't seem to work (at least not on the files I need to transcribe). I have found some enterprise solutions for call centers, but so far nothing I can simply integrate into a project. Are there any web-based speech recognition services available? One that is able to filter out small noise

example of AlwaysOnHotwordDetector in Android

为君一笑 submitted on 2019-11-28 16:23:36
Can someone provide an example of how to use the new AlwaysOnHotwordDetector class in Android? I'd like to build an app that, when running in the background, can detect a hotword like "next", "back", or "pause". Unless I have a huge blind spot, I don't think third-party applications can make use of this API. It's strange that AlwaysOnHotwordDetector (and related classes such as VoiceInteractionService) have been granted public access. If you are building a privileged app, look through these test projects from AOSP: Voice Interaction - Basic AlwaysOnHotwordDetector usage