I am looking to create an app that has speech-to-text.
I am aware of this kind of ability using the RecognizerIntent: http://android-developers.blogspot.com/search/
What is built into Android (and launched via the intent) is a client activity that captures your voice and sends the audio to a Google server for recognition. You could build something similar: host Sphinx yourself (or use a cloud recognition service such as Yapme.com), capture the audio in your own app, send it to the recognizer, and get the text results back. I don't know of a way to use Google's recognition service without going through the Intent on Android (or through Chrome).
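If you go the do-it-yourself route, one small piece of the "send the audio to a recognizer" step is packaging the raw samples you captured into a format the server understands. Below is a minimal sketch that wraps 16-bit little-endian PCM in a standard 44-byte WAV header. It assumes your recognizer accepts WAV uploads, which you would need to check for whichever service you pick. The class name `WavPackager` and the method `toWav` are made up for illustration. On Android the PCM bytes would typically come from `AudioRecord`, but that part is left out here so the code runs on any JVM.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: not part of Android or of any recognizer SDK.
final class WavPackager {

    /**
     * Wraps raw 16-bit little-endian PCM samples in a canonical 44-byte
     * RIFF/WAVE header. Assumes uncompressed PCM (audio format 1).
     */
    static byte[] toWav(byte[] pcm, int sampleRate, int channels) {
        final int bitsPerSample = 16;
        final int blockAlign = channels * bitsPerSample / 8; // bytes per sample frame
        final int byteRate = sampleRate * blockAlign;        // bytes per second

        ByteBuffer buf = ByteBuffer.allocate(44 + pcm.length)
                                   .order(ByteOrder.LITTLE_ENDIAN);

        // RIFF chunk descriptor
        buf.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(36 + pcm.length);                         // file size minus 8
        buf.put("WAVE".getBytes(StandardCharsets.US_ASCII));

        // "fmt " sub-chunk
        buf.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(16);                                      // sub-chunk size for PCM
        buf.putShort((short) 1);                             // 1 = uncompressed PCM
        buf.putShort((short) channels);
        buf.putInt(sampleRate);
        buf.putInt(byteRate);
        buf.putShort((short) blockAlign);
        buf.putShort((short) bitsPerSample);

        // "data" sub-chunk followed by the samples themselves
        buf.put("data".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(pcm.length);
        buf.put(pcm);

        return buf.array();
    }
}
```

You would call something like `WavPackager.toWav(pcmBytes, 16000, 1)` for 16 kHz mono audio and then upload the resulting byte array to your recognizer over HTTP. The sample rate and channel count are placeholders; use whatever your recognizer expects.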
The general consensus I've seen so far is that today's smartphones don't really have the horsepower to do Sphinx-like speech recognition on the device. You may want to explore running a recognizer on the client yourself, but note that Google does its recognition on the server.
For some related info see: