Here is an idea:
We have web applications that expose RESTful APIs which accept JSON. Now how about using the Google Speech APIs to take user voice input, convert it to tex
According to the Google Speech API documentation, the result is already returned as JSON:
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.98267895
        }
      ]
    }
  ]
}
All you would have to do is call JSON.parse on the response body and then pick the fields you want out of the resulting object (for example, the transcript and its confidence) to build your own JSON format.
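As a rough sketch of that step in JavaScript, assuming the response body arrives as a string shaped like the sample above: the helper name toCommand and the output fields query and confidence are made up for illustration, so substitute whatever your own REST API expects.

```javascript
// Sample response text, copied from the example response above.
const responseText = `{
  "results": [
    {
      "alternatives": [
        { "transcript": "how old is the Brooklyn Bridge", "confidence": 0.98267895 }
      ]
    }
  ]
}`;

// Hypothetical helper: parse the response and keep the highest-confidence
// alternative of the first result. Returns null if there is no alternative.
function toCommand(text) {
  const response = JSON.parse(text);
  const alternatives = response.results?.[0]?.alternatives ?? [];
  if (alternatives.length === 0) {
    return null;
  }
  const best = alternatives.reduce((a, b) =>
    (b.confidence ?? 0) > (a.confidence ?? 0) ? b : a
  );
  // "query" and "confidence" are assumed field names for your own format.
  return { query: best.transcript, confidence: best.confidence };
}

const command = toCommand(responseText);
console.log(JSON.stringify(command));
// prints {"query":"how old is the Brooklyn Bridge","confidence":0.98267895}
```

The resulting object can then be serialized with JSON.stringify and sent as the body of a request to your own API. Checking for an empty alternatives list matters because a response with no recognized speech may not contain a usable transcript.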
I would suggest reading through the Google Speech documentation.