Speech recognition in C or Java or PHP? [closed]

Posted by 与世无争的帅哥 on 2019-11-29 19:54:32
Michael Levy

From watching these questions for a few months, I've seen most developer choices break down like this:

Windows folks - use the System.Speech features of .NET, or Microsoft.Speech with the free recognizers Microsoft provides. Windows 7 includes a full speech engine; others are downloadable for free. There is also a C++ API to the same engines, known as SAPI. See http://msdn.microsoft.com/en-us/magazine/cc163663.aspx or http://msdn.microsoft.com/en-us/library/ms723627(v=vs.85).aspx. For more background on the Microsoft engines for Windows, see the question "What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition?"

Linux folks - Sphinx seems to have a good following. See http://cmusphinx.sourceforge.net/ and http://cmusphinx.sourceforge.net/wiki/

Commercial products - Nuance, Loquendo, AT&T, others

Online service - Nuance, Yapme, others

Of course this may also be helpful - http://en.wikipedia.org/wiki/List_of_speech_recognition_software

There is a Java Speech API. See javax.speech.recognition in the Java Speech API guide: http://java.sun.com/products/java-media/speech/forDevelopers/jsapi-guide/Recognition.html. I believe you still have to find a speech engine that supports this API. I don't think Sphinx fully supports it - see http://cmusphinx.sourceforge.net/sphinx4/doc/Sphinx4-faq.html#support_jsapi

There are lots of other SO questions on this, for example "Need text to speech and speech recognition tools for Linux".

Hmm. An interesting topic. I haven't done any work around this sort of thing in ages, though I did spend quite a bit of time playing with some (fairly basic) speech recognition software on the Amiga many years ago. It's good fun, but not nearly as easy as your pseudo-code example makes it sound.

You're going to need a third-party API library for this. (I guess it's possible to write your own, but I don't think you're at the point where that's a feasible idea.)

There are a number of API libraries available; Google turned up several -- here's one of the results I got: http://en.wikipedia.org/wiki/Microsoft_Speech_API -- but you'll probably need to try a few till you get one which meets your needs.

The chances are it's going to be a commercial API -- i.e. you'll have to pay for it. There may be some open source ones (I didn't see any in my cursory Googling, but I'm sure they exist), but they're likely to be a lot harder to use.

Once you've got a library that you're happy with, and you've written your code to interface with it, your work isn't done, because speech recognition is a notoriously tricky thing to work with.

Different accents are just the start of the problem. The gender of the speaker and the speed at which they speak also affect the ability to recognise what has been said. Humans are far better at recognising speech than computers, but even we struggle with some unfamiliar accents.

Speech recognition software typically needs to be trained to recognise specific words and phrases. You certainly wouldn't try to match against a string as in your example; you'd ask it to spot a specific one of the phrases it had been trained to recognise.
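
To make that last point a bit more concrete, here is a rough Java sketch of the application side of phrase spotting. It does not call any real speech engine: the Hypothesis class is a hypothetical stand-in for whatever your chosen library reports (a best-guess text plus a confidence score), and PhraseSpotter, addPhrase and spot are names I made up for illustration. The idea is only that you register a small set of known phrases up front, reject low-confidence results, and dispatch on which phrase was spotted rather than comparing raw strings.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Sketch only: maps recognizer output onto a fixed list of known phrases. */
class PhraseSpotter {

    /** Hypothetical engine result: best-guess text and a confidence in [0, 1]. */
    static final class Hypothesis {
        final String text;
        final double confidence;

        Hypothesis(String text, double confidence) {
            this.text = text;
            this.confidence = confidence;
        }
    }

    // Normalized phrase -> command name chosen by the application.
    private final Map<String, String> phraseToCommand = new LinkedHashMap<>();
    private final double minConfidence;

    PhraseSpotter(double minConfidence) {
        this.minConfidence = minConfidence;
    }

    /** Registers one phrase the application is prepared to act on. */
    void addPhrase(String phrase, String command) {
        phraseToCommand.put(normalize(phrase), command);
    }

    /** Returns the command for a spotted phrase, or empty if unsure or unknown. */
    Optional<String> spot(Hypothesis h) {
        if (h.confidence < minConfidence) {
            return Optional.empty(); // too uncertain: better to ask the user to repeat
        }
        return Optional.ofNullable(phraseToCommand.get(normalize(h.text)));
    }

    /** Lower-cases, replaces punctuation with spaces, and collapses whitespace. */
    static String normalize(String s) {
        return s.toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9 ]", " ")
                .trim()
                .replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        PhraseSpotter spotter = new PhraseSpotter(0.6); // threshold is an arbitrary assumption
        spotter.addPhrase("lights on", "LIGHTS_ON");
        spotter.addPhrase("lights off", "LIGHTS_OFF");

        // Simulated engine results; a real engine would produce these from audio.
        System.out.println(spotter.spot(new Hypothesis("Lights on!", 0.91)));
        System.out.println(spotter.spot(new Hypothesis("lights on", 0.30)));
        System.out.println(spotter.spot(new Hypothesis("open the pod bay doors", 0.95)));
    }
}
```

With a real engine you would normally hand the phrase list to the engine itself as a grammar, so it only has to choose between those phrases; the lookup above just imitates that behaviour on the application side, and the 0.6 threshold is something you would have to tune for your engine and speakers.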

In short, it's a very big field, which you're clearly only just dipping your toe into. I hope it goes well for you, but I see a lot of research time in your immediate future!

Here are some other links which may help you:

Try my C library, libsprec, which is built around Google's speech recognition engine:

http://github.com/H2CO3/libsprec

HTK is one of the more popular frameworks for C.

http://htk.eng.cam.ac.uk/

It is not easily used, but definitely is powerful.

The J.A.R.V.I.S. Java Speech API is very robust and functional and a great minimalist alternative to Sphinx.

https://github.com/The-Shadow/java-speech-api
