Prototype based on speech recognition

这一生的挚爱 submitted on 2019-11-27 08:40:55

Question


I want to create a prototype that's based on automatic speech recognition in order to deal with reports.

The requirements aren't fixed yet, but to begin with I'll get some dummy data sets. At first I'll concentrate on the input of acoustic signals and their further processing.

I don't really know how to start: which development environment, which programming language, ...

I would prefer to work with Visual Studio because I already have a license, but I'm open to suggestions.

Do you have any tutorials, ideas, or experience to share?


Answer 1:


(I am reusing an email I sent to a friend recently. I hope it is helpful.)

Microsoft has two flavors of speech engines: Desktop and Server. The desktop speech engine has shipped with various products including: MS Office 2003, Windows Vista, and Windows 7. The server speech engine has shipped with Office Communications Server (OCS) and the Unified Communications Managed API (UCMA).

The desktop speech engine usually ships with a dictation grammar. It is optimized for desktop use and can be shared across multiple processes. This lets you use a single instance of the desktop recognizer to issue voice commands to both Excel and Word. The desktop recognizer can be programmed via the COM SAPI API or the .NET System.Speech namespace.

The server speech engine does not ship with any grammar. It is optimized for server use and, I believe, for telephony as well. It is designed for high-volume scenarios. The server speech engine can be programmed via the COM SAPI API or the .NET Microsoft.Speech namespace.

The server speech engine is packaged into a new free redistributable package called “The Microsoft Server Speech Platform”. I assume that the next version of OCS (product named Lync - http://www.microsoft.com/en-us/lync/default.aspx) will also include the same Microsoft Server Speech Platform.

The Microsoft Server Speech Platform is a free download in three pieces: SDK, Runtime, and languages. There are 26 languages available. See http://blogs.msdn.com/b/speak/archive/2010/03/30/microsoft-server-speech-platform-10-1-released-sr-and-tts-in-26-languages.aspx for some background. Since that blog post, Microsoft has quietly released an updated 10.2 version of the platform. The three pieces are available for download at:

SDK: http://www.microsoft.com/downloads/en/details.aspx?FamilyID=1b1604d3-4f66-4241-9a21-90a294a5c9a4&displaylang=en

Runtime: http://www.microsoft.com/downloads/en/details.aspx?FamilyID=bb0f72cb-b86b-46d1-bf06-665895a313c7&displaylang=en

Languages: http://www.microsoft.com/downloads/en/details.aspx?FamilyID=47ffd4e5-e682-4228-8058-dd895252a3c3&displaylang=en
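
Whichever engine you pick, the dummy data sets mentioned in the question can start out as plain WAV files. Here is a minimal sketch in Python (standard library only) that writes a short test tone and reads its header back. The helper names `write_dummy_wav` and `read_wav_info` are made up for illustration, and the 16 kHz, 16-bit mono format is an assumption on my part, not a documented requirement of either engine. A sine tone is not speech, so it only exercises the audio input path; actual recognition tests need real recordings.

```python
import math
import struct
import wave


def write_dummy_wav(path, seconds=1.0, freq_hz=440.0, rate=16000):
    """Write a mono 16-bit PCM sine tone to `path` and return the frame count.

    The sample rate and bit depth are assumed defaults for illustration.
    """
    n_frames = int(seconds * rate)
    frames = bytearray()
    for i in range(n_frames):
        # Half of full scale, so the tone stays well inside the 16-bit range.
        sample = int(0.5 * 32767 * math.sin(2 * math.pi * freq_hz * i / rate))
        frames += struct.pack("<h", sample)  # little-endian signed 16-bit
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)      # mono
        wf.setsampwidth(2)      # 2 bytes = 16 bits per sample
        wf.setframerate(rate)
        wf.writeframes(bytes(frames))
    return n_frames


def read_wav_info(path):
    """Return (channels, sample width in bytes, sample rate, frame count)."""
    with wave.open(path, "rb") as wf:
        return (wf.getnchannels(), wf.getsampwidth(),
                wf.getframerate(), wf.getnframes())


if __name__ == "__main__":
    write_dummy_wav("dummy_tone.wav", seconds=0.5)
    print(read_wav_info("dummy_tone.wav"))
```

Checking the header this way before handing a file to a recognizer makes it easier to tell format problems apart from recognition problems.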



Source: https://stackoverflow.com/questions/3865351/prototype-based-on-speech-recognition
