sapi | 易学教程

Training sapi : Creating transcripted wav files and adding file paths to registry

Question: We are trying to do acoustic training, but we are unable to create the transcribed audio files. How do we create them? We are also using GetTranscript and AppendTranscript, but we cannot get the ISpTranscript interface for the ISpStream if we open the stream in READWRITE mode, so how do we create the transcript wav files? hr = SPBindToFile(L"e:\\file1.wav", SPFM_OPEN_READONLY, &cpStream); hr = cpStream.QueryInterface(&cpTranscript); // We get an E_NOINTERFACE error here if SPFM_OPEN

Microsoft speech API 5.1, 5.3?

I'm a little confused about the different SAPI versions available. First of all, I can only find the SDK for developing with version 5.1. Is there an SDK available for version 5.3, and if not, why? Which version can I use if I'm developing with version 3.5 of the .NET Framework? Is there a good tutorial? The only ones I found are pretty old (they use the 2003 version of Visual Studio): http://msdn.microsoft.com/en-us/library/ms986944.aspx Is there any way I can use the speech API directly in an ASP.NET web site in speech-to-text mode? Thanks! Wikipedia tells me that SAPI 5.3 was included in

How to fix compiler errors in SAPI 5.1 Header Files

I get a lot of errors from the header files provided with SAPI 5.1 and cannot figure out how to fix them. The following is a simple text-to-speech program from Microsoft’s How-to video presentation. The presenter said that if you have installed the most up-to-date packages, you will have no problem compiling this program. But he is using Visual Studio 2005; apparently “most up-to-date” refers to a few years ago, when the presentation was given. I think these errors are caused by a version mismatch. I am using Windows XP SP3. I have Visual Studio 2008 SP1, Visual Studio 2008 SDK 1.1, Windows SDK v6.0A (comes with

C# SAPI 5.4 Languages?

Question: I've made a simple program that recognizes speech using SAPI 5.4. I wanted to ask whether I can add more languages to the TTS and the ASR. Thanks. Here is the code I wrote, in case anybody needs to take a look at it: using System; using System.Collections.Generic; using System.ComponentModel; using System.Data; using System.Drawing; using System.Linq; using System.Text; using System.Windows.Forms; using SpeechLib; using System.Globalization; using System.Speech.Recognition; namespace

Convert audio (wav file) to text using SAPI?

Question: My task is to convert an audio file (not direct speech from a human) into text. For example, if I have "Hello there" stored in a wav file, it should transcribe it into text and show the string "Hello there" on screen. Code in any language is welcome, but C# is preferred. Answer 1: SAPI can certainly do what you want. Start with an in-proc recognizer, connect up your audio as a file stream, set dictation mode, and off you go. Now the disappointing bit. You probably won't get terribly good results; in fact, I suspect

Understanding PHP Internals: The PHP Lifecycle

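1. PHP's run modes: PHP has two run modes, web mode and CLI mode. In either mode PHP works the same way: it runs as a SAPI. (1) When we type the php command in a terminal, it uses the CLI. It acts like a web server to support PHP in completing the request, and once the request is complete it hands control back to the terminal. (2) When Apache or another web server is the host, PHP handles each request as it arrives. Typically there are two variants: multi-process mode (PHP is usually compiled as an Apache module to handle PHP requests) and multi-threaded mode. 2. Where it all begins: the SAPI interface. We usually write PHP web programs and test the scripts through a web server such as Apache or Nginx, or run PHP scripts from the command line with the php program. When the script finishes, the server responds and the browser displays the response, or, on the command line, the output appears on standard output. We rarely care where the PHP interpreter is. Although running a script through a web server and through the command-line program looks very different, the work done is actually the same. The command-line program is similar to the web program: the command-line arguments passed to the script are equivalent to requesting a PHP page through a URL. After the script has been processed, the response is returned; the only difference is that the command-line response is displayed on the terminal. Script execution always begins through the SAPI interface. 1) Starting Apache: when a given SAPI starts, for example in response to /usr/local/apache/bin/apachectl start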

Microsoft Speech Recognition Custom Training

Question: I have been wanting to create an application using Microsoft Speech Recognition. My application's users are expected to often say abbreviations, such as 'LHC' for 'Large Hadron Collider', or 'CERN'. Given those two in that exact order, my application will return You said: At age C. You said: Cern While it did work for 'CERN', it failed very badly for 'LHC'. However, if I could make my own custom training files, I could easily place the term 'LHC' somewhere in there. Then, I could make the user

Speech training files and registry locations

Question: I have a speech project that requires acoustic training to be done in code. I am able to create training files with transcripts and their associated registry entries under Windows 7 using SAPI. However, I am unable to determine whether the recognition engine is successfully using these files and adapting its model. My questions are as follows: When performing training through the Control Panel training UI, the system stores the training files in "{AppData}\Local\Microsoft\Speech\Files

Free-form text with custom SRGS based Grammar

I am trying to develop a voice-based application that accepts user input as speech and performs actions based on that input. This is my first venture into this technology and I am learning while developing it. I am using the Microsoft SAPI shipped with .NET 4 to recognize speech. So far, I have learned about the two modes it supports. Speech recognition (SR) has two modes of operation: Dictation mode: an unconstrained, free-form speech interpretation mode that uses a built-in grammar provided by the recognizer for a specific language. This is the default recognizer.
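
For contrast with dictation mode, a constrained grammar of the kind the title refers to can be written as SRGS XML. The following is a minimal sketch in Python that only builds such a grammar document. The helper name `build_srgs_grammar`, the rule id `command`, and the example phrases are hypothetical and do not come from the original post, and loading the result into a recognizer is not shown.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C SRGS 1.0 XML grammar format, and the built-in XML namespace
SRGS_NS = "http://www.w3.org/2001/06/grammar"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_srgs_grammar(phrases, rule_id="command", lang="en-US"):
    """Return SRGS 1.0 XML text with one public rule that accepts any of `phrases`.

    The function name, rule id, and defaults are illustrative assumptions.
    """
    # Serialize SRGS elements with a default namespace instead of an ns0: prefix
    ET.register_namespace("", SRGS_NS)
    grammar = ET.Element(
        f"{{{SRGS_NS}}}grammar",
        {"version": "1.0", f"{{{XML_NS}}}lang": lang, "root": rule_id},
    )
    rule = ET.SubElement(
        grammar, f"{{{SRGS_NS}}}rule", {"id": rule_id, "scope": "public"}
    )
    # <one-of> holds the alternatives; each <item> is one accepted phrase
    one_of = ET.SubElement(rule, f"{{{SRGS_NS}}}one-of")
    for phrase in phrases:
        ET.SubElement(one_of, f"{{{SRGS_NS}}}item").text = phrase
    return ET.tostring(grammar, encoding="unicode")


if __name__ == "__main__":
    print(build_srgs_grammar(["open file", "close file"]))
```

Unlike dictation, a grammar like this restricts recognition to the listed phrases, which is the usual trade-off between the two modes.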

Can C# SAPI speak SSML string?

I implemented TTS in my C# WPF project. Previously, I used the TTS in the System.Speech.Synthesis namespace to speak. The spoken content is in SSML format (Speech Synthesis Markup Language, which supports customizing the speaking rate, voice, and emphasis), like the following: <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US"><prosody rate="x-fast">hello world. This is a long sentence speaking very fast!</prosody></speak> But unfortunately the System.Speech.Synthesis TTS has a memory leak problem, as I mentioned in the question Memory leak in .Net Speech.Synthesizer? . So I decide