WasapiLoopbackCapture internal audio recognition gives gibberish and text when no audio

面向向阳花 2020-12-21 00:56

I have finally built a program that listens to the internal audio loopback using NAudio and outputs the recognized text. The problem is that it listens and always says, e.g.:

1 answer
  • 2020-12-21 01:34

    That SpeechStreamer class has some problems, and even after trying I cannot really see what its purpose is. Also, judging by wave file dumps from your implementation, the audio is really choppy, with long pauses between the samples. This may be what is throwing the speech recognizer off. This is an example: "Windows Volume Adjustment Sound From Your Code".

    As you may hear, it is pretty choppy, with a lot of silence in between. The voice recognition part recognizes this as: "ta ta ta ta ta ta..."

    I had to rewrite your code a bit to dump a wave file, since the Read method of your SpeechStreamer causes an infinite loop when it is used to read the stream's contents.

    To dump a wave file you could do the following:

    var buffer = new byte[2048];
    using (var writer = new WaveFileWriter("tmp.wav", ieeeToPcm.WaveFormat))
    {
        // buffStream is changed to a MemoryStream for this to work.
        buffStream.Seek(0, SeekOrigin.Begin);

        int bytesRead;
        while ((bytesRead = buffStream.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Write only the bytes actually read; the last chunk is usually shorter.
            writer.Write(buffer, 0, bytesRead);
        }
    }
    

    Or you can do it when you read from your SampleToWaveProvider16:

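    // The using block finalizes the WAV header, even when the loop exits early.
    using (var writer = new WaveFileWriter("dump.wav", ieeeToPcm.WaveFormat))
    {
        int bytesRead;
        while ((bytesRead = ieeeToPcm.Read(arr, 0, arr.Length)) > 0)
        {
            if (Console.KeyAvailable && Console.ReadKey().Key == ConsoleKey.Escape)
                break;
            // Pass on only the bytes that were actually read.
            buffStream.Write(arr, 0, bytesRead);
            writer.Write(arr, 0, bytesRead);
        }
    }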
    

    I just added the ability to hit Escape to exit the loop.
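
    If you want a reference recording to compare against, one option is to write the raw loopback data straight to a file with no conversion in between. The following is only a minimal sketch and is not taken from your code: the file name reference.wav and the class name are hypothetical, and it assumes a console application. If this file sounds fine while dump.wav is choppy, the gaps are introduced somewhere after the capture stage.

    ```csharp
    using System;
    using System.Threading;
    using NAudio.Wave;

    class LoopbackReferenceDump
    {
        static void Main()
        {
            var stopped = new ManualResetEvent(false);

            using (var capture = new WasapiLoopbackCapture())
            {
                // reference.wav is a hypothetical file name; the format is whatever the device mixes in.
                var writer = new WaveFileWriter("reference.wav", capture.WaveFormat);

                // Write exactly BytesRecorded bytes; the buffer itself may be larger.
                capture.DataAvailable += (s, e) => writer.Write(e.Buffer, 0, e.BytesRecorded);

                // Close the file only after the capture has really stopped.
                capture.RecordingStopped += (s, e) =>
                {
                    writer.Dispose();
                    stopped.Set();
                };

                capture.StartRecording();
                Console.WriteLine("Recording, press any key to stop.");
                Console.ReadKey(true);

                capture.StopRecording();
                stopped.WaitOne();
            }
        }
    }
    ```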

    Now I do wonder why you are using NAudio at all. Why not use the methods native to the System.Speech API?

    using System;
    using System.Globalization;
    using System.Speech.Recognition;
    using System.Threading;

    class Program
    {
        private static ManualResetEvent _done;
        static void Main(string[] args)
        {
            _done = new ManualResetEvent(false);
    
            using (SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US")))
            {
                recognizer.LoadGrammar(new DictationGrammar());
                recognizer.SpeechRecognized += RecognizedSpeech;
                recognizer.SetInputToDefaultAudioDevice();
                recognizer.RecognizeAsync(RecognizeMode.Multiple);
                _done.WaitOne();
            }
        }
    
        private static void RecognizedSpeech(object sender, SpeechRecognizedEventArgs e)
        {
            if (e.Result.Text.Contains("exit"))
            {
                _done.Set();
            }
    
            Console.WriteLine(e.Result.Text);
        }
    }
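
    To check whether the choppy audio is really what confuses the recognizer, the same engine can be pointed at the dumped file instead of a live device. This is only a sketch: it assumes that dump.wav from the loop above exists and is in a format the recognizer accepts, and the class name is hypothetical.

    ```csharp
    using System;
    using System.Globalization;
    using System.Speech.Recognition;

    class WaveFileCheck
    {
        static void Main()
        {
            using (var recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US")))
            {
                recognizer.LoadGrammar(new DictationGrammar());

                // Read from the dumped capture instead of a live device.
                recognizer.SetInputToWaveFile("dump.wav");

                // Recognize() returns null once nothing more can be recognized,
                // for example at the end of the file or after a silence timeout.
                RecognitionResult result;
                while ((result = recognizer.Recognize()) != null)
                {
                    Console.WriteLine(result.Text);
                }
            }
        }
    }
    ```

    If the output for the file is the same "ta ta ta" pattern, the problem lies in the captured audio rather than in how the stream is handed to the recognizer.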
    