How to end Google Speech-to-Text streamingRecognize gracefully and get back the pending text results?

心在旅途 2020-12-20 20:58

I'd like to be able to end a Google speech-to-text stream (created with streamingRecognize), and get back the pending SR (speech recognition) results.

In a nutshell,

3 Answers
  •  暖寄归人
    2020-12-20 21:37

    My bad. Unsurprisingly, this turned out to be an obscure race condition in my code.

    I've put together a self-contained sample that works as expected (gist). It helped me track down the issue. Hopefully, it will help others and my future self:

    // A simple streamingRecognize workflow,
    // tested with Node v15.0.1, by @noseratio
    
    import fs from 'fs';
    import path from "path";
    import url from 'url'; 
    import util from "util";
    import timers from 'timers/promises';
    import speech from '@google-cloud/speech';
    
    export {}
    
    // requires a 16-bit, 16 kHz raw PCM audio file
    const filename = path.join(path.dirname(url.fileURLToPath(import.meta.url)), "sample.raw");
    const encoding = 'LINEAR16';
    const sampleRateHertz = 16000;
    const languageCode = 'en-US';
    
    const request = {
      config: {
        encoding: encoding,
        sampleRateHertz: sampleRateHertz,
        languageCode: languageCode,
      },
      interimResults: false // If you want interim results, set this to true
    };
    
    // init SpeechClient
    const client = new speech.v1p1beta1.SpeechClient();
    await client.initialize();
    
    // Stream the audio to the Google Cloud Speech API
    const stream = client.streamingRecognize(request);
    
    // log all data
    stream.on('data', data => {
      const result = data.results[0];
      console.log(`SR results, final: ${result.isFinal}, text: ${result.alternatives[0].transcript}`);
    });
    
    // log all errors
    stream.on('error', error => {
      console.warn(`SR error: ${error.message}`);
    });
    
    // observe data event
    const dataPromise = new Promise(resolve => stream.once('data', resolve));
    
    // observe error event
    const errorPromise = new Promise((resolve, reject) => stream.once('error', reject));
    
    // observe finish event
    const finishPromise = new Promise(resolve => stream.once('finish', resolve));
    
    // observe close event
    const closePromise = new Promise(resolve => stream.once('close', resolve));
    
    // we could just pipe it: 
    // fs.createReadStream(filename).pipe(stream);
    // but we want to simulate the web socket data
    
    // read RAW audio as Buffer
    const data = await fs.promises.readFile(filename, null);
    
    // simulate multiple audio chunks
    console.log("Writting...");
    const chunkSize = 4096;
    for (let i = 0; i < data.length; i += chunkSize) {
      stream.write(data.slice(i, i + chunkSize));
      await timers.setTimeout(50);
    }
    console.log("Done writing.");
    
    console.log("Before ending...");
    await util.promisify(c => stream.end(c))();
    console.log("After ending.");
    
    // race for events
    await Promise.race([
      errorPromise.catch(() => console.log("error")), 
      dataPromise.then(() => console.log("data")),
      closePromise.then(() => console.log("close")),
      finishPromise.then(() => console.log("finish"))
    ]);
    
    console.log("Destroying...");
    stream.destroy();
    console.log("Final timeout...");
    await timers.setTimeout(1000);
    console.log("Exiting.");
    

    The output:

    Writting...
    Done writing.
    Before ending...
    SR results, final: true, text:  this is a test I'm testing voice recognition This Is the End
    After ending.
    data
    finish
    Destroying...
    Final timeout...
    close
    Exiting.
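
In the log above, the final transcript arrives before `stream.end()` completes. Below is a minimal sketch of how that end-then-collect step might be wrapped in a helper. It assumes the stream returned by `streamingRecognize` behaves like a standard Node.js Duplex and emits `'end'` once no more results are coming. The output above does not confirm that assumption, hence the timeout fallback. `endAndCollect` and `graceMs` are hypothetical names, and a local `Transform` stands in for the real stream so the sketch runs without credentials.

```javascript
import { once } from 'events';
import { Transform } from 'stream';

// Hypothetical helper: end the writable side, then wait until the readable
// side emits 'end' (or graceMs elapses) and return every result that arrived.
async function endAndCollect(stream, graceMs = 2000) {
  const results = [];
  const onData = data => results.push(data);
  stream.on('data', onData);

  const ac = new AbortController();
  const timer = setTimeout(() => ac.abort(), graceMs);
  try {
    // Subscribe before ending, so an early 'end' is not missed.
    const ended = once(stream, 'end', { signal: ac.signal });
    stream.end();
    await ended; // also rejects if the stream emits 'error'
  } catch (err) {
    // A timeout is not fatal: return whatever was collected so far.
    if (err.name !== 'AbortError') throw err;
  } finally {
    clearTimeout(timer);
    stream.off('data', onData);
  }
  return results;
}

// Stand-in for the recognizer (an assumption for illustration only):
// it swallows chunks and emits a single result when the input ends.
const words = [];
const fake = new Transform({
  objectMode: true,
  transform(chunk, _encoding, callback) { words.push(chunk); callback(); },
  flush(callback) { callback(null, { transcript: words.join(' ') }); },
});

fake.write('this is');
fake.write('a test');
console.log(await endAndCollect(fake));
```

With the real client, the call would look like `await endAndCollect(stream)` in place of the `stream.end()` and `Promise.race` steps. Whether `'end'` fires reliably there is untested here.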
    

    To test it, a 16-bit, 16 kHz raw PCM audio file is required. An arbitrary WAV file won't work as is, because it starts with a header containing metadata.
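
If only a WAV file is at hand, one possible way to get the raw samples is to skip the header by walking the RIFF chunks until the `data` chunk. This is a rough sketch: `extractPcmFromWav` is a hypothetical helper, and it assumes the file already holds uncompressed 16-bit, 16 kHz PCM. It does not convert the sample rate or the encoding.

```javascript
// Hypothetical helper: return the raw sample bytes of a RIFF/WAVE buffer
// by walking its chunks until the "data" chunk is found.
function extractPcmFromWav(wav) {
  const isRiffWave =
    wav.length >= 12 &&
    wav.toString('ascii', 0, 4) === 'RIFF' &&
    wav.toString('ascii', 8, 12) === 'WAVE';
  if (!isRiffWave) {
    throw new Error('Not a RIFF/WAVE file');
  }

  let offset = 12; // first chunk starts right after "RIFF<size>WAVE"
  while (offset + 8 <= wav.length) {
    const id = wav.toString('ascii', offset, offset + 4);
    const size = wav.readUInt32LE(offset + 4);
    const start = offset + 8;
    if (id === 'data') {
      // Clamp in case the declared size exceeds the actual file length.
      return wav.subarray(start, Math.min(start + size, wav.length));
    }
    // Skip this chunk; chunks are padded to an even number of bytes.
    offset = start + size + (size % 2);
  }
  throw new Error('No "data" chunk found');
}
```

The returned buffer could then be written out as `sample.raw`, or fed to the chunking loop directly in place of the `readFile` result.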
