Splitting an Ogg Opus File stream

最后都变了 - submitted 2021-02-10 05:38:30

Question


I am trying to send an OGG_OPUS encoded stream to Google's Speech-to-Text streaming service. Since Google imposes a time limit on streaming requests, I have to route the audio stream to a new Speech-to-Text streaming session at a fixed interval.

From what I've read, the pages in an Ogg stream cannot be read independently, since the data in each page is calculated by considering the data of the previous and next pages. If that is the case, can we cut off the stream at a certain point and recreate a brand new stream with the remaining data? Stopping at a certain point and sending the rest of the data in a new stream just doesn't work, because the initial Ogg header packets are not available in the second stream.

I know that this issue can be solved using PCM data: since it is not encoded, a PCM stream can simply be split at any point and turned into a new stream. However, I cannot use a PCM stream because of its high bitrate, and I prefer not to use lossless quality since I'm only transferring voice data.

Refs: https://tools.ietf.org/html/rfc7845#section-3


Answer 1:


OpusFileSplitter can split Opus audio files.

The Ogg pages can be read independently as long as the file starts with the Beginning of Stream (BOS) header page and the comment page. You can split one Ogg file into multiple files by creating new files that each start with the Ogg header pages, followed by Ogg data/audio pages. For example, this Ogg Opus file:

*********************************************************
*          *              *              *              *
*  Header  *  Audio Data  *  Audio Data  *  Audio Data  *
*   Page   *    Page 1    *    Page 2    *    Page 3    *
*          *              *              *              *
*********************************************************

could be split into two files:

***************************
*          *              *
*  Header  *  Audio Data  *
*   Page   *    Page 1    *
*          *              *
***************************

******************************************
*          *              *              *
*  Header  *  Audio Data  *  Audio Data  *
*   Page   *    Page 2    *    Page 3    *
*          *              *              *
******************************************
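The diagrams above can be sketched in a few lines of Python. This is only an illustration, not the code used by OpusFileSplitter: the function name `split_pages` and its parameters are hypothetical, and it assumes the stream has already been cut into a list of raw page byte strings. Note that a real Ogg Opus file normally has two header pages (the `OpusHead` page and the `OpusTags` page), so `header_pages` would usually be 2 rather than the single box drawn above. The copied pages keep their original sequence numbers and granule positions, and no end-of-stream flag is set; whether a particular decoder tolerates that is not verified here.

```python
from typing import List, Sequence


def split_pages(pages: Sequence[bytes],
                header_pages: int,
                cut_points: Sequence[int]) -> List[bytes]:
    """Build several Ogg files from one list of raw pages.

    pages        -- raw bytes of every page, in stream order
    header_pages -- how many leading pages are header pages (assumption:
                    the caller knows this; typically 2 for Ogg Opus)
    cut_points   -- indices into the audio pages at which a new file starts
    """
    headers = b"".join(pages[:header_pages])
    audio = list(pages[header_pages:])
    # Turn the cut points into [start, end) ranges over the audio pages.
    bounds = [0, *sorted(cut_points), len(audio)]
    files = []
    for start, end in zip(bounds, bounds[1:]):
        if start < end:  # skip empty ranges
            # Every output file begins with a copy of the header pages.
            files.append(headers + b"".join(audio[start:end]))
    return files
```

With one header page and three audio pages, `split_pages(pages, 1, [1])` reproduces the two files drawn above: the first holds the header and audio page 1, the second holds the header and audio pages 2 and 3.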

You're correct that audio segments can be split and span multiple pages. I'm assuming that a few milliseconds of audio could be lost if a page contains incomplete audio segments, but that should not disrupt speech recognition. Unfortunately, my local tests used Opus files generated by the opusenc utility, which did not split segments across pages, so I could not exercise that case. Not splitting segments across pages does seem to be a good thing for splitting files, though.
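Whether a cut between two pages would land in the middle of a packet can be checked from the page headers alone. The sketch below relies on two facts from the Ogg framing format (RFC 3533): bit 0x01 of the header-type byte marks a page whose first packet continues from the previous page, and a final lacing value of 255 means the last packet continues on the next page. The helper names are hypothetical and each argument is assumed to be the complete raw bytes of one page.

```python
def starts_mid_packet(page: bytes) -> bool:
    """True if the first packet on this page continues from the previous page."""
    # Byte 5 is the header-type field; bit 0x01 is the "continued packet" flag.
    return bool(page[5] & 0x01)


def ends_mid_packet(page: bytes) -> bool:
    """True if the last packet on this page continues onto the next page."""
    nsegs = page[26]  # byte 26 holds the number of entries in the segment table
    # The segment table occupies bytes 27 .. 27 + nsegs - 1; a last lacing
    # value of 255 means the packet is not finished on this page.
    return nsegs > 0 and page[26 + nsegs] == 255


def is_clean_cut(prev_page: bytes, next_page: bytes) -> bool:
    """True if no packet spans the boundary between the two pages."""
    return not ends_mid_packet(prev_page) and not starts_mid_packet(next_page)
```

Cutting only where `is_clean_cut` returns True would avoid the partial-packet loss described above, at the cost of slightly uneven chunk lengths.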

OpusFileSplitter.scanPages() shows how to find the page boundaries.
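For readers who do not want to dig through that code, here is a minimal Python sketch of page-boundary scanning. It is not a port of OpusFileSplitter.scanPages(); the names `OggPage` and `scan_pages` are made up for this example. It follows the page layout in RFC 3533: a fixed 27-byte header starting with the capture pattern `OggS`, then a segment table whose lacing values add up to the payload size. The checksum is read but not verified.

```python
import struct
from typing import Iterator, NamedTuple

# Fixed 27-byte page header: capture pattern, version, header type,
# granule position, serial number, page sequence number, CRC, segment count.
_HEADER = struct.Struct("<4sBBqIIIB")


class OggPage(NamedTuple):
    offset: int       # byte offset of the page within the input
    length: int       # total size: header + segment table + payload
    header_type: int  # 0x01 continued packet, 0x02 BOS, 0x04 EOS
    granule: int      # granule position (-1 if no packet ends on the page)
    sequence: int     # page sequence number


def scan_pages(data: bytes) -> Iterator[OggPage]:
    """Yield one OggPage for every complete page in `data`, in order."""
    pos = 0
    while pos + _HEADER.size <= len(data):
        magic, version, htype, granule, _serial, seq, _crc, nsegs = \
            _HEADER.unpack_from(data, pos)
        if magic != b"OggS" or version != 0:
            raise ValueError(f"no Ogg page at offset {pos}")
        table_start = pos + _HEADER.size
        lacing = data[table_start:table_start + nsegs]
        if len(lacing) < nsegs:
            break  # segment table is truncated
        length = _HEADER.size + nsegs + sum(lacing)
        if pos + length > len(data):
            break  # payload is truncated; more bytes are needed
        yield OggPage(pos, length, htype, granule, seq)
        pos += length
```

Each yielded `offset` and `length` pair identifies a slice `data[offset:offset + length]` that is one whole page, which is the unit a splitter would copy into a new file.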



Source: https://stackoverflow.com/questions/58274671/splitting-an-ogg-opus-file-stream
