How to encode resampled PCM audio to AAC using the FFmpeg API when the input PCM sample count is not equal to 1024

情书的邮戳 2021-01-12 16:13

I am currently working on capturing audio and streaming it to an RTMP server. I work on macOS (in Xcode), so I use the AVFoundation framework to capture the audio sample buffers.

4 answers
  •  梦毁少年i
    2021-01-12 17:10

    I had a similar problem. I was encoding PCM packets to AAC, and the PCM packets were sometimes shorter than 1024 samples.

    If I encoded a packet shorter than 1024 samples, the audio played slow. On the other hand, if I threw it away, the audio played fast. From my observation, the swr_convert function did not do any automatic buffering.
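The slow/fast effect can be seen with a little arithmetic. The sketch below is only an illustration: the packet sizes are made up, the function names are hypothetical, and it assumes that a short packet handed to the encoder ends up occupying one full 1024-sample frame. Under that assumption the encoded stream holds more samples than actually arrived (it plays slow), while dropping short packets leaves fewer (it plays fast).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Samples per channel that actually arrived from the capture side.
int64_t realSamples(const std::vector<int>& packets) {
    int64_t total = 0;
    for (int n : packets) total += n;
    return total;
}

// Samples per channel in the output if every packet, however short,
// is encoded as one full frame (assumption: short packets get padded).
int64_t samplesIfPadded(const std::vector<int>& packets, int frameSize) {
    assert(frameSize > 0);
    return static_cast<int64_t>(packets.size()) * frameSize;
}

// Samples per channel in the output if every short packet is dropped.
int64_t samplesIfDropped(const std::vector<int>& packets, int frameSize) {
    int64_t total = 0;
    for (int n : packets)
        if (n == frameSize) total += n;
    return total;
}
```

For example, with hypothetical packets of 1024, 941, 1024 and 941 samples, 3930 samples arrived, padding yields 4096 and dropping yields 2048, so neither approach preserves the original duration.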

    I ended up with a buffering scheme: incoming packets are copied into a 1024-sample buffer, and the buffer is encoded and reset every time it is full.
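To show the idea without any FFmpeg types, here is a minimal standard-library sketch of such a buffering scheme for interleaved 16-bit samples. `FrameChunker` and its methods are hypothetical names, not part of FFmpeg, and this is not the code from this answer. FFmpeg's libavutil also provides `AVAudioFifo`, which may be a ready-made alternative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: collects interleaved S16 samples of any length and
// hands back frames holding exactly frameSize samples per channel.
class FrameChunker {
public:
    FrameChunker(int frameSize, int channels)
        : frameSize_(frameSize), channels_(channels) {
        assert(frameSize > 0 && channels > 0);
    }

    // Append nbSamples samples per channel and return every frame that
    // this call completed (possibly none, possibly several).
    std::vector<std::vector<int16_t>> push(const int16_t* data, int nbSamples) {
        std::vector<std::vector<int16_t>> full;
        const std::size_t frameLen = static_cast<std::size_t>(frameSize_) * channels_;
        const std::size_t total = static_cast<std::size_t>(nbSamples) * channels_;
        std::size_t pos = 0;
        while (pos < total) {
            // copy only as much as still fits into the pending frame
            const std::size_t take = std::min(frameLen - pending_.size(), total - pos);
            pending_.insert(pending_.end(), data + pos, data + pos + take);
            pos += take;
            if (pending_.size() == frameLen) {
                full.push_back(std::move(pending_));
                pending_.clear();  // reset the moved-from buffer for reuse
            }
        }
        return full;
    }

    // Samples per channel waiting for the next frame to fill up.
    int pendingSamples() const {
        return static_cast<int>(pending_.size() / channels_);
    }

private:
    int frameSize_;
    int channels_;
    std::vector<int16_t> pending_;
};
```

Each returned frame would then be copied into an `AVFrame` and sent to the encoder; samples that do not fill a frame simply wait for the next call.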

    The function that fills the buffer is below:

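    // Copy frame data into a buffer of fixed size.
    // k0 is the read offset into pAvFrameIn (in samples per channel); it is advanced by the number of samples copied.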
    bool ffmpegHelper::putAudioBuffer(const AVFrame *pAvFrameIn, AVFrame **pAvFrameBuffer, AVCodecContext *dec_ctx, int frame_size, int &k0) {
      // allocate the fixed-size buffer frame on first use
      if (!(*pAvFrameBuffer)) {
        if (!(*pAvFrameBuffer = av_frame_alloc())) {
          av_log(NULL, AV_LOG_ERROR, "Alloc frame failed\n");
          return false;
        } else {
          (*pAvFrameBuffer)->format = dec_ctx->sample_fmt;
          // note: recent FFmpeg releases replace the `channels` fields with `ch_layout`
          (*pAvFrameBuffer)->channels = dec_ctx->channels;
          (*pAvFrameBuffer)->sample_rate = dec_ctx->sample_rate;
          (*pAvFrameBuffer)->nb_samples = frame_size;
          int ret = av_frame_get_buffer(*pAvFrameBuffer, 0);
          if (ret < 0) {
            char err[AV_ERROR_MAX_STRING_SIZE];
            av_log(NULL, AV_LOG_ERROR, "get audio buffer failed: %s\n",
              av_make_error_string(err, AV_ERROR_MAX_STRING_SIZE, ret));
            av_frame_free(pAvFrameBuffer);  // do not leak the half-initialized frame
            return false;
          }
          // from here on nb_samples counts how many samples the buffer currently holds
          (*pAvFrameBuffer)->nb_samples = 0;
          (*pAvFrameBuffer)->pts = pAvFrameIn->pts;
        }
      }
    
      // copy input data to buffer
      int n_channels = pAvFrameIn->channels;
      // take only as many samples as still fit into the buffer (std::min needs <algorithm>)
      int new_samples = std::min(pAvFrameIn->nb_samples - k0, frame_size - (*pAvFrameBuffer)->nb_samples);
      int k1 = (*pAvFrameBuffer)->nb_samples;
    
      if (pAvFrameIn->format == AV_SAMPLE_FMT_S16) {
        int16_t *d_in = (int16_t *)pAvFrameIn->data[0];
        d_in += n_channels * k0;
        int16_t *d_out = (int16_t *)(*pAvFrameBuffer)->data[0];
        d_out += n_channels * k1;
    
        for (int i = 0; i < new_samples; ++i) {
          for (int j = 0; j < pAvFrameIn->channels; ++j) {
            *d_out++ = *d_in++;
          }
        }
      } else {
        av_log(NULL, AV_LOG_ERROR, "unhandled sample format for audio buffer\n");
        return false;
      }
    
      (*pAvFrameBuffer)->nb_samples += new_samples;
      k0 += new_samples;
    
      return true;
    }
    

    And the loop that fills the buffer and encodes it is below:

    // transcoding needed
    int got_frame;
    AVMediaType stream_type;
    // decode the packet (implement this yourself)
    decodePacket(packet, dec_ctx, &pAvFrame_, got_frame);
    
    if (enc_ctx->codec_type == AVMEDIA_TYPE_AUDIO) {
        ret = 0;
        // break audio packet down to buffer
        if (enc_ctx->frame_size > 0) {
            int k = 0;
            while (k < pAvFrame_->nb_samples) {
                if (!putAudioBuffer(pAvFrame_, &pFrameAudio_, dec_ctx, enc_ctx->frame_size, k))
                    return false;
                if (pFrameAudio_->nb_samples == enc_ctx->frame_size) {
                    // the buffer is full, encode it (do it yourself)
                    ret = encodeFrame(pFrameAudio_, stream_index, got_frame, false);
                    if (ret < 0)
                        return false;
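                    // advance pts by one frame; this assumes pts is counted in samples (time base 1/sample_rate)
                    pFrameAudio_->pts += enc_ctx->frame_size;
                    // mark the buffer as empty so it can be refilled
                    pFrameAudio_->nb_samples = 0;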
                }
            }
        } else {
            ret = encodeFrame(pAvFrame_, stream_index, got_frame, false);
        }
    } else {
        // encode packet directly
        ret = encodeFrame(pAvFrame_, stream_index, got_frame, false);
    }
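The loop above encodes only full buffers, so at the end of the stream a partially filled buffer may be left over. One possible way to handle it, sketched here with standard-library types only, is to pad the tail with silence up to one full frame before encoding it. The helper names are hypothetical, and the pts helper assumes timestamps counted in samples (time base 1/sample_rate), as the `pts += frame_size` step in the loop does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// pts of the n-th full frame, assuming pts is counted in samples.
int64_t framePts(int64_t firstPts, int64_t frameIndex, int frameSize) {
    return firstPts + frameIndex * frameSize;
}

// Pad a partially filled interleaved S16 buffer with zeros (silence) so it
// holds exactly one frame. Returns the number of real samples per channel,
// which the caller may want for trimming or duration bookkeeping.
int padTailWithSilence(std::vector<int16_t>& pending, int frameSize, int channels) {
    assert(frameSize > 0 && channels > 0);
    const std::size_t frameLen = static_cast<std::size_t>(frameSize) * channels;
    assert(pending.size() <= frameLen);
    assert(pending.size() % channels == 0);
    const int real = static_cast<int>(pending.size() / channels);
    pending.resize(frameLen, 0);  // new elements are zero, i.e. silence for S16
    return real;
}
```

Padding adds a few milliseconds of silence at the very end, which is usually less noticeable than losing the last samples. Whether the encoder in use accepts a shorter final frame instead is worth checking in its documentation.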
    
