Extracting h264 from CMBlockBuffer

挽巷 2020-12-07 17:15

I am using Apple's VideoToolbox (iOS) to compress raw frames captured by the device camera.

My callback is being called with a CMSampleBufferRef object that contains

2 answers
  • 2020-12-07 17:34

    I've been struggling with this myself for quite some time now, and have finally figured everything out.

    The function CMBlockBufferGetDataPointer gives you access to all the data you need, but there are a few not very obvious things you need to do to convert it to an elementary stream.

    AVCC vs Annex B format

    The data in the CMBlockBuffer is stored in AVCC format, while elementary streams typically follow the Annex B specification. In the AVCC format, the first 4 bytes contain the length of the NAL unit (another word for an H.264 packet). You need to replace this header with the 4-byte start code 0x00 0x00 0x00 0x01, which functions as a separator between NAL units in an Annex B elementary stream (the 3-byte version 0x00 0x00 0x01 works fine too).
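To make the header swap concrete, here is a minimal sketch in plain C for a buffer holding a single NAL unit. The helper name avcc_prefix_to_start_code is made up for illustration and is not part of VideoToolbox; it assumes a 4-byte length prefix, which is what makes the in-place overwrite possible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: overwrite the 4-byte AVCC length prefix at the start
 * of `nal` with the 4-byte Annex B start code. Returns the payload length
 * declared by the prefix, or 0 if the buffer cannot hold that many bytes
 * (in which case the buffer is left untouched). */
static uint32_t avcc_prefix_to_start_code(uint8_t *nal, size_t size) {
    static const uint8_t start_code[4] = {0x00, 0x00, 0x00, 0x01};
    assert(nal != NULL);
    if (size < 4) return 0;
    /* The prefix is big-endian: most significant byte first. */
    uint32_t length = ((uint32_t)nal[0] << 24) | ((uint32_t)nal[1] << 16) |
                      ((uint32_t)nal[2] << 8) | (uint32_t)nal[3];
    if (length > size - 4) return 0;
    memcpy(nal, start_code, sizeof start_code);
    return length;
}
```

Because the length prefix and the start code are both 4 bytes here, the payload does not have to move. With the 3-byte start code you would have to copy the payload into a new buffer instead.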

    Multiple NAL units in a single CMBlockBuffer

    The next not very obvious thing is that a single CMBlockBuffer will sometimes contain multiple NAL units. Apple seems to add an additional NAL unit (SEI) containing metadata to every I-Frame NAL unit (also called IDR). This is probably why you are seeing multiple buffers in a single CMBlockBuffer object. However, the CMBlockBufferGetDataPointer function gives you a single pointer with access to all the data. That being said, the presence of multiple NAL units complicates the conversion of the AVCC headers. Now you actually have to read the length value contained in the AVCC header to find the next NAL unit, and continue converting headers until you have reached the end of the buffer.
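As a rough illustration of walking several NAL units in one buffer, the C sketch below follows the length prefixes and records each unit's type (the low 5 bits of the first payload byte; 6 is SEI, 5 is an IDR slice). The function name list_nal_types is hypothetical, and the sketch assumes 4-byte prefixes and a contiguous buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: walk an AVCC buffer with 4-byte length prefixes and
 * store each NAL unit's type in `types`. Returns the number of NAL units
 * recorded, or -1 if a prefix is zero or points past the end of the buffer. */
static int list_nal_types(const uint8_t *buf, size_t size,
                          uint8_t *types, int max_types) {
    size_t offset = 0; /* invariant: offset <= size */
    int count = 0;
    assert(buf != NULL && types != NULL);
    while (size - offset >= 4 && count < max_types) {
        uint32_t length = ((uint32_t)buf[offset] << 24) |
                          ((uint32_t)buf[offset + 1] << 16) |
                          ((uint32_t)buf[offset + 2] << 8) |
                          (uint32_t)buf[offset + 3];
        if (length == 0 || length > size - offset - 4) return -1;
        types[count++] = buf[offset + 4] & 0x1F; /* nal_unit_type */
        offset += 4 + (size_t)length;            /* jump to the next prefix */
    }
    return count;
}
```

Fed a buffer that holds an SEI unit followed by an IDR slice, this reports two units with types 6 and 5, which matches the layout described above for I-frames.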

    Big-Endian vs Little-Endian

    The next not very obvious thing is that the AVCC header is stored in Big-Endian format, and iOS is Little-Endian natively. So when you read the length value from an AVCC header, pass it through the CFSwapInt32BigToHost function before using it.
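The byte-order issue can be shown without any Apple headers. The C sketch below uses two made-up helpers: read_be32 assembles the length from individual bytes, so it works on any host, and swap32 shows the byte reversal that a big-endian-to-host conversion amounts to on a little-endian machine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read a 32-bit big-endian value byte by byte.
 * The result does not depend on the byte order of the host CPU. */
static uint32_t read_be32(const uint8_t *p) {
    assert(p != 0);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Hypothetical helper: reverse the four bytes of a 32-bit value. */
static uint32_t swap32(uint32_t v) {
    return ((v >> 24) & 0x000000FFu) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | ((v << 24) & 0xFF000000u);
}
```

For the header bytes 00 00 01 2C, read_be32 yields 300. A plain memcpy into a uint32_t on a little-endian host yields 0x2C010000 instead, and swap32 turns that back into 300.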

    SPS and PPS NAL units

    The final not very obvious thing is that the data inside the CMBlockBuffer does not contain the parameter NAL units SPS and PPS, which contain configuration parameters for the decoder such as profile, level, resolution, and frame rate. These are stored as metadata in the sample buffer's format description and can be accessed via the function CMVideoFormatDescriptionGetH264ParameterSetAtIndex. Note that you have to add the start codes to these NAL units before sending. The SPS and PPS NAL units do not have to be sent with every new frame. A decoder only needs to read them once, but it is common to resend them periodically, for example before every new I-frame NAL unit.
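Here is a small C sketch of the "add start codes to the parameter sets" step, independent of the Apple APIs. The helper append_annex_b is hypothetical. In real code the SPS and PPS bytes would come from CMVideoFormatDescriptionGetH264ParameterSetAtIndex, and any bytes you pass in while experimenting are only placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: append a 4-byte start code followed by one NAL unit
 * to `out`, starting at offset `*used`. Returns 0 on success, or -1 if the
 * output buffer is too small (nothing is written in that case). */
static int append_annex_b(uint8_t *out, size_t capacity, size_t *used,
                          const uint8_t *nal, size_t nal_size) {
    static const uint8_t start_code[4] = {0x00, 0x00, 0x00, 0x01};
    assert(out != NULL && used != NULL && nal != NULL);
    assert(*used <= capacity);
    size_t room = capacity - *used;
    if (room < sizeof start_code || nal_size > room - sizeof start_code) {
        return -1;
    }
    memcpy(out + *used, start_code, sizeof start_code);
    memcpy(out + *used + sizeof start_code, nal, nal_size);
    *used += sizeof start_code + nal_size;
    return 0;
}
```

Calling it once for the SPS and once for the PPS before appending an I-frame's NAL units gives the ordering described above: SPS, PPS, then the frame data, each preceded by a start code.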

    Code Example

    Below is a code example taking all of these things into account.

    static void videoFrameFinishedEncoding(void *outputCallbackRefCon,
                                           void *sourceFrameRefCon,
                                           OSStatus status,
                                           VTEncodeInfoFlags infoFlags,
                                           CMSampleBufferRef sampleBuffer) {
        // Check if there were any errors encoding
        if (status != noErr) {
            NSLog(@"Error encoding video, err=%lld", (int64_t)status);
            return;
        }
    
        // In this example we will use a NSMutableData object to store the
        // elementary stream.
        NSMutableData *elementaryStream = [NSMutableData data];
    
    
        // Find out if the sample buffer contains an I-Frame.
        // If so we will write the SPS and PPS NAL units to the elementary stream.
        BOOL isIFrame = NO;
        CFArrayRef attachmentsArray = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, 0);
        if (CFArrayGetCount(attachmentsArray)) {
            CFBooleanRef notSync;
            CFDictionaryRef dict = CFArrayGetValueAtIndex(attachmentsArray, 0);
            BOOL keyExists = CFDictionaryGetValueIfPresent(dict,
                                                           kCMSampleAttachmentKey_NotSync,
                                                           (const void **)&notSync);
            // An I-Frame is a sync frame
            isIFrame = !keyExists || !CFBooleanGetValue(notSync);
        }
    
        // This is the start code that we will write to
        // the elementary stream before every NAL unit
        static const size_t startCodeLength = 4;
        static const uint8_t startCode[] = {0x00, 0x00, 0x00, 0x01};
    
        // Write the SPS and PPS NAL units to the elementary stream before every I-Frame
        if (isIFrame) {
            CMFormatDescriptionRef description = CMSampleBufferGetFormatDescription(sampleBuffer);
    
            // Find out how many parameter sets there are
            size_t numberOfParameterSets;
            CMVideoFormatDescriptionGetH264ParameterSetAtIndex(description,
                                                               0, NULL, NULL,
                                                               &numberOfParameterSets,
                                                               NULL);
    
            // Write each parameter set to the elementary stream
            for (size_t i = 0; i < numberOfParameterSets; i++) {
                const uint8_t *parameterSetPointer;
                size_t parameterSetLength;
                CMVideoFormatDescriptionGetH264ParameterSetAtIndex(description,
                                                                   i,
                                                                   &parameterSetPointer,
                                                                   &parameterSetLength,
                                                                   NULL, NULL);
    
                // Write the parameter set to the elementary stream
                [elementaryStream appendBytes:startCode length:startCodeLength];
                [elementaryStream appendBytes:parameterSetPointer length:parameterSetLength];
            }
        }
    
        // Get a pointer to the raw AVCC NAL unit data in the sample buffer
        size_t blockBufferLength;
        uint8_t *bufferDataPointer = NULL;
        CMBlockBufferGetDataPointer(CMSampleBufferGetDataBuffer(sampleBuffer),
                                    0,
                                    NULL,
                                    &blockBufferLength,
                                    (char **)&bufferDataPointer);
    
        // Loop through all the NAL units in the block buffer
        // and write them to the elementary stream with
        // start codes instead of AVCC length headers
        size_t bufferOffset = 0;
        static const int AVCCHeaderLength = 4;
        // (the addition avoids size_t underflow when the buffer is shorter than a header)
        while (bufferOffset + AVCCHeaderLength < blockBufferLength) {
            // Read the NAL unit length
            uint32_t NALUnitLength = 0;
            memcpy(&NALUnitLength, bufferDataPointer + bufferOffset, AVCCHeaderLength);
            // Convert the length value from Big-endian to Little-endian
            NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);
            // Write start code to the elementary stream
            [elementaryStream appendBytes:startCode length:startCodeLength];
            // Write the NAL unit without the AVCC length header to the elementary stream
            [elementaryStream appendBytes:bufferDataPointer + bufferOffset + AVCCHeaderLength
                                   length:NALUnitLength];
            // Move to the next NAL unit in the block buffer
            bufferOffset += AVCCHeaderLength + NALUnitLength;
        }
    }   
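One way to sanity-check the converted stream is to count the start codes in it and compare that with the number of NAL units you expected to write. The C sketch below does this with a made-up helper, count_annex_b_nal_units. It relies on the fact that the byte sequence 00 00 01 cannot occur inside an H.264 NAL unit payload, because the encoder inserts emulation prevention bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count the NAL units in an Annex B stream by scanning
 * for the 3-byte pattern 00 00 01. A 4-byte start code (00 00 00 01) ends
 * with the same pattern, so both forms are counted once each. */
static size_t count_annex_b_nal_units(const uint8_t *buf, size_t size) {
    size_t count = 0;
    size_t i = 0;
    assert(buf != NULL);
    while (i + 3 <= size) {
        if (buf[i] == 0x00 && buf[i + 1] == 0x00 && buf[i + 2] == 0x01) {
            count++;
            i += 3; /* skip past the start code */
        } else {
            i++;
        }
    }
    return count;
}
```

For a keyframe written by the callback above, you would expect one count per parameter set plus one per NAL unit found in the block buffer.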
    
  • 2020-12-07 17:52

    Thanks, Anton, for an excellent answer! Here is a naive Swift port of your solution for anyone who wants to use the concepts discussed here directly in a Swift project.

    public func didEncodeFrame(frame: CMSampleBuffer)
    {
        print ("Received encoded frame in delegate...")
    
        //----AVCC to Elem stream-----//
        var elementaryStream = NSMutableData()
    
        //1. check if CMBuffer had I-frame
        var isIFrame:Bool = false
        let attachmentsArray:CFArray = CMSampleBufferGetSampleAttachmentsArray(frame, false)!
        //check how many attachments
        if ( CFArrayGetCount(attachmentsArray) > 0 ) {
            let dict = CFArrayGetValueAtIndex(attachmentsArray, 0)
            let dictRef:CFDictionaryRef = unsafeBitCast(dict, CFDictionaryRef.self)
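            //get value: a sync frame (I-frame) has no NotSync key, or has it set to false
            let value = CFDictionaryGetValue(dictRef, unsafeBitCast(kCMSampleAttachmentKey_NotSync, UnsafePointer<Void>.self))
            if ( value == nil || !CFBooleanGetValue(unsafeBitCast(value, CFBooleanRef.self)) ){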
                print ("IFrame found...")
                isIFrame = true
            }
        }
    
        //2. define the start code
        let nStartCodeLength:size_t = 4
        let nStartCode:[UInt8] = [0x00, 0x00, 0x00, 0x01]
    
        //3. write the SPS and PPS before I-frame
        if ( isIFrame == true ){
            let description:CMFormatDescriptionRef = CMSampleBufferGetFormatDescription(frame)!
            //how many params
            var numParams:size_t = 0
            CMVideoFormatDescriptionGetH264ParameterSetAtIndex(description, 0, nil, nil, &numParams, nil)
    
            //write each param-set to elementary stream
            print("Write param to elementaryStream ", numParams)
            for i in 0..<numParams {
                var parameterSetPointer:UnsafePointer<UInt8> = nil
                var parameterSetLength:size_t = 0
                CMVideoFormatDescriptionGetH264ParameterSetAtIndex(description, i, &parameterSetPointer, &parameterSetLength, nil, nil)
                elementaryStream.appendBytes(nStartCode, length: nStartCodeLength)
                elementaryStream.appendBytes(parameterSetPointer, length: unsafeBitCast(parameterSetLength, Int.self))
            }
        }
    
        //4. Get a pointer to the raw AVCC NAL unit data in the sample buffer
        var blockBufferLength:size_t = 0
        var bufferDataPointer: UnsafeMutablePointer<Int8> = nil
        CMBlockBufferGetDataPointer(CMSampleBufferGetDataBuffer(frame)!, 0, nil, &blockBufferLength, &bufferDataPointer)
        print ("Block length = ", blockBufferLength)
    
        //5. Loop through all the NAL units in the block buffer
        var bufferOffset:size_t = 0
        let AVCCHeaderLength:Int = 4
        while (bufferOffset < (blockBufferLength - AVCCHeaderLength) ) {
            // Read the NAL unit length
            var NALUnitLength:UInt32 =  0
            memcpy(&NALUnitLength, bufferDataPointer + bufferOffset, AVCCHeaderLength)
            //Big-Endian to host byte order (Little-Endian on iOS)
            NALUnitLength = CFSwapInt32BigToHost(NALUnitLength)
            if ( NALUnitLength > 0 ){
                print ( "NALUnitLen = ", NALUnitLength)
                // Write start code to the elementary stream
                elementaryStream.appendBytes(nStartCode, length: nStartCodeLength)
                // Write the NAL unit without the AVCC length header to the elementary stream
                elementaryStream.appendBytes(bufferDataPointer + bufferOffset + AVCCHeaderLength, length: Int(NALUnitLength))
            }
            // Always move to the next NAL unit in the block buffer, even past an
            // empty one; otherwise a zero length would make this loop spin forever
            bufferOffset += AVCCHeaderLength + size_t(NALUnitLength)
            print("Moving to next NALU...")
        }
        print("Read completed...")
    }
    