Windows Media Foundation using IMFTransform to decode mp4 movie frames to 2D textures

Posted by 我的梦境 on 2019-12-04 09:47:45

I see some misunderstandings about Media Foundation here. You want to get an image in RGB format from MFVideoFormat_H264, but you are not using an H264 decoder. You wrote "I've tired using the IMFTransform class", but IMFTransform is not a class. It is the interface implemented by Transform COM objects. You must create the Media Foundation H264 decoder COM object. The CLSID of the Microsoft software H264 decoder is CLSID_CMSH264DecoderMFT. However, that decoder can only deliver its output image in the following formats (output types):

MFVideoFormat_I420

MFVideoFormat_IYUV

MFVideoFormat_NV12

MFVideoFormat_YUY2

MFVideoFormat_YV12

You can create a Direct3D 11 2D texture (ID3D11Texture2D) from one of them. Or you can do something like this, taken from my project CaptureManager SDK:

                CComPtrCustom<IMFTransform> lColorConvert;

                if (!Result(lColorConvert.CoCreateInstance(__uuidof(CColorConvertDMO))))
                {
                    lresult = MediaFoundationManager::setInputType(
                        lColorConvert,
                        0,
                        lVideoMediaType,
                        0);

                    if (lresult)
                    {
                        break;
                    }

                    DWORD lTypeIndex = 0;

                    while (!lresult)
                    {

                        CComPtrCustom<IMFMediaType> lOutputType;

                        lresult = lColorConvert->GetOutputAvailableType(0, lTypeIndex++, &lOutputType);

                        if (!lresult)
                        {


                            lresult = MediaFoundationManager::getGUID(
                                lOutputType,
                                MF_MT_SUBTYPE,
                                lSubType);

                            if (lresult)
                            {
                                break;
                            }

                            if (lSubType == MFVideoFormat_RGB32)
                            {
                                LONG lstride = 0;

                                MediaFoundationManager::getStrideForBitmapInfoHeader(
                                    lSubType,
                                    lWidth,
                                    lstride);

                                if (lstride < 0)
                                    lstride = -lstride;

                                lBitRate = (lHight * (UINT32)lstride * 8 * lNumerator) / lDenominator;

                                lresult = MediaFoundationManager::setUINT32(
                                    lOutputType,
                                    MF_MT_AVG_BITRATE,
                                    lBitRate);

                                if (lresult)
                                {
                                    break;
                                }


                                PROPVARIANT lVarItem;

                                lresult = MediaFoundationManager::getItem(
                                    *aPtrPtrInputMediaType,
                                    MF_MT_FRAME_RATE,
                                    lVarItem);

                                if (lresult)
                                {
                                    break;
                                }

                                lresult = MediaFoundationManager::setItem(
                                    lOutputType,
                                    MF_MT_FRAME_RATE,
                                    lVarItem);

                                if (lresult)
                                {
                                    break;
                                }

                                (*aPtrPtrInputMediaType)->Release();

                                *aPtrPtrInputMediaType = lOutputType.detach();

                                break;
                            }
                        }
                    }
                }

You can use the Color Converter DMO (CColorConvertDMO) to convert the output format of the H264 decoder into the one you need.

You can also look at the code of the videoInput project. It takes live video from a web cam and decodes it into RGB. If you replace the web cam source with an mp4 video file source, you will get a solution that is close to what you need.
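To make the conversion step more concrete, here is a minimal sketch of the per-pixel arithmetic behind a YUV to RGB32 conversion. It is not the Color Converter DMO's implementation: the helper names `Rgb32Pixel`, `clampToByte` and `yuvToRgb32` are hypothetical, and the code assumes studio-range BT.601 input with the commonly published integer coefficients. The real converter chooses its matrix and range from the media type and may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// One pixel as stored in memory for MFVideoFormat_RGB32: blue byte first,
// then green, red, and an unused fourth byte.
struct Rgb32Pixel {
    std::uint8_t b, g, r, x;
};

// Clamp an intermediate result to the 0..255 range of an 8-bit channel.
inline std::uint8_t clampToByte(int value) {
    return static_cast<std::uint8_t>(std::min(255, std::max(0, value)));
}

// Convert one YUV sample to RGB32 (hypothetical helper).
// Assumption: studio-range BT.601 (Y in 16..235, U and V centered on 128).
inline Rgb32Pixel yuvToRgb32(std::uint8_t y, std::uint8_t u, std::uint8_t v) {
    const int c = static_cast<int>(y) - 16;
    const int d = static_cast<int>(u) - 128;
    const int e = static_cast<int>(v) - 128;

    // Fixed-point coefficients scaled by 256; +128 rounds to nearest.
    const int r = (298 * c + 409 * e + 128) / 256;
    const int g = (298 * c - 100 * d - 208 * e + 128) / 256;
    const int b = (298 * c + 516 * d + 128) / 256;

    return Rgb32Pixel{clampToByte(b), clampToByte(g), clampToByte(r), 0xFF};
}
```

In NV12, one U/V pair is shared by a 2x2 block of luma samples, so a full-frame conversion would call such a helper once per pixel with the chroma pair of the enclosing block.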

Regards

Is this type of conversion possible?

Yes, it is possible. The stock H.264 Video Decoder MFT is "Direct3D aware", which means it can decode video into Direct3D 9 surfaces or Direct3D 11 textures by leveraging DXVA. If hardware capabilities are insufficient, there is a software fallback mode too. You are interested in getting the output delivered right into a texture for performance reasons (otherwise you would have to upload this data yourself, spending CPU and video resources on that).

Can this be done through the IMFTransform/SourceReader classes like I've tired above and do I just need to tweak the code or do I need to do this type of conversion manually?

IMFTransform is an abstract interface. It is implemented by the H.264 decoder (as well as by other MFTs), and you can use it directly, or you can use the higher-level Source Reader API to have it manage reading video from the file and decoding it with this MFT.

That is, the MFT and the Source Reader are not mutually exclusive alternatives but lower- and higher-level APIs. The MFT interface is offered by the decoder, and you are responsible for feeding H.264 in and draining the decoded output. The Source Reader manages the same MFT and adds file reading capability.

The Source Reader itself is available in Windows 7, by the way (and even on Vista, though its feature set might be limited compared to newer OSes).

Decoding can be performed with the following code:
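The "feed in and drain out" responsibility can be illustrated with a small, portable sketch of the control flow. Nothing here is a Media Foundation API: `MockTransform`, `Status` and `decodeAll` are hypothetical stand-ins. The status values play the role of S_OK, MF_E_NOTACCEPTING and MF_E_TRANSFORM_NEED_MORE_INPUT, and `startDrain` plays the role of sending MFT_MESSAGE_COMMAND_DRAIN at end of stream.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Stand-ins for S_OK, MF_E_NOTACCEPTING and MF_E_TRANSFORM_NEED_MORE_INPUT
// (hypothetical; these are not the real HRESULT values).
enum class Status { Ok, NotAccepting, NeedMoreInput };

// Toy transform with decoder-like latency: it holds back one packet until
// either more input arrives or draining has been requested.
class MockTransform {
public:
    Status processInput(int packet) {
        if (pending_.size() >= 3) return Status::NotAccepting;  // internal queue full
        pending_.push_back(packet);
        return Status::Ok;
    }

    Status processOutput(int& frame) {
        const std::size_t needed = draining_ ? 1 : 2;
        if (pending_.size() < needed) return Status::NeedMoreInput;
        frame = pending_.front() * 10;  // pretend "decoding"
        pending_.pop_front();
        return Status::Ok;
    }

    // Plays the role of the end-of-stream drain command.
    void startDrain() { draining_ = true; }

private:
    std::deque<int> pending_;
    bool draining_ = false;
};

// Feed every packet, pulling all available output after each one,
// then drain whatever the transform still holds.
inline std::vector<int> decodeAll(MockTransform& mft, const std::vector<int>& packets) {
    std::vector<int> frames;
    auto pullAvailable = [&mft, &frames] {
        int frame = 0;
        while (mft.processOutput(frame) == Status::Ok) frames.push_back(frame);
    };

    for (int packet : packets) {
        // If the transform refuses input, make room by pulling output first.
        while (mft.processInput(packet) == Status::NotAccepting) pullAvailable();
        pullAvailable();
    }

    mft.startDrain();
    pullAvailable();
    return frames;
}
```

With a real decoder the same loop shape applies, except that the packets are IMFSample objects carrying H.264 data and the output samples have to be allocated according to the stream info reported by the MFT.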

                    MFT_OUTPUT_DATA_BUFFER loutputDataBuffer;

                    initOutputDataBuffer(
                        lTransform,
                        loutputDataBuffer);

                    DWORD lprocessOutputStatus = 0;

                    lresult = lTransform->ProcessOutput(
                        0,
                        1,
                        &loutputDataBuffer,
                        &lprocessOutputStatus);

                    if ((HRESULT)lresult == E_FAIL)
                    {
                        break;
                    }

The function initOutputDataBuffer allocates the needed memory. An example of that function is shown here:

            Result initOutputDataBuffer(IMFTransform* aPtrTransform,
            MFT_OUTPUT_DATA_BUFFER& aRefOutputBuffer)
        {
            Result lresult;

            MFT_OUTPUT_STREAM_INFO loutputStreamInfo;

            DWORD loutputStreamId = 0;

            CComPtrCustom<IMFSample> lOutputSample;

            CComPtrCustom<IMFMediaBuffer> lMediaBuffer;

            do
            {
                if (aPtrTransform == nullptr)
                {
                    lresult = E_POINTER;

                    break;
                }

                ZeroMemory(&loutputStreamInfo, sizeof(loutputStreamInfo));

                ZeroMemory(&aRefOutputBuffer, sizeof(aRefOutputBuffer));

                lresult = aPtrTransform->GetOutputStreamInfo(loutputStreamId, &loutputStreamInfo);

                if (lresult)
                {
                    break;
                }

                if ((loutputStreamInfo.dwFlags & MFT_OUTPUT_STREAM_PROVIDES_SAMPLES) == 0 &&
                    (loutputStreamInfo.dwFlags & MFT_OUTPUT_STREAM_CAN_PROVIDE_SAMPLES) == 0)
                {
                    lresult = MFCreateSample(&lOutputSample);

                    if (lresult)
                    {
                        break;
                    }

                    lresult = MFCreateMemoryBuffer(loutputStreamInfo.cbSize, &lMediaBuffer);

                    if (lresult)
                    {
                        break;
                    }

                    lresult = lOutputSample->AddBuffer(lMediaBuffer);

                    if (lresult)
                    {
                        break;
                    }

                    aRefOutputBuffer.pSample = lOutputSample.Detach();
                }
                else
                {
                    lresult = S_OK;
                }

                aRefOutputBuffer.dwStreamID = loutputStreamId;
            } while (false);

            return lresult;
        }

The code gets information about output samples via the GetOutputStreamInfo method of IMFTransform. MFT_OUTPUT_STREAM_INFO contains the required memory size for the output media sample in cbSize. The code allocates a buffer of that size, adds it to the media sample, and attaches the sample to the MFT_OUTPUT_DATA_BUFFER.
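For orientation, the size of an uncompressed frame in the decoder's output formats can be estimated with simple arithmetic. The sketch below is only an illustration with hypothetical names (`YuvFormat`, `bitsPerPixel`, `packedFrameSize`, `nv12ChromaOffset`), and it assumes tightly packed rows without padding. The cbSize value reported by the MFT remains authoritative and may be larger, for example because of alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// The output subtypes listed for the H264 decoder (hypothetical enum).
enum class YuvFormat { I420, IYUV, NV12, YV12, YUY2 };

// Average bits per pixel: the 4:2:0 formats use 12, the 4:2:2 format YUY2 uses 16.
constexpr std::uint32_t bitsPerPixel(YuvFormat format) {
    return format == YuvFormat::YUY2 ? 16u : 12u;
}

// Minimum number of bytes for one tightly packed frame (no row padding).
inline std::size_t packedFrameSize(YuvFormat format,
                                   std::uint32_t width,
                                   std::uint32_t height) {
    // Chroma subsampling requires even dimensions.
    assert(width % 2 == 0 && height % 2 == 0);
    return static_cast<std::size_t>(width) * height * bitsPerPixel(format) / 8;
}

// Byte offset of the interleaved UV plane inside a tightly packed NV12 frame:
// the full-resolution Y plane comes first, one byte per pixel.
inline std::size_t nv12ChromaOffset(std::uint32_t width, std::uint32_t height) {
    return static_cast<std::size_t>(width) * height;
}
```

For a 1920x1080 NV12 frame this gives 3110400 bytes, with the UV plane starting at byte 2073600.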

So, you can see that writing code for encoding and decoding video by calling Media Foundation functions directly can be difficult and requires significant knowledge. From the description of your task, I see that you only need to decode video and present it. I advise you to try the Media Foundation Session functionality. It is developed by Microsoft engineers, is optimized, and already includes the algorithms for using the needed encoders.

In the videoInput project, the Media Foundation Session is used to find a suitable decoder for the Media Source created for the web camera and to grab frames in uncompressed format. It already does the needed processing. You only need to replace the Media Source for the web camera with a Media Source for the video file.

This could be easier than writing code that calls IMFTransform directly for decoding, and it avoids many problems, for example stabilizing the frame rate. If the code renders each image immediately after decoding and then decodes the next frame, it can render a 1-minute video clip in a couple of seconds. Or, if rendering the video and other content takes longer than one frame duration, the video is presented in "slow motion" style, and rendering the 1-minute clip can take 2, 3 or 5 minutes.

I do not know what project you need video decoding for, but you should have serious reasons for using code that calls the Media Foundation functions and interfaces directly.
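The frame rate problem described above comes down to comparing each sample's presentation time with the playback clock. Here is a minimal sketch of that calculation, assuming the sample time comes from the decoded sample and is expressed in 100-nanosecond units, as Media Foundation sample times are. The names `MfTicks` and `delayBeforePresent` are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <ratio>

// Media Foundation sample times are expressed in 100-nanosecond units.
using MfTicks = std::chrono::duration<std::int64_t, std::ratio<1, 10000000>>;

// How long the render loop should wait before presenting a frame
// (hypothetical helper). Returns zero when the frame is already late.
inline std::chrono::nanoseconds delayBeforePresent(
    std::int64_t sampleTime100ns,
    std::chrono::nanoseconds elapsedSincePlaybackStart) {
    assert(sampleTime100ns >= 0);
    const auto due =
        std::chrono::duration_cast<std::chrono::nanoseconds>(MfTicks(sampleTime100ns));
    const auto wait = due - elapsedSincePlaybackStart;
    return wait > std::chrono::nanoseconds::zero() ? wait
                                                   : std::chrono::nanoseconds::zero();
}
```

A render loop would sleep for the returned duration before presenting the frame, and could choose to drop frames whose delay is zero for several frames in a row. The Media Foundation Session handles this kind of scheduling for you.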

Regards.
