Audio sample adding to a video using Sink Writer in Windows Media Foundation

Question

I can write a video file from images, which I learned from this sample here. It uses IMFSample and IMFSinkWriter. Now I want to add audio to it. Suppose there is an Audio.wma file and I want its audio to be written into that video file.

But I cannot figure out how to do that in this sample: things like setting up the input and output types, creating an IMFSample for the audio buffer, and so on. It would be great if someone could show me how to add audio to a video file using the sink writer.

Answer 1:

Media Foundation is great to work with, and I am certain you will be able to quickly modify your project to get this done.

OVERVIEW:
Create an IMFSourceReader (the audioReader object used below) to read the samples from the audio file, add an audio stream to the sink writer, and finally interleave the writes to the sink, using the stream index that corresponds to each stream.

DETAILS:

Modify the VideoGenerator::InitializeSinkWriter(..) function to properly initialize the sink to accommodate the audio stream. In that function, properly create the audioTypeOut and audioTypeIn (IMFMediaType). You may want to rename mediaTypeOut and mediaTypeIn to videoTypeOut and videoTypeIn for clarity, which would appear like the following:

ComPtr<IMFMediaType>  videoTypeOut;  // <-- previously mediaTypeOut
ComPtr<IMFMediaType>  videoTypeIn;   // <-- previously mediaTypeIn
ComPtr<IMFMediaType>  audioTypeOut = nullptr;
ComPtr<IMFMediaType>  audioTypeIn = nullptr;

Next, configure an output audio type that is compatible with your video type. Since you appear to be creating a Windows Media Video file, you will likely want to use MFAudioFormat_WMAudioV9. To ensure the channels, sample rate, and bits per sample are correct, I generally enumerate the available types and pick the one with the desired characteristics, similar to the following (error checking has been omitted):

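// Hypothetical target values; use the characteristics you actually need.
const UINT32 targetChannels = 2, targetSampleRate = 44100, targetBitsPerSample = 16;

ComPtr<IMFCollection> availableTypes = nullptr;
HRESULT hr = MFTranscodeGetAudioOutputAvailableTypes(MFAudioFormat_WMAudioV9, MFT_ENUM_FLAG_ALL, NULL, availableTypes.GetAddressOf());

DWORD count = 0;
hr = availableTypes->GetElementCount(&count);  // Get the number of elements in the list.

for (DWORD i = 0; i < count; ++i)
{
    ComPtr<IUnknown> pUnkAudioType;  // declared per iteration so the previous element is released
    hr = availableTypes->GetElement(i, pUnkAudioType.GetAddressOf());
    hr = pUnkAudioType.As(&audioTypeOut);  // QueryInterface for IMFMediaType

    // compare channels, sampleRate, and bitsPerSample to target numbers
    if (MFGetAttributeUINT32(audioTypeOut.Get(), MF_MT_AUDIO_NUM_CHANNELS, 0) == targetChannels &&
        MFGetAttributeUINT32(audioTypeOut.Get(), MF_MT_AUDIO_SAMPLES_PER_SECOND, 0) == targetSampleRate &&
        MFGetAttributeUINT32(audioTypeOut.Get(), MF_MT_AUDIO_BITS_PER_SAMPLE, 0) == targetBitsPerSample)
    {
        // audioTypeOut is set!
        break;
    }

    audioTypeOut.Reset();  // not a match; audioTypeOut stays null if nothing matches
}

availableTypes.Reset();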
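
The loop above stops at the first type that matches, but the enumeration may return several types that differ only in bitrate. As a rough sketch of one way to handle that, the comparison can be factored into a plain function that prefers the highest bitrate among the matches. The AudioFormat struct and PickAudioFormat function are hypothetical names for illustration, not part of Media Foundation; in real code you would fill the struct from each IMFMediaType's attributes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical plain-data copy of the attributes read from each candidate type
// (MF_MT_AUDIO_NUM_CHANNELS, MF_MT_AUDIO_SAMPLES_PER_SECOND,
//  MF_MT_AUDIO_BITS_PER_SAMPLE, MF_MT_AUDIO_AVG_BYTES_PER_SECOND).
struct AudioFormat {
    std::uint32_t channels;
    std::uint32_t sampleRate;
    std::uint32_t bitsPerSample;
    std::uint32_t avgBytesPerSecond;
};

// Returns the candidate that matches the requested channels, sample rate and
// bit depth. If several match, the one with the highest average bitrate wins.
// Returns nullptr when nothing matches.
const AudioFormat* PickAudioFormat(const std::vector<AudioFormat>& candidates,
                                   std::uint32_t channels,
                                   std::uint32_t sampleRate,
                                   std::uint32_t bitsPerSample)
{
    const AudioFormat* best = nullptr;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        const AudioFormat& c = candidates[i];
        if (c.channels != channels || c.sampleRate != sampleRate ||
            c.bitsPerSample != bitsPerSample) {
            continue;  // characteristics differ, skip this candidate
        }
        if (best == nullptr || c.avgBytesPerSecond > best->avgBytesPerSecond) {
            best = &c;
        }
    }
    return best;
}
```

Preferring the highest bitrate is just one policy; picking the match closest to a target bitrate would work the same way.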

If audioTypeOut was set successfully, add a stream of that type to the sink writer and get the resulting stream index:

hr = sinkWriter->AddStream(audioTypeOut.Get(), &audioStreamIndex);
audioTypeOut.Reset();  // <-- audioTypeOut not needed anymore

Finally, for the sink, the audio input type must be set; it depends on the file you are reading and on the audio reader (IMFSourceReader). More on that shortly, but setting the audio input type on the sink would look similar to the following:

ComPtr<IMFMediaType> audioTypeIn = nullptr;  // <-- declaration from above
// NOTE: audioReader is an IMFSourceReader used to read the audio file
hr = audioReader->GetCurrentMediaType((DWORD)MF_SOURCE_READER_FIRST_AUDIO_STREAM, audioTypeIn.GetAddressOf());
hr = sinkWriter->SetInputMediaType(audioStreamIndex, audioTypeIn.Get(), nullptr);  // <-- same index returned by AddStream
audioTypeIn.Reset();

There are many examples available that show how to create the audioReader (IMFSourceReader) and read samples from the file, but this one is simple and straightforward. The code is here.

Finally, you will find that writing the audio is really easy, since the sink can directly take the samples (IMFSample) that you read from the file. You manage the writes yourself; one solution is to interleave the video and audio writes. The duration of each audio sample is already set, but you will need to rebase its timestamp. Ensure you use the correct stream index when writing to the sink.

Reading samples using an async callback:

// if you are using an async callback, the function would look similar to the following:
HRESULT OnReadAudioSample(HRESULT status, DWORD streamIndex, DWORD streamFlags, LONGLONG timestamp, IMFSample *sample)
{
    HRESULT hr = status;
    // .. other code
    if (SUCCEEDED(hr) && sample != nullptr)  // sample can be null, e.g. at end of stream
    {
        hr = sample->SetSampleTime(timestamp - _baseRecordTime);
        hr = sinkWriter->WriteSample(audioStreamIndex, sample);
    }
    // .. other code

    // trigger the next async read, unless the stream has ended...
    if (!(streamFlags & MF_SOURCE_READERF_ENDOFSTREAM))
        hr = audioReader->ReadSample((DWORD)MF_SOURCE_READER_FIRST_AUDIO_STREAM, 0, nullptr, nullptr, nullptr, nullptr);
    return hr;
}
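
The snippet above subtracts _baseRecordTime without showing where it comes from. A minimal sketch, assuming the base is simply the timestamp of the first audio sample you receive (so the audio starts at zero in the output file): TimestampRebaser is a hypothetical helper, not a Media Foundation type, and std::int64_t stands in for LONGLONG. Media Foundation timestamps are in 100-nanosecond units.

```cpp
#include <cstdint>

// Hypothetical helper: remembers the first timestamp it sees and returns every
// timestamp relative to it. Values are in 100-nanosecond units.
class TimestampRebaser {
public:
    std::int64_t Rebase(std::int64_t timestamp)
    {
        if (!hasBase_) {
            base_ = timestamp;  // the first sample defines time zero
            hasBase_ = true;
        }
        return timestamp - base_;
    }

private:
    bool hasBase_ = false;
    std::int64_t base_ = 0;
};
```

If the video frames start at some other time in your project, you would use that start time as the base instead, so that both streams share the same origin.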

Reading samples synchronously:

// otherwise, read synchronously; sample can come back null, e.g. at end of stream
hr = audioReader->ReadSample((DWORD)MF_SOURCE_READER_FIRST_AUDIO_STREAM, 0, nullptr, &dwFlags, &timestamp, &sample);
if (SUCCEEDED(hr) && sample != nullptr)
{
    hr = sample->SetSampleTime(timestamp - _baseRecordTime);
    hr = sinkWriter->WriteSample(audioStreamIndex, sample);
}
hr = WriteFrame(target.get(), rtStart, rtDuration);  // <-- write video frame as before
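
Writing exactly one audio sample per video frame only stays in step if both have the same duration, which is usually not the case. As a rough sketch of the interleaving idea, you can always write whichever stream has the earlier next timestamp. InterleaveOrder and the Stream enum are hypothetical names, and the timestamps are plain integers in 100-nanosecond units rather than real samples.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Stream { Video, Audio };

// Given the (rebased) start times of the video frames and audio samples, both
// in ascending order, returns the order in which to write them: the stream
// with the earlier next timestamp goes first, video wins ties.
std::vector<Stream> InterleaveOrder(const std::vector<std::int64_t>& videoTimes,
                                    const std::vector<std::int64_t>& audioTimes)
{
    std::vector<Stream> order;
    order.reserve(videoTimes.size() + audioTimes.size());
    std::size_t v = 0;
    std::size_t a = 0;
    while (v < videoTimes.size() || a < audioTimes.size()) {
        // Take video when audio is exhausted, or when the next video frame
        // does not start later than the next audio sample.
        const bool takeVideo =
            a == audioTimes.size() ||
            (v < videoTimes.size() && videoTimes[v] <= audioTimes[a]);
        if (takeVideo) {
            order.push_back(Stream::Video);
            ++v;
        } else {
            order.push_back(Stream::Audio);
            ++a;
        }
    }
    return order;
}
```

In the real loop you would not build a list up front; you would compare the next video time with the timestamp of the audio sample you just read and call WriteFrame or WriteSample accordingly.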

Sounds like a fun little project. Good luck, have fun, and hope this helps!

Source: https://stackoverflow.com/questions/27352989/audio-sample-adding-to-a-video-using-sink-writer-in-windows-media-foundation

Tags

audio

windows-runtime

c++-cx

ms-media-foundation