Project and build structure for a Microsoft DirectShow based virtual webcam application on Windows 10


Question


I am trying to create the simplest virtual webcam application, one that can display an image file from my local filesystem.

After initial research on Stack Overflow and a look at the OBS Studio source code, I have some idea of how I can achieve this:

  1. I would need to use Microsoft DirectShow.

  2. I would need to develop a source filter that would work as a capture filter, based on IBaseFilter.

  3. I would need to develop another source filter that would work as the output filter, i.e. the virtual webcam filter. I would need to compile this filter as a .dll file and register it using regsvr32.exe,
    as described at https://docs.microsoft.com/en-us/windows/win32/directshow/building-directshow-filters

  4. I would need to create a Filter Graph and a Capture Graph Builder using CoCreateInstance, like:

    hr = CoCreateInstance(CLSID_FilterGraph, NULL, CLSCTX_INPROC_SERVER, IID_IFilterGraph, (void **)&graph);

    hr = CoCreateInstance(CLSID_CaptureGraphBuilder2, NULL, CLSCTX_INPROC_SERVER, IID_ICaptureGraphBuilder2, (void **)&builder);

  5. Then I would need to add these filters to the Filter Graph.

  6. Then I would set the Filter Graph on the Capture Graph Builder, like hr = builder->SetFiltergraph(graph); (see the sketch below).
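
To make this concrete, here is a minimal sketch of how I imagine steps 4 to 6 fitting together (error handling shortened; pMyVirtualCam is a placeholder for the filter from steps 2 and 3; CoInitializeEx is assumed to have been called already):

#include <dshow.h>
#pragma comment(lib, "strmiids.lib")

HRESULT BuildGraph(IBaseFilter *pMyVirtualCam)
{
    IGraphBuilder *graph = NULL;        // IGraphBuilder extends IFilterGraph and
    ICaptureGraphBuilder2 *builder = NULL; // is what SetFiltergraph expects

    // Step 4: create the filter graph and the capture graph builder
    HRESULT hr = CoCreateInstance(CLSID_FilterGraph, NULL, CLSCTX_INPROC_SERVER,
                                  IID_IGraphBuilder, (void **)&graph);
    if (SUCCEEDED(hr))
        hr = CoCreateInstance(CLSID_CaptureGraphBuilder2, NULL, CLSCTX_INPROC_SERVER,
                              IID_ICaptureGraphBuilder2, (void **)&builder);

    // Step 6: attach the filter graph to the capture graph builder
    if (SUCCEEDED(hr))
        hr = builder->SetFiltergraph(graph);

    // Step 5: add my filter to the graph
    if (SUCCEEDED(hr))
        hr = graph->AddFilter(pMyVirtualCam, L"My Virtual Cam");

    // ... connect pins, run the graph, release the interfaces ...
    return hr;
}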

Here is my confusion:
After these steps, I am not sure whether I have to wrap the filter graph and capture graph builder in one application that has a main method and compiles to an .exe file, or whether I need to compile them as another .dll file.

Or how should I wrap these steps to create the final application?


Answer 1:


I want to create the simplest virtual webcam application which can output any image or video to a virtual camera. That virtual camera should be visible as a video device in online meetings like Google Meet or Zoom.

There is no unified API for virtual web cameras in Windows, and what you are trying to achieve is, generally speaking, possible but far more complicated than a question of project setup.

The task can be decomposed into three parts, and you will be able to find past Stack Overflow questions that elaborate on all three (some references are given below).

First, you need to resolve the problem of integrating a virtual camera into third-party software. Per the statement I started from, the OS offers no generic extensibility point for a virtual camera interface that would let third-party applications "see" a new camera device.

A popular way to inject a fake camera device into applications is a virtual DirectShow video source (respectively, Vivek's VCam source code).
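
For a sense of what such a filter involves: the essence is that its DLL registers the filter into the video input device category, which is what makes DirectShow-based applications list it as a camera. A rough sketch of that registration step (the CLSID and name are placeholders; COM class registration and pin registration are omitted):

#include <dshow.h>
#pragma comment(lib, "strmiids.lib")

// Placeholder CLSID for your filter; generate your own with uuidgen.
static const CLSID CLSID_MyVirtualCam =
    { 0x12345678, 0x1234, 0x1234, { 0x12, 0x34, 0x12, 0x34, 0x56, 0x78, 0x9a, 0xbc } };

STDAPI DllRegisterServer()
{
    IFilterMapper2 *pMapper = NULL;
    HRESULT hr = CoCreateInstance(CLSID_FilterMapper2, NULL, CLSCTX_INPROC_SERVER,
                                  IID_IFilterMapper2, (void **)&pMapper);
    if (FAILED(hr)) return hr;

    REGFILTER2 rf2 = {};
    rf2.dwVersion = 1;
    rf2.dwMerit = MERIT_DO_NOT_USE;   // found by category enumeration, not by merit
    rf2.cPins = 0;                    // pin registration omitted in this sketch

    // Registering under CLSID_VideoInputDeviceCategory is what makes
    // DirectShow-based applications enumerate the filter as a "camera".
    hr = pMapper->RegisterFilter(CLSID_MyVirtualCam, L"My Virtual Cam", NULL,
                                 &CLSID_VideoInputDeviceCategory, NULL, &rf2);
    pMapper->Release();
    return hr;
}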

The diagram from Registering a network video stream as a virtual camera describes the APIs applications use to work with cameras and illustrates the limitations of virtual DirectShow cameras, specifically why they are not visible to every video-enabled application in Windows.

See also the questions Virtual Driver Cam not recognized by browser and DirectShow filter is not shown as input capture device.

All in all, to develop a virtual webcam that works in any and every application in Windows, you would need to develop a driver, something few are ready to deal with.

The newer Media Foundation API offers nothing to help with virtual webcam functionality.

Second, you need to define a method for injecting video frames into whatever virtual camera you develop. There is no requirement to use DirectShow or Media Foundation, because at the end of the day all you need is to submit video frames to the back end of your virtual camera implementation, and you are free to use any convenient method.

Using DirectShow for this task makes sense overall, but you don't have to. If you are not familiar with the API and you are starting with the basics of creating a filter graph, it is quite likely easier to go with a non-DirectShow solution. If you need to mix a real webcam image into your feed, you can capture it with Media Foundation in particular. If you plan to use GPU services of some sort, Media Foundation would again be the better API. DirectShow still remains a good option as an API to build your pipeline on.

Third, there is often the question of interprocess communication connecting the virtual camera implementation and the source of the video. In some cases it is not necessary, but more often it is simply overlooked.

A virtual DirectShow camera (or a virtual Media Foundation camera if you, for example, go the detouring route) runs in the context of the camera-consuming process, and cameras in general might be accessed from multiple applications, including simultaneously. Quite often you expect to produce the video from another [single] application, possibly one of mismatched bitness/architecture, so you have to take care of the challenge of passing data between the processes. If you attempt to develop a driver for the virtual camera, you will face the same task too.
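
To illustrate, one common low-tech approach is a named file mapping that the producer application writes frames into and the in-process camera filter reads from. A rough sketch follows; the structure layout, the mapping name and the fixed 640x480 RGB24 format are just for illustration:

#include <windows.h>
#include <cstdint>

// Hypothetical layout of the shared frame buffer.
struct SharedFrame
{
    LONG     sequence;              // incremented by the producer after each write
    uint32_t width, height;         // fixed 640x480 RGB24 in this sketch
    uint8_t  data[640 * 480 * 3];
};

// Producer side (the application that generates the video).
SharedFrame* OpenSharedFrame()
{
    HANDLE hMapping = CreateFileMappingW(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE,
                                         0, sizeof(SharedFrame),
                                         L"Local\\MyVirtualCamFrame");
    if (hMapping == NULL)
        return NULL;
    return (SharedFrame*)MapViewOfFile(hMapping, FILE_MAP_ALL_ACCESS,
                                       0, 0, sizeof(SharedFrame));
}

// The consumer side (the virtual camera filter running inside Zoom, a browser,
// etc.) opens the same name with OpenFileMappingW and copies the latest frame
// whenever its FillBuffer (or equivalent) is called.

A named section like this also works between 32-bit and 64-bit processes, which covers the bitness mismatch mentioned above.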

I mentioned aspects of this in the MSDN question How to implement a "source filter" for splitting camera video based on Vivek's vcam?, and then in Read USB camera's input edit and send the output to a virtual camera on Windows and How to create Directshow filter?.

All in all, it is not a question of project setup. Instead, it is a set of quite sophisticated problems to solve (which are doable, though, and we see examples of this).




Answer 2:


DirectShow is outdated; you should be using Microsoft Media Foundation instead. It is well documented and it works well. The following code captures from a webcam:

// pConfig, ppDevices, count, pSource, pReader, hwnd and rgb are assumed to be
// members of the Webcam class.
void Webcam::StartRecording()
{
    HRESULT hr = MFStartup(MF_VERSION);

    hr = MFCreateAttributes(&pConfig, 1);
    if (FAILED(hr)){
        std::cout << "Failed to create attribute store";
    }

    hr = pConfig->SetGUID(MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE, MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_GUID);
    if (FAILED(hr)){
        std::cout << "Failed to request capture devices";
    }

    hr = MFEnumDeviceSources(pConfig, &ppDevices, &count);
    if (FAILED(hr)){
        std::cout << "Failed to enumerate capture devices";
    }

    // Note: this assumes at least one capture device was found (count > 0).
    hr = ppDevices[0]->ActivateObject(IID_PPV_ARGS(&pSource));
    if (FAILED(hr)){
        std::cout << "Failed to connect camera to source";
    }

    hr = MFCreateSourceReaderFromMediaSource(pSource, pConfig, &pReader);
    if (FAILED(hr)){
        std::cout << "Failed to create source reader";
    }

    IMFMediaType* pType = NULL;
    DWORD dwMediaTypeIndex = 0;
    DWORD dwStreamIndex = 0;
    hr = pReader->GetNativeMediaType(dwStreamIndex, dwMediaTypeIndex, &pType);
    LPVOID representation;
    pType->GetRepresentation(AM_MEDIA_TYPE_REPRESENTATION, &representation);
    GUID subType = ((AM_MEDIA_TYPE*)representation)->subtype;
    BYTE* pbFormat = ((AM_MEDIA_TYPE*)representation)->pbFormat;
    GUID formatType = ((AM_MEDIA_TYPE*)representation)->formattype;
    if (subType == MEDIASUBTYPE_YUY2) { std::cout << 1; }
    RECT rect = {};  // only filled in below when the format is FORMAT_VideoInfo2
    // Debug output: report which format block the camera returned.
    if (formatType == FORMAT_DvInfo) { std::cout << 1; }
    if (formatType == FORMAT_MPEG2Video) { std::cout << 2; }
    if (formatType == FORMAT_MPEGStreams) { std::cout << 3; }
    if (formatType == FORMAT_MPEGVideo) { std::cout << 4; }
    if (formatType == FORMAT_None) { std::cout << 5; }
    if (formatType == FORMAT_VideoInfo) { std::cout << 6; }
    if (formatType == FORMAT_VideoInfo2){
        rect = ((VIDEOINFOHEADER2*)pbFormat)->rcSource;
    }
    if (formatType == FORMAT_WaveFormatEx) { std::cout << 8; }
    if (formatType == GUID_NULL) { std::cout << 9; }

    // Note: these casts are only valid when formatType is FORMAT_VideoInfo2 (see above).
    int videoWidth = ((VIDEOINFOHEADER2*)pbFormat)->bmiHeader.biWidth;
    int videoHeight = ((VIDEOINFOHEADER2*)pbFormat)->bmiHeader.biHeight;
    pType->FreeRepresentation(AM_MEDIA_TYPE_REPRESENTATION, representation);
    pType->Release();

    IsRecording = true;
    DWORD streamIndex, flags;
    LONGLONG llTimeStamp;
    IMFSample* pSample = NULL;

    while (IsRecording){
        hr = pReader->ReadSample(MF_SOURCE_READER_FIRST_VIDEO_STREAM, 0, &streamIndex, &flags, &llTimeStamp, &pSample);
        if (FAILED(hr)){
            std::cout << "Failed to get image from camera";
        }
        if (pSample != NULL){
            IMFMediaBuffer* pBuffer;
            pSample->ConvertToContiguousBuffer(&pBuffer);
            unsigned char* data;
            DWORD length;
            pBuffer->GetCurrentLength(&length);
            hr = pBuffer->Lock(&data, NULL, &length);  // reuse the outer hr instead of shadowing it
            if (FAILED(hr)){
                std::cout << "Failed to get data from buffer";
            }

            // The raw frame in 'data' (e.g. YUY2) must be converted into the member
            // buffer 'rgb' (24-bit BGR) before this point; SetDIBits reads 'rgb'.
            HDC hdc = GetDC(hwnd);
            HBITMAP bitmap = CreateCompatibleBitmap(hdc, 640, 480);
            BITMAPINFOHEADER header = { sizeof(BITMAPINFOHEADER), 640, 480, 1, 24, BI_RGB, 0, 0, 0, 0, 0 };
            BITMAPINFO info = { header };
            SetDIBits(hdc, bitmap, 0, 480, &rgb[0], &info, DIB_RGB_COLORS);

            HIMAGELIST imageList = ImageList_Create(640, 480, ILC_COLOR24, 1, 500);
            if (bitmap != NULL) {
                ImageList_Add(imageList, bitmap, NULL);
                BOOL drawn = ImageList_Draw(imageList, 0, hdc, 0, 0, ILD_IMAGE);
                
                DeleteObject(bitmap);
            }
            else {
                std::cout << "Failed to create bitmap" << std::endl;
            }
            ImageList_Destroy(imageList);
            ReleaseDC(hwnd, hdc);  // HDCs obtained from GetDC are released, not deleted
            pBuffer->Unlock();
            pBuffer->Release();
            pSample->Release();
        }
    }
    pSource->Stop();
    pSource->Shutdown();  // release the camera properly
    pSource->Release();
    pSource = NULL;
    pReader->Release();   // (pConfig and ppDevices should be released similarly)
    MFShutdown();
}

You may need to convert the data buffer to an RGB format before you send it to an image list. Most modern cameras output RGB, but the older webcam on my laptop outputs YUY2. If you need to convert from YUY2 to RGB, feel free to ask. There is probably better code to do the same, but this code works well and leaves you in control of the image. You could probably show the image in a static control instead; an image list lets you add several images and resize them at will.
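
In case it helps, here is a sketch of the YUY2 to RGB conversion mentioned above (BT.601 integer math, writing bytes in the B, G, R order that SetDIBits expects; note that with a positive biHeight, SetDIBits also expects bottom-up row order, so you may need to flip rows):

#include <windows.h>

static inline BYTE Clamp(int v) { return (BYTE)(v < 0 ? 0 : v > 255 ? 255 : v); }

// Convert one YUY2 frame (4 bytes per 2 pixels: Y0 U Y1 V) into a
// 24-bit BGR buffer. Assumes width * height is even.
void Yuy2ToRgb24(const BYTE* src, BYTE* dst, int width, int height)
{
    for (int i = 0; i < width * height / 2; ++i, src += 4, dst += 6)
    {
        int y0 = src[0] - 16, u = src[1] - 128;
        int y1 = src[2] - 16, v = src[3] - 128;

        int c0 = 298 * y0, c1 = 298 * y1;
        int rd = 409 * v + 128;
        int gd = -100 * u - 208 * v + 128;
        int bd = 516 * u + 128;

        dst[0] = Clamp((c0 + bd) >> 8);   // B
        dst[1] = Clamp((c0 + gd) >> 8);   // G
        dst[2] = Clamp((c0 + rd) >> 8);   // R
        dst[3] = Clamp((c1 + bd) >> 8);
        dst[4] = Clamp((c1 + gd) >> 8);
        dst[5] = Clamp((c1 + rd) >> 8);
    }
}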



Source: https://stackoverflow.com/questions/65640023/project-and-build-structure-for-microsoft-directshow-based-virtual-webcam-applic
