How to capture frames from a webcam with SharpDX

I'm trying to implement a webcam capture app that takes still frames, displays them on the screen, and saves them to disk.

Since I'm already using SharpDX to capture the screen, I thought it would be nice to use that library here too. I wasn't sure whether SharpDX had any video capture capabilities, so I started searching and found parts of what looks like a webcam capture prototype:

var attributes = new MediaAttributes(1);
attributes.Set<Guid>(CaptureDeviceAttributeKeys.SourceType, CaptureDeviceAttributeKeys.SourceTypeVideoCapture.Guid);
var activates = MediaFactory.EnumDeviceSources(attributes);

var dic = new Dictionary<string, Activate>();
foreach (var activate in activates)
{
    var uid = activate.Get(CaptureDeviceAttributeKeys.SourceTypeVidcapSymbolicLink);
    dic.Add(uid, activate);
}

var camera = dic.First().Value;

It returns a camera with a strange uid. I'm not sure if it's correct.

What am I supposed to do after this?


I got this code kind of working. I still don't understand why the output is strange.

var attributes = new MediaAttributes(1);
attributes.Set(CaptureDeviceAttributeKeys.SourceType.Guid, CaptureDeviceAttributeKeys.SourceTypeVideoCapture.Guid);

var mediaSource = MediaFactory.EnumDeviceSources(attributes)[0].ActivateObject<MediaSource>();
mediaSource.CreatePresentationDescriptor(out var presentationDescriptor);

var reader = new SourceReader(mediaSource);
var mediaTypeIndex = 0;

int width, height;

using (var mt = reader.GetNativeMediaType(0, mediaTypeIndex))
{
    UnpackLong(mt.Get(MediaTypeAttributeKeys.FrameSize), out width, out height);
    UnpackLong(mt.Get(MediaTypeAttributeKeys.FrameRate), out var frameRateNumerator, out var frameRateDenominator);
    UnpackLong(mt.Get(MediaTypeAttributeKeys.PixelAspectRatio), out var aspectRatioNumerator, out var aspectRatioDenominator);
}

var sample = reader.ReadSample(SourceReaderIndex.AnyStream, SourceReaderControlFlags.None, out var readStreamIndex, out var readFlags, out var timestamp);

if (sample == null)
    sample = reader.ReadSample(SourceReaderIndex.AnyStream, SourceReaderControlFlags.None, out readStreamIndex, out readFlags, out timestamp);

var sourceBuffer = sample.GetBufferByIndex(0); // sample.ConvertToContiguousBuffer();
var sourcePointer = sourceBuffer.Lock(out var maxLength, out var currentLength);

var data = new byte[sample.TotalLength];
Marshal.Copy(sourcePointer, data, 0, sample.TotalLength);

var newData = new byte[width * 4 * height];

var partWidth = width / 4;
var partHeight = height / 3;

for (var i = 0; i < sample.TotalLength; i += 4)
{
    //X8R8B8G8 -> BGRA = 4
    newData[i] = data[i + 3];
    newData[i + 1] = data[i + 2];
    newData[i + 2] = data[i + 1];
    newData[i + 3] = 255; //data[i];
}

//var source = BitmapSource.Create(width, height, 96, 96, PixelFormats.Bgra32, null, data, ((width * 24 + 31) / 32) * 4);
var source = BitmapSource.Create(width, height, 96, 96, PixelFormats.Bgra32, null, newData, width * 4);
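
UnpackLong is a small helper that is not shown above. A minimal sketch of it, assuming the usual Media Foundation packing for attributes such as the frame size and frame rate, where the first value (width or numerator) sits in the upper 32 bits and the second (height or denominator) in the lower 32 bits. The class name PackedAttribute is only a placeholder for this example:

```csharp
using System;

public static class PackedAttribute
{
    // Splits a packed 64-bit attribute value into its two 32-bit halves.
    // high receives the upper 32 bits (e.g. width or numerator),
    // low receives the lower 32 bits (e.g. height or denominator).
    public static void UnpackLong(long packed, out int high, out int low)
    {
        high = (int)(packed >> 32);
        low = unchecked((int)(packed & 0xFFFFFFFFL));
    }
}
```

For a 1280x720 media type, the packed frame size would be (1280L << 32) | 720L, which unpacks to high = 1280 and low = 720.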


The output image is this (I was showing a color spectrum to my webcam):

The image repeats four times side by side. Each part has a grayscale image and, below it, a color version with half the height. About two thirds of the image is transparent.


Your output is NV12. Here's some sample code to convert NV12 to RGB:

    // Converts an NV12 frame to 24-bit RGB (3 bytes per pixel).
    // Note: lDestStride and lSrcStride are not used; both buffers are assumed to be tightly packed.
    private static unsafe void TransformImage_NV12(IntPtr pDest, int lDestStride, IntPtr pSrc, int lSrcStride, int dwWidthInPixels, int dwHeightInPixels)
    {
        uint imageWidth = (uint)dwWidthInPixels;
        uint widthHalf = imageWidth / 2;
        uint imageHeight = (uint)dwHeightInPixels;

        byte* nv12Data = (byte*)pSrc;
        byte* rgbData = (byte*)pDest;

        uint dataSize = imageWidth * imageHeight * 3;

        for (uint y = 0; y < imageHeight; y++)
        {
            for (uint x = 0; x < imageWidth; x++)
            {
                uint xEven = x & 0xFFFFFFFE;
                uint yEven = y & 0xFFFFFFFE;
                uint yIndex = y * imageWidth + x;
                // The interleaved chroma plane follows the Y plane; each 2x2 block of pixels shares one Cb/Cr pair.
                uint cIndex = imageWidth * imageHeight + yEven * widthHalf + xEven;

                byte yy = nv12Data[yIndex];
                // NV12 stores Cb (U) first, then Cr (V).
                byte cb = nv12Data[cIndex + 0];
                byte cr = nv12Data[cIndex + 1];

                // Pixels are written in reverse order, which rotates the image by 180 degrees.
                uint outputIndex = (dataSize - (y * imageWidth + x) * 3) - 3;

                // Bytes are written in R, G, B order; swap the first and last lines for a BGR target.
                rgbData[outputIndex + 0] = (byte)Math.Min(Math.Max((yy + 1.402 * (cr - 128)), 0), 255);
                rgbData[outputIndex + 1] = (byte)Math.Min(Math.Max((yy - 0.344 * (cb - 128) - 0.714 * (cr - 128)), 0), 255);
                rgbData[outputIndex + 2] = (byte)Math.Min(Math.Max((yy + 1.772 * (cb - 128)), 0), 255);
            }
        }
    }
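
NV12 stores a full-resolution Y plane of width x height bytes, followed by a half-height plane of interleaved Cb/Cr pairs, so a frame takes width x height x 3/2 bytes. Copied byte for byte into a 4-bytes-per-pixel bitmap, that fills only about three eighths of it: every bitmap row holds four rows of the Y plane, and the chroma rows land below them. That matches the picture described in the question. If you would rather stay in managed code and feed a Bgra32 bitmap, here is a minimal sketch of the same conversion. It assumes a tightly packed buffer (stride equal to width), even dimensions, and the same full-range coefficients as above. The names Nv12Converter and ToBgra32 are made up for this example:

```csharp
using System;

public static class Nv12Converter
{
    // Converts a tightly packed NV12 frame to top-down 32-bit BGRA.
    // Assumption: stride == width; real camera buffers may be padded per row.
    public static byte[] ToBgra32(byte[] nv12, int width, int height)
    {
        if (width % 2 != 0 || height % 2 != 0)
            throw new ArgumentException("NV12 requires even dimensions.");

        int lumaSize = width * height;
        if (nv12.Length < lumaSize * 3 / 2)
            throw new ArgumentException("Buffer is too small for an NV12 frame.", nameof(nv12));

        var bgra = new byte[lumaSize * 4];

        for (int y = 0; y < height; y++)
        {
            // One chroma row serves two luma rows.
            int chromaRow = lumaSize + (y / 2) * width;

            for (int x = 0; x < width; x++)
            {
                int luma = nv12[y * width + x];

                // One Cb/Cr pair serves two horizontally adjacent pixels; Cb comes first.
                int chroma = chromaRow + (x & ~1);
                int cb = nv12[chroma] - 128;
                int cr = nv12[chroma + 1] - 128;

                int o = (y * width + x) * 4;
                bgra[o + 0] = Clamp(luma + 1.772 * cb);              // B
                bgra[o + 1] = Clamp(luma - 0.344 * cb - 0.714 * cr); // G
                bgra[o + 2] = Clamp(luma + 1.402 * cr);              // R
                bgra[o + 3] = 255;                                   // A (opaque)
            }
        }

        return bgra;
    }

    // Clamps to the 0..255 range and truncates to a byte.
    private static byte Clamp(double value)
    {
        return (byte)Math.Min(Math.Max(value, 0.0), 255.0);
    }
}
```

The returned array can be passed to BitmapSource.Create(width, height, 96, 96, PixelFormats.Bgra32, null, bgra, width * 4), in place of newData in the question.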

