Question
A 4K 60 Hz RGB video stream can require up to 2 GB/s of bandwidth. PCs based on DDR3 have 25.6 GB/s of theoretical RAM bandwidth, but real-world performance can be far lower, around 10 GB/s. If the video is first captured to system RAM and then copied to VRAM for display, it uses 4 GB/s of system RAM bandwidth (one write plus one read of every frame), which is too much.
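As a rough check of these figures, here is a minimal sketch of the arithmetic. It assumes a 3840x2160 active picture, counts active pixels only (no blanking), and uses decimal gigabytes. The function name and the choice of 3 or 4 bytes per pixel are illustrative assumptions, not taken from any particular capture device.

```python
def video_bandwidth_bytes_per_s(width, height, bytes_per_pixel, fps):
    """Raw (uncompressed, active-pixel) video bandwidth in bytes per second."""
    return width * height * bytes_per_pixel * fps

GB = 1e9  # decimal gigabyte, matching the GB/s figures in the question

# Assumption: 4K means 3840x2160 at 60 frames per second.
rgb24 = video_bandwidth_bytes_per_s(3840, 2160, 3, 60)  # packed 24-bit RGB
rgb32 = video_bandwidth_bytes_per_s(3840, 2160, 4, 60)  # 32-bit RGB (padded)

# Two-copy path: the capture device writes each frame into system RAM,
# then the upload to VRAM reads the same frame back out of system RAM.
two_copy_ram_traffic = 2 * rgb32

print(f"24-bit RGB stream: {rgb24 / GB:.2f} GB/s")
print(f"32-bit RGB stream: {rgb32 / GB:.2f} GB/s")
print(f"System RAM traffic, two-copy path: {two_copy_ram_traffic / GB:.2f} GB/s")
print(f"Share of a ~10 GB/s real-world budget: {two_copy_ram_traffic / (10 * GB):.0%}")
```

Under these assumptions the 2 GB/s figure corresponds to 32-bit pixels; packed 24-bit RGB would need about 1.5 GB/s.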
To improve efficiency, a one-copy mode is needed. One option is to copy the captured data directly to VRAM (e.g. on a discrete GPU with 100+ GB/s of VRAM bandwidth), but the documentation at https://docs.microsoft.com/en-us/windows-hardware/drivers/stream/capturing-video-to-vram-using-avstream says:
To capture to VRAM, a device must include capture and display functionality on the same video card.
However, almost no video capture device (e.g. USB or PCIe) sits on the same card as the GPU.
Can this be implemented, e.g. with a PCIe 4K video capture card?
There is NVIDIA GPUDirect, but it is aimed at professional platforms and is limited to NVIDIA GPUs.
Here is a paper about transferring data between an FPGA and a GPU: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/2012062520UCAA2012_Bittner_Ruf_Final.pdf, but (1) it is still based on the NVIDIA CUDA API, not on the Windows DDK, and (2) it mentions that the GPU should act as the bus master and the FPGA as the slave. That seems to work only when the GPU reads data from the FPGA's frame buffer RAM, whereas the best video capture card architecture is RAM-less, i.e. the capture card acts as the PCIe master and, after receiving some data such as one line of RGB pixels, sends it to the GPU in bursts.
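To give a sense of scale for the line-at-a-time burst idea, here is a minimal sketch of the per-line numbers. It assumes 32-bit pixels, a 3840x2160 active picture at 60 Hz with blanking ignored, and a PCIe Max Payload Size of 256 bytes. The payload size is negotiated per link, so 256 is only an illustrative value, and all constant names are hypothetical.

```python
import math

# Assumptions (not from the question): 32-bit pixels, 256-byte PCIe payloads.
WIDTH, HEIGHT, FPS = 3840, 2160, 60
BYTES_PER_PIXEL = 4
MAX_PAYLOAD_BYTES = 256  # illustrative; the real value is negotiated per link

# Size of one line of pixels that the card would buffer before bursting.
line_bytes = WIDTH * BYTES_PER_PIXEL

# Number of memory-write packets needed to push one line to the GPU.
writes_per_line = math.ceil(line_bytes / MAX_PAYLOAD_BYTES)

# Average time available per line, counting active lines only.
lines_per_second = HEIGHT * FPS
line_period_us = 1e6 / lines_per_second

print(f"Bytes per line: {line_bytes}")
print(f"Write packets per line at {MAX_PAYLOAD_BYTES} B payload: {writes_per_line}")
print(f"Lines per second: {lines_per_second}")
print(f"Time budget per line: {line_period_us:.2f} us")
```

With these assumptions the card only needs on the order of one line of on-board buffering, which is the point of the RAM-less design; real video timing includes blanking intervals, so the exact per-line budget will differ somewhat.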
More references:
http://www.ertl.jp/~shinpei/papers/icpads13.pdf
DMA over PCIe to other device (Linux)
Source: https://stackoverflow.com/questions/58761166/high-efficiency-way-to-capture-rgb-video-and-display-it-in-windows-10