Question
I am looking for an effective way to grab image data off video files. I am currently testing FilgraphManagerClass.GetCurrentImage() from the Interop.QuartzTypeLib library. This does what I need but is painfully slow. I need to process all frames of each video. What better options do I have?
Requirements
- Must be frame accurate. <-- Very important!
- Gives me access to the decoded pixel buffer (array of int or byte[]), ideally RGB24 or RGB32.
- The buffer can be grabbed in realtime or faster. I do not need to display the video, I only need to analyze the pixels.
- Handle mp4 files (h264/aac). I can rewrap or frame serve via AviSynth if needed, but no re-encoding can be involved.
Any suggestions would be welcome.
Some code as requested:
FilgraphManagerClass graphClass = new FilgraphManagerClass();
graphClass.RenderFile(@"C:\tmp\tmp.avs");
int sz = (graphClass.Width * graphClass.Height + 10) * 4;
int[] buffer = new int[sz - 1];
I am then stepping through each frame. I have something like this in the loop:
graphClass.GetCurrentImage(ref sz, out buffer[0]);
//DoStuff(buffer);
graphClass.CurrentPosition += graphClass.AvgTimePerFrame;
Answer 1:
The IBasicVideo::GetCurrentImage method you are using is basically intended for snapshots, and it works with legacy video rendering in legacy modes only. That is, (a) it is NOT time accurate: it can give you duplicate frames or, conversely, drop frames; and (b) it assumes that you display the video.
Instead you want to build a filter graph of the following kind: File Source -> ... -> Sample Grabber Filter -> Null Renderer. Sample Grabber, a standard component, can be provided with a callback so that it calls you with any frame data that comes through it.
Then you remove the clock from the graph by calling IMediaFilter::SetSyncSource(null) on the filter graph, so that it runs as fast as possible (as opposed to realtime). Then you Run the graph and all video frames are supplied to your callback.
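The graph described above could be sketched roughly as follows. This is an unverified sketch rather than a definitive implementation. It assumes the DirectShow.NET library (DirectShowLib namespace), a file path passed as the first command-line argument, and that the first output pin of the source filter carries video; for some MP4 source filters that pin may be audio, in which case a different pin has to be picked. FrameSink, Program and Check are names invented for this example. The clock is removed through IMediaFilter.SetSyncSource(null).

```csharp
using System;
using System.Runtime.InteropServices;
using DirectShowLib; // DirectShow.NET (assumed to be referenced)

// Hypothetical callback object: receives every frame that passes the Sample Grabber.
class FrameSink : ISampleGrabberCB
{
    public int FrameCount;

    // Not used here, because SetCallback below selects BufferCB (method 1).
    public int SampleCB(double sampleTime, IMediaSample sample)
    {
        Marshal.ReleaseComObject(sample);
        return 0;
    }

    // Called on the streaming thread; buffer points at bufferLen bytes of RGB24 data.
    public int BufferCB(double sampleTime, IntPtr buffer, int bufferLen)
    {
        FrameCount++;
        // Analyze pixels here, e.g. Marshal.Copy(buffer, managedArray, 0, bufferLen);
        return 0;
    }
}

static class Program
{
    static void Check(int hr) { DsError.ThrowExceptionForHR(hr); }

    static void Main(string[] args)
    {
        string path = args[0]; // assumption: file path given on the command line

        var graph = (IGraphBuilder)new FilterGraph();
        var grabberFilter = (IBaseFilter)new SampleGrabber();
        var grabber = (ISampleGrabber)grabberFilter;
        var nullRenderer = (IBaseFilter)new NullRenderer();
        var sink = new FrameSink();

        // Ask the grabber to accept only uncompressed RGB24 video.
        var mt = new AMMediaType
        {
            majorType = MediaType.Video,
            subType = MediaSubType.RGB24,
            formatType = FormatType.VideoInfo
        };
        Check(grabber.SetMediaType(mt));
        DsUtils.FreeAMMediaType(mt);

        Check(grabber.SetBufferSamples(false));
        Check(grabber.SetOneShot(false));
        Check(grabber.SetCallback(sink, 1)); // 1 = BufferCB, 0 = SampleCB

        IBaseFilter source;
        Check(graph.AddSourceFilter(path, "Source", out source));
        Check(graph.AddFilter(grabberFilter, "Sample Grabber"));
        Check(graph.AddFilter(nullRenderer, "Null Renderer"));

        // Intelligent Connect inserts whatever splitter/decoder is needed in between.
        Check(graph.Connect(
            DsFindPin.ByDirection(source, PinDirection.Output, 0),
            DsFindPin.ByDirection(grabberFilter, PinDirection.Input, 0)));
        Check(graph.Connect(
            DsFindPin.ByDirection(grabberFilter, PinDirection.Output, 0),
            DsFindPin.ByDirection(nullRenderer, PinDirection.Input, 0)));

        // No reference clock: frames are delivered as fast as they decode.
        Check(((IMediaFilter)graph).SetSyncSource(null));

        Check(((IMediaControl)graph).Run());
        EventCode code;
        Check(((IMediaEvent)graph).WaitForCompletion(-1, out code)); // -1 = wait indefinitely
        Check(((IMediaControl)graph).Stop());

        Console.WriteLine("Frames seen: " + sink.FrameCount);
    }
}
```

The callback runs on a streaming thread, so any heavy analysis done inside BufferCB directly limits throughput. Note also that RGB24 frames delivered by DirectShow are typically stored bottom-up with rows padded to a 4-byte boundary, which matters when indexing pixels.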
To accomplish the task in C# you need to use the DirectShow.NET library. Its Capture\DxSnap sample provides a brief example of how to use Sample Grabber. It does so through BufferCB instead of SampleCB, which works well too. Other samples there also use this approach.
You will find other code snippets very close to this task:
- Seeking keyframes using DirectShowNet - use of Sample Grabber
- BufferCB not being called by SampleGrabber - same task for audio part
- How to access an audio stream using DirectShow.NET C#
Regarding MP4 files, you should take the following into consideration:
- Support for MPEG-4 is limited in Windows, and you might need third party components installed to make the files playable. If GraphEdit can read them, then you can too.
- Windows Media Player is likely to be using a newer API, so a file playing there does not mean DirectShow can read it; check with GraphEdit instead.
- Be sure to target the Win32/x86 platform in your application, to avoid the scenario where your app runs as a 64-bit process while MP4 support exists only in the 32-bit components/libraries that are installed.
Answer 2:
You could also look at creating an allocator-presenter using Windows Media Foundation. This will give you the decoded video frame as a GPU texture and you could also use CUDA or OpenCL to perform the processing required (if possible) which would help your processing speed immensely.
Source: https://stackoverflow.com/questions/11593140/efficiently-grabbing-pixels-from-video