MTCNN_face_detection_alignment lagging in IP camera, Convention behind opencv cv2 videocapture frame matrix

问题

I am just trying to detect and recognize faces from the frame read through CV2 VideoCapture. For detection, using Tensorflow implementation of the face detection / alignment algorithm found at https://github.com/kpzhang93/MTCNN_face_detection_alignment. MTCNN Face detection process has no lag with builtin webcam and external camera connected with USB. However when it comes from IP camera there is a considerable lagging from detection algorithm. The algorithm takes more time to process single frame from ip camera than a frame from built in camera. Parameters like image resolution, image details can have a impact. To understand it further, looking to know what are all the parameters have impact other than resolution and image details.

Noticed frame matrices value differs for builtin webcam and IP camera. It differs with linux vs windows. how the frame matrices values calculated? What are the parameters define a frame matrices value? Wondering how the frame matrix value always 0 for the frame from builtin webcam with windows OS.

Builtin webcam(Windows) resolution 480.0 640.0. Frame matrices printed in python video_capture = cv2.VideoCapture(0) ret, frame = video_capture.read() print(frame).

[[[0 0 0]
  [0 0 0]
  [0 0 0]
  ...
  [0 0 0]
  [0 0 0]
  [0 0 0]]

 [[0 0 0]
  [0 0 0]
  [0 0 0]
  ...
  [0 0 0]
  [0 0 0]
  [0 0 0]]

 [[0 0 0]
  [0 0 0]
  [0 0 0]
  ...
  [0 0 0]
  [0 0 0]
  [0 0 0]]

 ...

 [[0 0 0]
  [0 0 0]
  [0 0 0]
  ...
  [0 0 0]
  [0 0 0]
  [0 0 0]]

 [[0 0 0]
  [0 0 0]
  [0 0 0]
  ...
  [0 0 0]
  [0 0 0]
  [0 0 0]]

 [[0 0 0]
  [0 0 0]
  [0 0 0]
  ...
  [0 0 0]
  [0 0 0]
  [0 0 0]]]

IP camera resolution 1080.0 1920.0. Similar way, printed below the IP camera Frame matrices

[[[ 85  81  64]
  [ 69  65  48]
  [ 61  57  40]
  ...
  [131  85  19]
  [131  85  19]
  [131  85  19]]

 [[ 74  70  53]
  [ 78  74  57]
  [ 70  66  49]
  ...
  [131  85  19]
  [131  85  19]
  [131  85  19]]

 [[ 72  68  51]
  [ 76  72  55]
  [ 73  69  52]
  ...
  [131  85  19]
  [131  85  19]
  [131  85  19]]

 ...

 [[ 74  74  67]
  [ 74  74  67]
  [ 75  75  68]
  ...
  [ 14  14  18]
  [ 21  21  25]
  [ 34  34  38]]

 [[ 74  74  67]
  [ 74  74  67]
  [ 75  75  68]
  ...
  [ 20  20  24]
  [ 27  27  31]
  [ 28  28  32]]

 [[ 74  74  67]
  [ 75  75  68]
  [ 75  75  68]
  ...
  [ 28  28  32]
  [ 28  28  32]
  [ 21  21  25]]]

回答1:

Your web camera might have first / last few lines empty for some technical reason. You may try to print average colors for every line and see, probably it would end up something like this:

np.mean( frame, axis=(1,2) )
[ 0, 0, 34, 42, .... 75, 129, 0, 0 ]

回答2:

consider to do not use all the frames that your recieve from your ip camera like:

_, frame = cap.read()
_, frame = cap.read()
_, frame = cap.read()
processFrame(frame)

or just use a delay to receive not all the frames

来源：https://stackoverflow.com/questions/57812858/mtcnn-face-detection-alignment-lagging-in-ip-camera-convention-behind-opencv-cv

标签

python

OpenCV

data-science

video-capture

face-detection