I implemented speaker detection among a set of people using the mouth aspect ratio (MAR) with OpenCV and dlib. But since the mouth aspect ratio varies with the distance to the camera, I want
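
For context, here is a minimal sketch of one common way the mouth aspect ratio is computed. It assumes the eight inner-lip points are already extracted as `(x, y)` tuples, ordered like landmarks 60 to 67 of dlib's 68-point face model. The function name `mouth_aspect_ratio` and the divisor of 3 are illustrative assumptions; other formulations exist, and this is not necessarily the exact formula used above.

```python
import math

def mouth_aspect_ratio(inner_lip):
    """Return (sum of three vertical lip gaps) / (3 * mouth width).

    inner_lip: sequence of 8 (x, y) points, assumed to be ordered like
    dlib landmarks 60..67 (left corner, top lip, right corner, bottom lip).
    """
    if len(inner_lip) != 8:
        raise ValueError("expected 8 inner-lip points")

    # Vertical distances between paired top-lip and bottom-lip points.
    a = math.dist(inner_lip[1], inner_lip[7])
    b = math.dist(inner_lip[2], inner_lip[6])
    c = math.dist(inner_lip[3], inner_lip[5])

    # Horizontal distance between the two mouth corners.
    d = math.dist(inner_lip[0], inner_lip[4])
    if d == 0:
        return 0.0  # degenerate detection; avoid division by zero

    return (a + b + c) / (3.0 * d)

# Hypothetical landmark coordinates for a slightly open mouth.
points = [(0, 0), (1, -1), (2, -1), (3, -1), (4, 0), (3, 1), (2, 1), (1, 1)]
mar = mouth_aspect_ratio(points)
```

A per-frame value like `mar` is typically compared against a threshold, or tracked over time, to decide whether a mouth is moving.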