I'm working on a project that uses the Kinect and OpenCV to export fingertip coordinates to Flash for use in games and other programs. Currently, our setup works based on
Vector subtraction will give you the distance between any two points reported by the Kinect. You'll have to look up the best way to do vector subtraction in your specific environment, but I hope this helps anyway. In Processing, which I use, there's a PVector class; to subtract, you write PVector difference = PVector.sub(vector1, vector2), where vector1 and vector2 are the vectors representing your two points and difference is the new vector between them. You then need the magnitude of the difference vector. Again, in Processing, this is simply float magnitude = difference.mag(). That magnitude is your desired distance.
Here's a great rundown of both vectors in Processing and vectors in general: https://processing.org/tutorials/pvector/
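If you're not in Processing, the same subtract-then-magnitude idea is easy to write by hand. Here's a minimal sketch in plain Python; the helper names and the two sample points are made up for illustration, and the units are whatever your Kinect library reports (millimetres assumed here).

```python
import math

def subtract(v1, v2):
    """Component-wise difference v1 - v2, the same idea as PVector.sub(v1, v2)."""
    return tuple(a - b for a, b in zip(v1, v2))

def magnitude(v):
    """Euclidean length of a vector, the same idea as PVector's mag()."""
    return math.sqrt(sum(c * c for c in v))

def distance(p1, p2):
    """Distance between two points: magnitude of their difference vector."""
    return magnitude(subtract(p1, p2))

# Hypothetical real-world points (x, y, z) in millimetres; values are made up.
fingertip_a = (100.0, 200.0, 1500.0)
fingertip_b = (400.0, 600.0, 1500.0)

print(distance(fingertip_a, fingertip_b))  # 500.0
```

This only gives a meaningful distance if both points are real-world coordinates in the same units, not raw pixel positions.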
A few things:
A) I know you got the 117 degree FOV from a function in the Kinect sensor, but I still don't believe that's correct. That's a giant FOV. I actually got the same number when I ran the function on my Kinect, but I still don't believe it. While 57 (or 58.5 from some sources) seems low, it's definitely more reasonable. Try putting the Kinect on a flat surface, placing objects just inside the edges of its view, and measuring the FOV that way. It's not precise, but I don't think you'll find it to be over 100 degrees.
B) I saw an article comparing actual distance against the Kinect's reported depth; the relationship is not linear. This wouldn't actually affect your 1.6 meter trig issue, but it's something to keep in mind going forward.
C) I would strongly suggest changing your code to accept the real world points from the Kinect. Better yet, just send over more data if that's possible. You can continue to provide the current data, and just tack the real world coordinate data onto that.
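For the rough measurement suggested in A), the angle falls out of basic trig once you know how far apart the two edge objects are and how far they are from the sensor. A minimal sketch, assuming the Kinect sits centred between the objects and the distance is measured straight out along its viewing axis; the function names and the measured numbers are hypothetical.

```python
import math

def fov_degrees(span, distance):
    """Horizontal FOV from the visible span at a given distance (same units)."""
    half_angle = math.atan((span / 2.0) / distance)
    return math.degrees(2.0 * half_angle)

def span_at(fov_deg, distance):
    """Width of the visible area at a given distance for a given FOV."""
    return 2.0 * distance * math.tan(math.radians(fov_deg) / 2.0)

# Hypothetical tape-measure readings in metres; values are made up.
measured_span = 1.10      # gap between the two objects at the edges of view
measured_distance = 1.00  # straight-line distance from sensor to that gap

print(fov_degrees(measured_span, measured_distance))

# Sanity check on the disputed figure: how wide a 117 degree view would be at 1 m.
print(span_at(117.0, 1.0))
```

With these made-up readings the result lands between 57 and 58.5 degrees, whereas a 117 degree FOV would have to cover a span more than 3 m wide at only 1 m from the sensor.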
The depth stream is correct. You should indeed take the depth value, and from it you can easily locate the point in the real world relative to the Kinect. This is done with simple trigonometry; however, keep in mind that the depth value is the distance from the Kinect "eye" to the point measured, so it is the diagonal of a cuboid.
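To make the trigonometry concrete, here is a rough sketch in Python. It assumes, as described above, that the depth value is the straight-line distance from the sensor to the point, so the result is scaled until its length equals that depth. The function name, the 640x480 resolution, and the 57/43 degree FOV defaults are assumptions for illustration; substitute whatever your sensor and library actually report.

```python
import math

def depth_pixel_to_world(u, v, depth, width=640, height=480,
                         h_fov_deg=57.0, v_fov_deg=43.0):
    """Convert a depth pixel (u, v) plus its depth into (x, y, z) relative to the sensor.

    Assumes `depth` is the straight-line (diagonal) distance to the point,
    and that the resolution and FOV defaults match your device.
    """
    # Direction of the ray through the pixel, scaled so that z = 1.
    x_dir = (u - width / 2.0) / (width / 2.0) * math.tan(math.radians(h_fov_deg) / 2.0)
    y_dir = (height / 2.0 - v) / (height / 2.0) * math.tan(math.radians(v_fov_deg) / 2.0)
    z_dir = 1.0

    # Length of that ray: the diagonal of the cuboid with sides x_dir, y_dir, z_dir.
    ray_length = math.sqrt(x_dir ** 2 + y_dir ** 2 + z_dir ** 2)

    # Stretch the ray so its length equals the measured depth.
    scale = depth / ray_length
    return (x_dir * scale, y_dir * scale, z_dir * scale)

# Hypothetical reading: a point 1600 mm away, seen at the centre of the image.
print(depth_pixel_to_world(320, 240, 1600.0))  # (0.0, 0.0, 1600.0)
```

If your library instead reports depth as the distance to the sensor's plane (the z component only), skip the normalisation and multiply x_dir and y_dir by the depth directly.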
Actually, follow this link: How to get real world coordinates (x, y, z) from a distinct object using a Kinect
There's no use rewriting it; the right answer is there.