Kinect - Map (x, y) pixel coordinates to “real world” coordinates using depth

Submitted by 萝らか妹 on 2020-01-21 02:26:45

Question


I'm working on a project that uses the Kinect and OpenCV to export fingertip coordinates to Flash for use in games and other programs. Currently, our setup works based on color and exports fingertip points to Flash in (x, y, z) format, where x and y are in pixels and z is in millimeters.

But we want to map those (x, y) coordinates to "real world" values, such as millimeters, using that z depth value from within Flash.

As I understand it, the Kinect's 3D depth coordinate system has its X-axis along the camera's horizontal, its Y-axis along the camera's vertical, and its Z-axis pointing directly forward out of the camera's lens. Depth values are then the length of the perpendicular drawn from any given object to the XY-plane. See the picture at the link below (obtained from Microsoft's website).

Microsoft Depth Coordinate System Example

Also, we know that the Kinect's horizontal field of view spans a 117-degree angle.

Using this information, I figured I could project the depth value of any given point onto the x = 0, y = 0 line and draw a horizontal line, parallel to the XY-plane, through that point out to where it intersects the camera's field of view. I end up with a triangle, split in half, whose height is the depth of the object in question. I can then solve for the width of the field of view with a little trigonometry. My equation is:

W = tan(theta / 2) * h * 2

Where:

  • W = field-of-view width
  • theta = horizontal field-of-view angle (117 degrees)
  • h = depth value

(Sorry, I can't post a picture, I would if I could)

Now, solving for a depth value of 1000 mm (1 meter) gives a width of about 3264 mm.
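For reference, here is that calculation as a tiny Processing sketch (Processing only because it shows up in one of the answers below); the 117-degree angle and 1000 mm depth are simply the values quoted above:

    // Width of the field of view at a given depth,
    // using the formula above: W = tan(theta / 2) * h * 2
    float theta = 117.0;  // horizontal field-of-view angle, in degrees
    float h = 1000.0;     // depth value, in millimeters (1 meter)

    float W = tan(radians(theta / 2)) * h * 2;
    println(W);  // prints roughly 3264 (mm)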

However, when actually looking at the camera image produced, I get a different value: I placed a meter stick 1 meter away from the camera and saw that the width of the frame was at most 1.6 meters, not the 3.264 meters estimated from the calculation.

Is there something I'm missing here? Any help would be appreciated.


Answer 1:


The depth stream is correct. You should indeed take the depth value, and from it you can easily locate the point in the real world relative to the Kinect sensor. This is done with simple trigonometry; however, you must keep in mind that the depth value is the distance from the Kinect "eye" to the measured point, so it is the diagonal of a cuboid.

Actually, follow this link: How to get real world coordinates (x, y, z) from a distinct object using a Kinect

There is no point rewriting it; you'll find the right answer there.
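For completeness, here is a minimal Processing sketch of that kind of pixel-plus-depth conversion. The 640x480 depth resolution and the 57-degree horizontal / 43-degree vertical field of view are assumptions (the values commonly quoted for the Kinect v1 depth camera), and it treats the depth value as the perpendicular Z distance, so check both against your own sensor and the linked answer:

    // Convert a depth-image pixel (px, py) plus its depth reading into
    // real-world millimeters, assuming a simple pinhole model.
    // Assumed: 640x480 depth image, ~57 deg horizontal / ~43 deg vertical FOV,
    // and a depth value measured along the Z axis (perpendicular to the sensor).
    // If the depth is really the diagonal distance from the Kinect "eye",
    // as this answer suggests, an extra correction would be needed.
    float hFov = 57.0;
    float vFov = 43.0;
    int depthW = 640;
    int depthH = 480;

    PVector depthPixelToWorld(float px, float py, float depthMM) {
      float x = (px / depthW - 0.5) * 2 * tan(radians(hFov / 2)) * depthMM;
      float y = (0.5 - py / depthH) * 2 * tan(radians(vFov / 2)) * depthMM;
      return new PVector(x, y, depthMM);  // millimeters, Kinect-centered axes
    }

    void setup() {
      // Hypothetical fingertip at pixel (400, 200) with a 1000 mm depth reading
      println(depthPixelToWorld(400, 200, 1000));
    }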




Answer 2:


A few things:

A) I know you got the 117-degree FOV from a function on the Kinect sensor, but I still don't believe that's correct. That's a giant FOV. I actually got the same number when I ran the function on my Kinect, and I still don't believe it. While 57 degrees (or 58.5 from some sources) seems low, it's definitely more reasonable. Try putting the Kinect on a flat surface, placing objects just inside its view, and measuring the FOV that way. It's not precise, but I don't think you'll find it to be over 100 degrees.

B) I saw an article comparing actual distance with the Kinect's reported depth; the relationship is not linear. This wouldn't actually affect your 1.6-meter trig issue, but it's something to keep in mind going forward.

C) I would strongly suggest changing your code to accept the real world points from the Kinect. Better yet, just send over more data if that's possible. You can continue to provide the current data, and just tack the real world coordinate data onto that.




Answer 3:


Vector subtraction should get you the distance between any two points given by the Kinect. You'll have to look up the best way to perform vector subtraction in your specific environment, but I hope this helps anyway. In Processing, which I use, there is a PVector class; to subtract, you simply write PVector difference = PVector.sub(vector1, vector2), where vector1 and vector2 are the vectors representing your two points, and difference is the new vector between them. You then need the magnitude of that difference vector, which in Processing is simply magnitude = difference.mag(). That magnitude is your desired distance, as shown in the sketch below.
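For example, in Processing the whole computation is just a few lines (the two points here are made-up values in millimeters):

    // Distance between two hypothetical points reported by the Kinect (in mm)
    PVector point1 = new PVector(120, -45, 980);
    PVector point2 = new PVector(65, -30, 1010);

    PVector difference = PVector.sub(point1, point2);  // vector from point2 to point1
    float distance = difference.mag();                 // straight-line distance in mm
    println(distance);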

Here's a great rundown of both vectors in Processing and vectors in general: https://processing.org/tutorials/pvector/



Source: https://stackoverflow.com/questions/11784888/kinect-map-x-y-pixel-coordinates-to-real-world-coordinates-using-depth
