Let\'s say I have a data structure like the following:
Camera {
double x, y, z
/** ideally the camera angle is positioned to aim at the 0,0,0 point */
Following the wikipedia, first calculate "d":
http://upload.wikimedia.org/wikipedia/en/math/6/0/b/60b64ec331ba2493a2b93e8829e864b6.png
In order to do this, build up those matrices in your code. The mappings from your examples to their variables:
θ = Camera.angle*
a = SomePointIn3DSpace
c = Camera.x | y | z
Or, just do the equations separately without using matrices, your choice:
http://upload.wikimedia.org/wikipedia/en/math/1/c/8/1c89722619b756d05adb4ea38ee6f62b.png
Now we calculate "b", a 2D point:
http://upload.wikimedia.org/wikipedia/en/math/2/5/6/256a0e12b8e6cc7cd71fa9495c0c3668.png
In this case ex and ey are the viewer's position, I believe in most graphics systems half the screen size (0.5) is used to make (0, 0) the center of the screen by default, but you could use any value (play around). ez is where the field of view comes into play. That's the one thing you were missing. Choose a fov angle and calculate ez as:
ez = 1 / tan(fov / 2)
Finally, to get bx and by to actual pixels, you have to scale by a factor related to the screen size. For example, if b maps from (0, 0) to (1, 1) you could just scale x by 1920 and y by 1080 for a 1920 x 1080 display. That way any screen size will show the same thing. There are of course many other factors involved in an actual 3D graphics system but this is the basic version.