In a system level programming language like C, C++ or D, what is the best type/encoding for storing latitude and longitude?
The options I see are:
Having come across this question while searching for an answer myself, here is another possible scheme, based on some precedent.
The Network Working Group's RFC 3825 proposed a coordinate-based geographic location option for DHCP (i.e., the system that hands out IP addresses on a network). See https://tools.ietf.org/rfc/rfc3825.txt
In this scheme, latitude and longitude are encoded in degrees as fixed-point values, where the first 9 bits are the signed degrees, 25 bits are fractional degrees, and 6 bits are used for the accuracy. The value of the accuracy field indicates how many of the 25 fractional bits are considered accurate (e.g., coordinates collected via a consumer GPS vs. a high-precision surveyor's GPS). Using WGS84, the full precision is 8 decimal digits, which is good to about a millimeter regardless of where you are on the globe.
As a couple of others have posted, floating-point encoding really isn't good for this type of thing. Yes, it can represent a very large number of decimal places, but the accuracy is either ignored or has to be dealt with somewhere else. For example, printing a float or a double at full floating-point precision produces decimal digits that are very unlikely to be remotely accurate. Likewise, simply outputting a float or a double with 8 or 10 decimal digits of precision may not be a true representation of the source values, because of how floating-point numbers are computed (e.g., why 1.2 - 1.0 does not equal 0.2 in floating-point arithmetic).
For a humorous example of why you should care about coordinate-system precision, see https://xkcd.com/2170/.
Granted, the 40-bit encoding used in RFC 3825 is hardly convenient in a 32- or 64-bit world, but this style can easily be extended to a 64-bit number where 9 bits are used for the signed degrees and 6 bits for the accuracy, leaving 49 bits for the fractional portion. This yields about 15 decimal digits of precision, which is more than basically anyone will ever need (see the humorous example above).
You can use the decimal datatype:

    CREATE TABLE IF NOT EXISTS `map` (
      `latitude`  decimal(18,15) DEFAULT NULL,
      `longitude` decimal(18,15) DEFAULT NULL
    );
A decimal representation with a precision of 8 should be more than enough, according to this Wikipedia article on Decimal Degrees.
0 decimal places, 1.0 = 111 km
...
7 decimal places, 0.0000001 = 1.11 cm
8 decimal places, 0.00000001 = 1.11 mm
A Java program for computing the maximum rounding error in meters from casting lat/long values to Float/Double:
import com.javadocmd.simplelatlng.*;
import com.javadocmd.simplelatlng.util.*;

public class MaxError {
    public static void main(String[] args) {
        // Flip the lowest bit of the encoding to get the adjacent
        // representable value, then measure the distance between the two.
        float flng = 180f;
        float flat = 0f;
        LatLng fpos = new LatLng(flat, flng);
        double flatprime = Float.intBitsToFloat(Float.floatToIntBits(flat) ^ 1);
        double flngprime = Float.intBitsToFloat(Float.floatToIntBits(flng) ^ 1);
        LatLng fposprime = new LatLng(flatprime, flngprime);
        double fdistanceM = LatLngTool.distance(fpos, fposprime, LengthUnit.METER);
        System.out.println("Float max error (meters): " + fdistanceM);

        double dlng = 180d;
        double dlat = 0d;
        LatLng dpos = new LatLng(dlat, dlng);
        double dlatprime = Double.longBitsToDouble(Double.doubleToLongBits(dlat) ^ 1);
        double dlngprime = Double.longBitsToDouble(Double.doubleToLongBits(dlng) ^ 1);
        LatLng dposprime = new LatLng(dlatprime, dlngprime);
        double ddistanceM = LatLngTool.distance(dpos, dposprime, LengthUnit.METER);
        System.out.println("Double max error (meters): " + ddistanceM);
    }
}
Output:

    Float max error (meters): 1.7791213425235692
    Double max error (meters): 0.11119508289500799
Longitudes and latitudes are not generally known to any greater precision than a 32-bit float. So if you're concerned about storage space, you can use floats. But in general it's more convenient to work with numbers as doubles.
Radians are more convenient for theoretical math. (For example, the derivative of sine is cosine only when you use radians.) But degrees are typically more familiar and easier for people to interpret, so you might want to stick with degrees.
What encoding is "best" really depends on your goals/requirements.
If you are performing arithmetic, floating-point latitude/longitude is often quite convenient. Other times Cartesian coordinates (i.e., x, y, z) can be more convenient. For example, if you only care about points on the surface of the earth, you can use an n-vector.
As for longer-term storage, IEEE floating point will waste bits on ranges you don't care about (for lat/lon) or on precision you may not care about in the case of Cartesian coordinates (unless you want very good precision near the origin for whatever reason). You can of course map either type of coordinate to integers of your preferred size, such that the entire range of those integers covers the range you are interested in at the resolution you care about.
There are of course other things to think about than merely not wasting bits in the encoding. For example, [Geohashes](https://en.wikipedia.org/wiki/Geohash) have the nice property that it is easy to find other geohashes in the same area. (Most will have the same prefix, and you can compute the prefix the others will have.) Unfortunately, they maintain the same precision in degrees longitude near the equator as near the poles. I'm currently using 64-bit geohashes for storage, which gives about 3 m resolution at the equator.
The Maidenhead Locator System has some similar characteristics, but seems more optimized for communicating locations between humans rather than storing on a computer. (Storing MLS strings would waste a lot of bits for some rather trivial error detection.)
The one system I found that does handle the poles differently is the Military Grid Reference System, although it too seems more human-communications oriented. (And it seems like a pain to convert from or to lat/lon.)
Depending on what you want exactly, you could use something similar to the Universal Polar Stereographic coordinate system near the poles, along with something more computationally sane than UTM for the rest of the world, and use at most one bit to indicate which of the two systems you're using. I say at most one bit, because it's unlikely most of the points you care about would be near the poles. For example, you could use "half a bit" by saying 11 indicates use of the polar system, while 00, 01, and 10 indicate use of the other system and are part of the representation.
Sorry this is a bit long, but I wanted to save what I had learned recently. Sadly I have not found any standard, sane, and efficient way to represent a point on earth with uniform precision.
Edit: I found another approach which looks a lot more like what you wanted, since it more directly takes advantage of the lower precision needed for longitude closer to the poles. It turns out there is a lot of research on storing normal vectors. Encoding Normal Vectors using Optimized Spherical Coordinates describes such a system for encoding normal vectors while maintaining a minimum level of accuracy, but it could just as well be used for geographical coordinates.