I have an input of 36,742 points which means if I wanted to calculate the lower triangle of a distance matrix (using the vincenty approximation) I would need to generate 36,742*
"Use some kind of hashing to get a quick rough-cut off (all stores within 100km) and then only calculate accurate distances between those stores" I think this might be better called gridding. So first make a dict, with a set of coords as the key and put each shop in a 50km bucket near that point. then when you are calculating distances, you only look in nearby buckets, rather than iterate through each shop in the whole universe