If you don't need the full distance matrix, you will be better off using kd-tree. Consider scipy.spatial.cKDTree or sklearn.neighbors.KDTree. This is because a kd-tree kan find k-nearnest neighbors in O(n log n) time, and therefore you avoid the O(n**2) complexity of computing all n by n distances.