I have a large set of vectors in 3 dimensions. I need to cluster these based on Euclidean distance such that all the vectors in any particular cluster have a Euclidean dista
Use OPTICS, which works well with large data sets.
OPTICS: Ordering Points To Identify the Clustering Structure Closely related to DBSCAN, finds core sample of high density and expands clusters from them 1. Unlike DBSCAN, keeps cluster hierarchy for a variable neighborhood radius. Better suited for usage on large datasets than the current sklearn implementation of DBSCAN
from sklearn.cluster import OPTICS
db = OPTICS(eps=3, min_samples=30).fit(X)
Fine tune eps, min_samples as per your requirement.