Output the 50 samples closest to each cluster center using scikit-learn's k-means
Question: I have fitted a k-means model on 5000+ samples using the Python scikit-learn library. I want to output the 50 samples closest to a cluster center. How do I do this?

Answer 1: If km is the fitted k-means model, the distance from each point in an array X to the j-th centroid is

    d = km.transform(X)[:, j]

This gives an array of len(X) distances. Since np.argsort sorts in ascending order, the indices of the 50 points closest to centroid j are

    ind = np.argsort(d)[:50]

so the 50 points closest to that centroid are X[ind] (or use
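
As a rough end-to-end sketch of the approach described in the answer, the snippet below repeats the lookup for every centroid. The synthetic data, the choice of 3 clusters, and the names dists, n_closest, and closest are assumptions made for illustration and are not part of the original question.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the asker's data (assumption): 5000 samples, 2 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 2))

# Number of clusters is an arbitrary choice for this sketch.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# transform() returns the distance from every sample to every centroid,
# as an array of shape (n_samples, n_clusters).
dists = km.transform(X)

n_closest = 50
closest = {}
for j in range(km.n_clusters):
    d = dists[:, j]
    # argsort is ascending, so the first entries are the smallest distances.
    ind = np.argsort(d)[:n_closest]
    closest[j] = X[ind]
```

Here closest[j] holds the 50 rows of X nearest to centroid j, ordered from nearest to farthest. Note that this ranks all samples by distance to centroid j, regardless of which cluster each sample was assigned to; filtering on km.labels_ == j first would restrict the result to members of that cluster.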