K-means algorithm variation with equal cluster size

挽巷 2020-11-27 14:26

I'm looking for the fastest algorithm for grouping points on a map into equally sized groups, by distance. The k-means clustering algorithm looks straightforward and promis

16 answers
  •  日久生厌
    2020-11-27 15:18

    In case anyone wants a short function to copy and paste, here you go. It runs KMeans, then finds the minimum-cost matching of points to clusters, subject to a cap on the number of points assigned to each cluster (the cluster size).

    import numpy as np
    from scipy.optimize import linear_sum_assignment
    from scipy.spatial.distance import cdist
    from sklearn.cluster import KMeans


    def get_even_clusters(X, cluster_size):
        # Enough clusters to hold every point at cluster_size points each.
        n_clusters = int(np.ceil(len(X) / cluster_size))
        kmeans = KMeans(n_clusters=n_clusters)
        kmeans.fit(X)
        # Repeat each center cluster_size times: one column per available slot.
        centers = np.repeat(kmeans.cluster_centers_, cluster_size, axis=0)
        distance_matrix = cdist(X, centers)
        # Min-cost matching of points to slots; slot index // cluster_size is the cluster label.
        clusters = linear_sum_assignment(distance_matrix)[1] // cluster_size
        return clusters
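
The capacity step of the function above can be looked at in isolation. The following is a minimal sketch, assuming fixed centers instead of a KMeans fit; the helper name `assign_with_capacity` and the toy data are hypothetical and not part of the original answer. It compares plain nearest-center labels with the capped matching.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def assign_with_capacity(X, centers, cluster_size):
    """Assign each row of X to one center, at most cluster_size rows per center."""
    # Duplicate every center cluster_size times: each copy is one free slot.
    slots = np.repeat(centers, cluster_size, axis=0)
    # cost[i, j] = Euclidean distance from point i to slot j.
    cost = cdist(X, slots)
    # Minimum-cost matching of points to distinct slots.
    rows, cols = linear_sum_assignment(cost)
    labels = np.empty(len(X), dtype=int)
    # Slots j*cluster_size .. (j+1)*cluster_size - 1 all belong to center j.
    labels[rows] = cols // cluster_size
    return labels


# Hypothetical toy data: five points near the origin, one far away.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [0.1, 0.1], [0.2, 0.2], [5.0, 5.0]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])

# Plain nearest-center labels, as an unconstrained k-means step would give.
nearest = cdist(X, centers).argmin(axis=1)
# Capped labels: no center may take more than 3 points.
balanced = assign_with_capacity(X, centers, cluster_size=3)

print(np.bincount(nearest, minlength=2))   # sizes without the cap
print(np.bincount(balanced, minlength=2))  # sizes with the cap
```

With these points the unconstrained labels put five points in the first cluster, while the capped matching moves the two points that are cheapest to reassign into the second one. The cost matrix has one column per slot, so the matching step grows quickly with the number of points; that trade-off is worth checking on real data before relying on it.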
    
