Question
I have a set of points and I want to cluster them. I know how to run the standard k-means algorithm, but I don't want to take 'k' as an input. For example, given the points 1, 3, 4, 50, 60, 70, 1000, 1002, 1004, the algorithm should produce 3 clusters, C1: 1, 3, 4; C2: 50, 60, 70; C3: 1000, 1002, 1004, such that the distance between elements within a cluster is minimized and the distance between clusters is maximized.
Answer 1:
See the question "How do I determine k when using k-means clustering?" and the links there.
Answer 2:
Deciding on k is a problem that recurs across many clustering algorithms. You might want to consider spectral clustering (and its various algorithmic cousins), which alleviates that problem to some extent. However, many versions (although not all) use k-means as the final step, which returns you to square one.
Alternatively, there are many approaches for finding the optimal value of k, such as the answer supplied by Denis above; this might be enough for your purposes.
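As one illustration of such an approach, here is a minimal sketch that runs k-means for each candidate k and keeps the k with the highest mean silhouette score. The silhouette criterion is my choice for the example, not something the answers above prescribe. The function names (kmeans_1d, silhouette, choose_k), the deterministic seeding, and the restriction to 1-D points are all assumptions made to keep the sketch short and reproducible.

```python
from statistics import mean


def kmeans_1d(points, k, max_iter=100):
    """Lloyd's k-means on 1-D data; returns the non-empty clusters."""
    pts = sorted(points)
    n = len(pts)
    # Assumption: deterministic seeds taken at evenly spaced positions in the
    # sorted data. Random or k-means++ seeding is more common in practice.
    centroids = [pts[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in pts:
            nearest = min(range(k), key=lambda j: abs(p - centroids[j]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [mean(c) if c else centroids[j] for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return [c for c in clusters if c]


def silhouette(clusters):
    """Mean silhouette coefficient over all points; singletons score 0."""
    if len(clusters) < 2:
        return 0.0
    scores = []
    for i, own in enumerate(clusters):
        for p in own:
            if len(own) == 1:
                scores.append(0.0)
                continue
            # a: mean distance to the other members of the same cluster.
            a = sum(abs(p - q) for q in own) / (len(own) - 1)
            # b: mean distance to the closest other cluster.
            b = min(mean([abs(p - q) for q in other])
                    for j, other in enumerate(clusters) if j != i)
            scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)


def choose_k(points, max_k=None):
    """Return (best_k, clusters) for the k in 2..max_k with the best silhouette."""
    if max_k is None:
        max_k = len(set(points)) - 1
    best_k = max(range(2, max_k + 1),
                 key=lambda k: silhouette(kmeans_1d(points, k)))
    return best_k, kmeans_1d(points, best_k)


# The points from the question, using the values listed for cluster C3.
points = [1, 3, 4, 50, 60, 70, 1000, 1002, 1004]
best_k, clusters = choose_k(points)
print(best_k, clusters)
```

Note that the criterion decides what "optimal" means. On the question's points this sketch selects k = 2, merging 1 through 70 into one cluster, because the gap before 1000 dominates the silhouette score. Calling kmeans_1d(points, 3) does give the three clusters from the question, so a different selection criterion may suit that expectation better.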
Source: https://stackoverflow.com/questions/5933970/random-clustering-algorithm