Sklearn Kmeans parameter confusion?

Posted by 早过忘川 on 2019-12-19 07:22:01

Question


I can run sklearn's KMeans as follows:

kmeans = KMeans(n_clusters=3,init='random',n_init=10,max_iter=500)

But I'm a little confused about what the parameters mean.

The documentation for n_init says:

Number of time the k-means algorithm will be run with different centroid seeds. The final results will be the best output of n_init consecutive runs in terms of inertia.

and max_iter says:

Maximum number of iterations of the k-means algorithm for a single run.

But I don't completely understand what that means. Is n_init the number of times the centroids are moved closer to the mean of their points, given an initial set of centroids?

And is max_iter the number of times the whole algorithm is run with new initial centroids?

So for example, with max_iter=2 and n_init=15, kmeans will choose initial centroids, then move those centroids 15 times and come up with a clustering result. Then kmeans will choose initial centroids again, move those centroids 15 times, and stop. Then it will pick the best clustering out of the two runs?

Thanks for the help!

[Edit] Or is it the exact opposite of what I have here?


Answer 1:


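It is the opposite of what you described. With max_iter=2 and n_init=15, kmeans will choose initial centroids 15 times and, on each of those 15 runs, move the centroids at most twice. So n_init is the number of restarts from fresh initial centroids, and max_iter caps the number of centroid updates within a single restart.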
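To make the nesting of the two parameters concrete, here is a minimal NumPy sketch of the loop structure. It is not scikit-learn's actual implementation: the function name kmeans_sketch, the seed argument, the toy data X, and the choice to draw initial centroids from the data points are all assumptions made for illustration.

```python
import numpy as np

def kmeans_sketch(X, n_clusters, n_init, max_iter, seed=0):
    """Hypothetical helper: plain Lloyd's k-means with random restarts."""
    rng = np.random.default_rng(seed)
    best_inertia, best_centers = np.inf, None

    for _ in range(n_init):  # outer loop: one pass per fresh set of initial centroids
        # Pick n_clusters distinct data points as the initial centroids.
        centers = X[rng.choice(len(X), size=n_clusters, replace=False)]

        for _ in range(max_iter):  # inner loop: at most max_iter centroid updates
            # Assign every point to its nearest centroid.
            dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = dists.argmin(axis=1)
            # Move each centroid to the mean of its points (keep it if the cluster is empty).
            new_centers = np.array([
                X[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
                for k in range(n_clusters)
            ])
            if np.allclose(new_centers, centers):
                break  # converged before reaching max_iter
            centers = new_centers

        # Inertia: sum of squared distances from each point to its nearest centroid.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        inertia = dists.min(axis=1).sum()
        if inertia < best_inertia:  # keep the best of the n_init runs
            best_inertia, best_centers = inertia, centers

    return best_centers, best_inertia

# Toy data (an assumption): three 2-D blobs of 50 points each.
data_rng = np.random.default_rng(42)
X = np.vstack([
    data_rng.normal(loc=c, scale=0.5, size=(50, 2))
    for c in ([0, 0], [5, 5], [0, 5])
])

# 15 restarts, each allowed at most 2 centroid updates.
centers, inertia = kmeans_sketch(X, n_clusters=3, n_init=15, max_iter=2)
```

Because the restarts are drawn one after another from the same random generator, calling the sketch with a larger n_init and the same seed can only keep or lower the returned inertia.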

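The default values are n_init=10 and max_iter=300. This means the initial centroids will be chosen 10 times, and each run will use up to 300 iterations. The best of those 10 runs, meaning the one with the lowest inertia, will be the final result. (In scikit-learn 1.4 and later the default is n_init='auto', which still gives 10 runs for init='random' but only one run for init='k-means++'.)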
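A quick way to check this on a fitted model is to look at n_iter_ and inertia_, which scikit-learn reports for the best run. The snippet below is only a sketch: the synthetic data and the random_state value are assumptions added for reproducibility, and n_init and max_iter are passed explicitly so the result does not depend on the installed version's defaults.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic data (an assumption for illustration): three 2-D blobs of 50 points each.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(loc=center, scale=0.5, size=(50, 2))
    for center in ([0, 0], [5, 5], [0, 5])
])

# 10 restarts from random centroids, each allowed up to 300 iterations.
kmeans = KMeans(n_clusters=3, init='random', n_init=10, max_iter=300,
                random_state=0).fit(X)

print(kmeans.n_iter_)    # iterations actually used by the best run (never more than max_iter)
print(kmeans.inertia_)   # inertia of the best run, the criterion used to pick it
```

On well-separated data like this, a run usually converges long before max_iter is reached, so max_iter acts as a safety cap rather than a fixed iteration count.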



Source: https://stackoverflow.com/questions/40895697/sklearn-kmeans-parameter-confusion
