Question
When I use scikit-learn's implementation of k-means, I usually just call the fit() method, and that is enough to get the cluster centers and the labels. The predict() method is used to calculate the labels, and a fit_predict() method is even available for convenience. But if I can get the labels using only fit(), what is the purpose of the predict() method?
Answer 1:
predict, as @EdChum suggested, can be used on unseen data. This (and, even more so, the transform method) is useful when k-means is used for feature extraction in semi-supervised learning: you cluster a large set of samples, then use the nearest centroid or the distances to the centroids as features for a subsequent supervised learning problem. When the resulting model is used for prediction, it receives samples that k-means never saw during fitting.
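A minimal sketch of the difference, assuming a recent scikit-learn and NumPy are installed. The toy data, the variable names, and the choice of two clusters are illustrative assumptions, not part of the original answer. It fits k-means on one set of samples, then calls predict and transform on samples that were not part of the fit:

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data (an assumption for illustration): two well-separated 2-D blobs.
rng = np.random.default_rng(0)
X_train = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(50, 2)),
])

# Samples that k-means does not see while fitting.
X_new = np.array([[0.1, -0.2], [5.2, 4.9]])

# fit() learns the centroids and stores the training labels in labels_.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_train)
train_labels = kmeans.labels_

# predict() assigns each unseen sample to its nearest learned centroid.
new_labels = kmeans.predict(X_new)

# transform() returns the distance from each sample to every centroid,
# which can serve as features for a later supervised model.
new_distances = kmeans.transform(X_new)

print(new_labels)           # one cluster index per new sample
print(new_distances.shape)  # (n_new_samples, n_clusters)
```

Here labels_ only covers the samples passed to fit(), while predict and transform reuse the learned centroids on any new array with the same number of features.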
Source: https://stackoverflow.com/questions/25012342/scikit-learns-k-means-what-does-the-predict-method-really-do