Question
Is it possible to define class weights for a K-nearest neighbour classifier in sklearn? I have looked at the API but cannot work it out. I have a knn problem with very imbalanced classes (10000 samples of some classes, 1 of others).
Answer 1:
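The stock knn classifier in sklearn does not seem to offer that option: its weights parameter weights neighbours by distance, not by class. You can alter the source code, though, by adding coefficients (weights) to the distance computation so that the distance is amplified for records belonging to the majority class (e.g., with a coefficient of 1.5). Majority-class points then look farther away and are less likely to end up among the k nearest neighbours.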
https://github.com/scikit-learn/scikit-learn/blob/7b136e9/sklearn/neighbors/classification.py#L23
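The same idea can be tried without patching sklearn. The following is a minimal sketch, not the library's own method: the helper scaled_distance_knn_predict, its coef dictionary and its pool argument are hypothetical names introduced here. It fetches a pool of candidate neighbours with NearestNeighbors, multiplies each distance by the coefficient of that neighbour's class, and takes a plain majority vote over the k smallest scaled distances.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors


def scaled_distance_knn_predict(X_train, y_train, X_query, k=1, coef=None, pool=None):
    """k-NN prediction with per-class distance coefficients.

    Hypothetical helper for illustration; not part of scikit-learn.
    coef maps a class label to the factor its distances are multiplied by.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    X_query = np.asarray(X_query, dtype=float)

    # Per-sample multiplier: 1.0 unless the sample's class has a coefficient.
    scale = np.ones(len(y_train))
    for label, factor in (coef or {}).items():
        scale[y_train == label] = factor

    # Fetch more than k candidates, because rescaling can reorder neighbours.
    n_pool = min(len(X_train), max(k, pool if pool is not None else 3 * k))
    nn = NearestNeighbors(n_neighbors=n_pool).fit(X_train)
    dist, idx = nn.kneighbors(X_query)

    # Rescale, then keep the k candidates with the smallest scaled distance.
    order = np.argsort(dist * scale[idx], axis=1)[:, :k]
    top_labels = np.take_along_axis(y_train[idx], order, axis=1)

    # Unweighted majority vote (ties go to the smallest label).
    preds = []
    for row in top_labels:
        labels, counts = np.unique(row, return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)


# Toy data: class 0 has four samples, class 1 has a single one.
X = [[0.0], [0.1], [0.2], [0.3], [1.0]]
y = [0, 0, 0, 0, 1]

plain = scaled_distance_knn_predict(X, y, [[0.6]], k=1, pool=5)
scaled = scaled_distance_knn_predict(X, y, [[0.6]], k=1, coef={0: 1.5}, pool=5)
```

On this toy data the query at 0.6 is closer to a class 0 point, so the plain call returns class 0, while the 1.5 coefficient pushes that point far enough away for the single class 1 sample to win. Note that with k of 3 or more, a class with only one sample can never win an unweighted majority vote, so distance scaling alone may not be enough at a 10000 to 1 ratio.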
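Alternatively, the imbalanced-learn package, which is part of the scikit-learn-contrib projects, can be used for data sets with high between-class imbalance. It rebalances the training data (by over- or under-sampling) rather than changing the classifier: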
http://contrib.scikit-learn.org/imbalanced-learn/stable/introduction.html
(In the case of binary classification, you may alternatively treat the problem as an unsupervised outlier detection problem and use a method such as one-class SVM in sklearn: fit the model on the majority class only and flag anything it rejects as the rare class.)
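A minimal sketch of the outlier-detection route, using sklearn's OneClassSVM. The synthetic data and the values of nu and gamma are assumptions chosen for illustration and would need tuning on real data.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Synthetic majority-class data (assumed): 200 points around the origin.
rng = np.random.default_rng(0)
X_major = rng.normal(loc=0.0, scale=1.0, size=(200, 2))

# Fit on the majority class only. nu upper-bounds the fraction of training
# points treated as outliers (5% is an arbitrary choice here).
detector = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(X_major)

# predict returns +1 for inliers (majority-like) and -1 for outliers
# (candidates for the rare class).
X_new = np.array([[0.0, 0.0], [10.0, 10.0]])
flags = detector.predict(X_new)
```

Here the point at the centre of the training cloud is expected to be accepted and the distant point rejected. No minority labels are used for fitting, which is the appeal when only one or a handful of minority samples exist.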
Source: https://stackoverflow.com/questions/37876280/knn-with-class-weights-in-sklearn