How does the class_weight parameter in scikit-learn work?

Backend · 2 answers · 1805 views
[愿得一人] 2020-11-29 15:23

I am having a lot of trouble understanding how the class_weight parameter in scikit-learn's Logistic Regression operates.

The Situation

2 Answers
  •  予麋鹿 (OP)
     2020-11-29 15:41

    First off, going by recall alone may be misleading: you can trivially achieve 100% recall by classifying everything as the positive class. I usually suggest using AUC for selecting parameters, and then finding a threshold for the operating point you are interested in (say, a given precision level).
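    A minimal sketch of that workflow, on a hypothetical imbalanced dataset (the data, sizes, and the 0.8 precision target are illustrative assumptions, not from the question):

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score, precision_recall_curve
    from sklearn.model_selection import train_test_split

    # Hypothetical imbalanced data: ~95% negatives, 5% positives
    X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    proba = clf.predict_proba(X_te)[:, 1]

    # Compare models by AUC, which is threshold-independent
    auc = roc_auc_score(y_te, proba)

    # Then pick a threshold for a desired operating point, e.g. precision >= 0.8
    precision, recall, thresholds = precision_recall_curve(y_te, proba)
    idx = np.argmax(precision[:-1] >= 0.8)  # first threshold reaching that precision (0 if none does)
    threshold = thresholds[idx]
    y_pred = (proba >= threshold).astype(int)
    ```

    The point is that the decision threshold is chosen after model selection, against the metric you actually care about, rather than being fixed at the default 0.5.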

    For how class_weight works: it penalizes mistakes on samples of class[i] with weight class_weight[i] instead of 1. So a higher class weight means you want to put more emphasis on that class. From what you say, it seems class 0 is 19 times more frequent than class 1, so you should increase the class_weight of class 1 relative to class 0, say {0: .1, 1: .9}. If the class_weight values don't sum to 1, that effectively rescales the data-fit term relative to the penalty, i.e. it changes the regularization strength.
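    To make the "penalizes mistakes with class_weight[i] instead of 1" statement concrete, this sketch (on assumed synthetic data) shows that a class_weight dict gives the same fit as passing the equivalent per-sample weights explicitly:

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    # Assumed synthetic imbalanced data, just for illustration
    X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)

    # class_weight multiplies each sample's loss term by the weight of its class
    w = {0: 0.1, 1: 0.9}
    clf_cw = LogisticRegression(class_weight=w, max_iter=1000).fit(X, y)

    # The same fit, written out as explicit per-sample weights
    sw = np.where(y == 1, w[1], w[0])
    clf_sw = LogisticRegression(max_iter=1000).fit(X, y, sample_weight=sw)

    # Both solvers minimize the same weighted loss, so the coefficients agree
    ```

    In other words, a class weight is just a sample weight applied uniformly to every member of that class.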

    For how class_weight="auto" works, you can have a look at this discussion. In the dev version you can use class_weight="balanced", which is easier to understand: it basically means replicating the smaller class until you have as many samples as in the larger one, but in an implicit way.
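    You can inspect what "balanced" computes via sklearn's compute_class_weight helper; the weights follow n_samples / (n_classes * np.bincount(y)), so each class ends up with the same total effective weight, which is the "implicit replication" described above. The 19:1 class ratio below is taken from the question; the rest is illustrative:

    ```python
    import numpy as np
    from sklearn.utils.class_weight import compute_class_weight

    # 19:1 imbalance, as in the question (95 negatives, 5 positives)
    y = np.array([0] * 95 + [1] * 5)

    weights = compute_class_weight(class_weight="balanced", classes=np.array([0, 1]), y=y)

    # "balanced" is defined as n_samples / (n_classes * np.bincount(y))
    manual = len(y) / (2 * np.bincount(y))

    # weight * count is equal across classes: the minority class is implicitly
    # upweighted until both classes contribute the same total loss
    effective = weights * np.bincount(y)
    ```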
