What does the parameter 'classwt' in the randomForest function of the randomForest package in R stand for?

野趣味 2020-12-29 05:49

The help page for randomForest::randomForest() says:

"classwt - Priors of the classes. Need not add up to one. Ignored for regression."

1 answer
  • 2020-12-29 06:07

    Could setting the classwt parameter help when you have heavily unbalanced data, i.e. when the class priors differ strongly?

    Yes, setting classwt can be useful for unbalanced datasets. I also agree with joran that these values are transformed into probabilities for sampling the training data (according to Breiman's arguments in his original article).

    How should classwt be set when the training dataset has 3 classes with priors (p1,p2,p3), and the test set has priors (q1,q2,q3)?

    For training, you can simply pass the training priors, in the order of the factor levels of y:

    rf <- randomForest(x=x, y=y, classwt=c(p1,p2,p3))
    

    No priors can be applied to the test set, for two reasons: 1) the predict method of the randomForest package has no such option; 2) class weights only make sense when training the model, not when predicting.
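
To make the two points above concrete, here is a minimal sketch, assuming the randomForest package is installed. The data are synthetic and the inverse-frequency weights are just one illustrative choice, not a recommendation from the package documentation. Whether the weighted model actually performs better depends on the data, so the example only prints both out-of-bag confusion matrices for comparison.

```r
library(randomForest)

set.seed(42)

# Synthetic, deliberately unbalanced 3-class data (illustrative only).
n <- c(a = 300, b = 60, c = 15)
y <- factor(rep(names(n), times = n))
x <- data.frame(
  f1 = rnorm(sum(n), mean = as.integer(y)),
  f2 = rnorm(sum(n), mean = -as.integer(y))
)

# Class weights in the order of levels(y); they need not sum to one.
# Assumption: inverse class frequencies, one common but arbitrary choice.
wt <- as.numeric(1 / table(y))

# Fit once without and once with class weights.
rf_plain    <- randomForest(x = x, y = y, ntree = 200)
rf_weighted <- randomForest(x = x, y = y, ntree = 200, classwt = wt)

# Compare the out-of-bag confusion matrices of the two fits.
print(rf_plain$confusion)
print(rf_weighted$confusion)

# Prediction only takes the fitted forest and the new data;
# no priors or class weights are passed at this stage.
new_x <- x[c(1, 301, 361), ]
pred  <- predict(rf_weighted, newdata = new_x, type = "response")
prob  <- predict(rf_weighted, newdata = new_x, type = "prob")
print(pred)
print(prob)
```

Note that the weights are given once, at training time, and the predict calls contain nothing corresponding to (q1,q2,q3).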
