is it necessary to run random forest with cross validation at the same time

Submitted by 萝らか妹 on 2019-12-10 22:46:41

Question


Random forest is a robust algorithm. It trains many small trees on bootstrap samples and reports an out-of-bag (OOB) accuracy. Given that built-in estimate, is it still necessary to run cross-validation with a random forest?
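
For context, here is a minimal sketch of the OOB estimate the question refers to. It assumes scikit-learn's RandomForestClassifier and a synthetic dataset; the parameter values are illustrative only.

    # Sketch: a random forest reports an out-of-bag (OOB) accuracy
    # without needing a separate validation split.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

    rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
    rf.fit(X, y)

    # Each sample is scored only by the trees whose bootstrap sample did not
    # contain it, so oob_score_ is an internal estimate of generalization accuracy.
    print("OOB accuracy:", rf.oob_score_)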


Answer 1:


OOB error is an unbiased estimate of the generalization error for random forests, so that's great on its own. But what are you using the cross-validation for? If you are comparing the RF against some other algorithm that does not use bagging in the same way, you want a low-variance way to compare them, and you need cross-validation for the other algorithm anyway. In that case, evaluating both the RF and the other algorithm on the same cross-validation splits is still a good idea, because it removes the variance caused by split selection.
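
A minimal sketch of that comparison, assuming scikit-learn, a synthetic dataset, and logistic regression as the hypothetical non-bagging competitor: both models are scored on one fixed set of folds.

    # Sketch: reuse the same cross-validation splits for both models so that
    # split selection does not add variance to the comparison.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import StratifiedKFold, cross_val_score

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

    # One fixed set of folds, reused for both models.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

    rf_scores = cross_val_score(
        RandomForestClassifier(n_estimators=200, random_state=0), X, y, cv=cv)
    lr_scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)

    print("RF mean accuracy:", rf_scores.mean())
    print("LR mean accuracy:", lr_scores.mean())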

If you are comparing one RF against another RF with a different feature set, then comparing OOB errors is reasonable. This is especially true if you make sure both RFs use the same bagging sets during training.
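
A minimal sketch of that RF-vs-RF case, again assuming scikit-learn; the reduced feature subset is hypothetical. Fixing the same random_state (with identical n_estimators and sample count) is intended to keep the bootstrap (bagging) samples the same for both forests, as suggested above.

    # Sketch: compare two random forests with different feature sets
    # by their OOB errors, using matched bootstrap samples.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    X_subset = X[:, :10]  # hypothetical reduced feature set

    def oob_error(features):
        rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=42)
        rf.fit(features, y)
        return 1.0 - rf.oob_score_

    print("OOB error, all features:    ", oob_error(X))
    print("OOB error, reduced features:", oob_error(X_subset))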




Answer 2:


You do not need to perform any kind of validation if you just want to use the model and do not care about the risk of overfitting.

For scientific publishing (or anything else where you compare the quality of different classifiers), you should validate your results, and cross-validation is a best practice here.
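
As a minimal sketch of that reporting step (assuming scikit-learn and a synthetic dataset), cross-validated accuracy is typically summarized by its mean and spread across folds rather than by a single fit:

    # Sketch: report mean and standard deviation of accuracy over 10 folds.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

    scores = cross_val_score(
        RandomForestClassifier(n_estimators=200, random_state=0), X, y, cv=10)
    print(f"10-fold CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")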



Source: https://stackoverflow.com/questions/15608721/is-it-neccessary-to-run-random-forest-with-cross-validation-at-the-same-time
