Do I need to normalize (or scale) data for randomForest (R package)?

囚心锁ツ 2020-12-07 08:46

I am doing a regression task. Do I need to normalize (or scale) the data for randomForest (R package)? And is it necessary to scale the target values as well? And if - I want to use sc

6 answers
  •  悲哀的现实
    2020-12-07 09:34

    I do not see anything in either the help page or the vignette that suggests scaling is necessary for a regression variable in randomForest. This example at Stats Exchange does not use scaling either.

    Copy of my comment: The scale function does not belong to pkg:caret. It is part of the "base" R package. There is an unscale function in packages grt and DMwR that will reverse the transformation, or you could simply multiply by the scale attribute and then add the center attribute values.
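
    A minimal sketch of the manual reversal described above, using only base R. The vector y is made-up toy data; the attribute names "scaled:center" and "scaled:scale" are the ones scale() attaches to its result.

```r
# Toy response vector (made up for illustration)
y <- c(10, 12, 15, 21, 30)

# scale() is in base R; it returns a one-column matrix and stores the
# centering and scaling values as attributes on the result
y_scaled <- scale(y)
y_center <- attr(y_scaled, "scaled:center")
y_scale  <- attr(y_scaled, "scaled:scale")

# Reverse the transformation: multiply by the scale, then add the center
y_back <- as.vector(y_scaled) * y_scale + y_center

# Compare with the original values, allowing for floating-point error
all.equal(y_back, y)
```

    The same two attributes could be used to map predictions made on a scaled target back to the original units.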

    Your conception of why "normalization" needs to be done may require critical examination. A test for non-normality is only needed after the regressions are done, and may not be needed at all if the goodness-of-fit methodology makes no assumptions of normality. So: why are you asking? Searching SO and Stats.Exchange might prove useful: citation #1; citation #2; citation #3

    The boxcox function is a commonly used transformation when one does not have prior knowledge of what a distribution "should" be and when you really need to do a transformation. There are many pitfalls in applying transformations, so the fact that you need to ask the question raises concerns that you may be in need of further consultations or self-study.
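
    For illustration only, a rough sketch of how a Box-Cox parameter might be estimated. It assumes the boxcox function meant here is the one in package MASS (the answer does not name a package), and the data are simulated, not taken from the question.

```r
library(MASS)  # assumption: boxcox() from MASS is the function meant above

# Simulated data with a positive, right-skewed response (illustration only)
set.seed(1)
x <- runif(100, 1, 10)
y <- exp(0.3 * x + rnorm(100, sd = 0.2))

# Profile log-likelihood over a grid of lambda values, without plotting
bc <- boxcox(y ~ x, lambda = seq(-2, 2, by = 0.1), plotit = FALSE)

# Grid value of lambda with the highest profile log-likelihood
lambda_hat <- bc$x[which.max(bc$y)]

# Apply the Box-Cox transformation; lambda = 0 corresponds to log(y)
y_bc <- if (abs(lambda_hat) < 1e-8) log(y) else (y^lambda_hat - 1) / lambda_hat
```

    Whether any such transformation is appropriate still depends on the model's assumptions, as discussed above.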
