Suggestions for speeding up Random Forests

执念已碎 2020-12-01 00:46

I'm doing some work with the randomForest package and while it works well, it can be time-consuming. Does anyone have any suggestions for speeding things up? I'

4 answers
  •  鱼传尺愫
    2020-12-01 01:11

    The manual of the foreach package has a section on Parallel Random Forests (Using The foreach Package, Section 5.1):

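    > library("randomForest")  # must be loaded on the master so that combine() is found
    > library("foreach")
    > library("doSNOW")
    > cl <- makeCluster(4, type = "SOCK")  # keep a handle so the cluster can be stopped later
    > registerDoSNOW(cl)
    
    > x <- matrix(runif(500), 100)  # 100 observations, 5 predictors
    > y <- gl(2, 50)                # two-level factor response, 50 observations per level
    
    > # Grow 4 forests of 250 trees each in parallel, then merge them with combine()
    > rf <- foreach(ntree = rep(250, 4), .combine = combine, .packages = "randomForest") %dopar%
    +    randomForest(x, y, ntree = ntree)
    > rf
    Call:
    randomForest(x = x, y = y, ntree = ntree)
    Type of random forest: classification
    Number of trees: 1000
    > stopCluster(cl)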
    

    If we want to create a random forest model with 1000 trees, and our computer has four cores, we can split the problem into four pieces by calling the randomForest function four times with the ntree argument set to 250. We then have to combine the resulting randomForest objects, which the randomForest package supports through its combine function.
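
    The same split-and-combine idea can also be sketched without foreach, using the parallel package that ships with R. This is only an illustration, not the vignette's method: the worker count (2), the RNG seed, and the names `grow_forest` and `forests` are assumptions made for the example.

    ```r
    library(parallel)
    library(randomForest)

    x <- matrix(runif(500), 100)  # 100 observations, 5 predictors
    y <- gl(2, 50)                # two-level factor response

    # Hypothetical helper: grow one sub-forest with the requested number of trees.
    grow_forest <- function(ntree, x, y) {
      randomForest::randomForest(x, y, ntree = ntree)
    }

    cl <- makeCluster(2)          # assumption: 2 local worker processes
    clusterSetRNGStream(cl, 123)  # give each worker its own reproducible RNG stream

    # Four tasks of 250 trees each; x and y are passed along as extra arguments.
    forests <- parLapply(cl, rep(250, 4), grow_forest, x = x, y = y)
    stopCluster(cl)

    # Merge the four sub-forests into a single randomForest object.
    rf <- do.call(randomForest::combine, forests)
    rf$ntree  # total number of trees in the combined forest
    ```

    Note that, according to the combine documentation, the combined object no longer carries the out-of-bag error estimates (the err.rate and confusion components are NULL), so if you rely on those you would need to evaluate the merged forest on held-out data instead.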
