Updating a Decision Tree With New Data

我们两清 提交于 2019-12-04 05:30:54

问题


I am new to decision trees. I am planning to build a large decision tree that I would like to update later with additional data. What is the best approach to this? Can any decision tree be later updated?


回答1:


Decision trees are most often trained on all available data. That is, when you have new data, you retrain the entire tree. Since this process is very fast it is usually not problematic. If data is too big to fit in memory, you can often get around it by subsampling (row sampling) the training set, since tree-based models don't need that much data to give good results.

Note that decision trees are quite vunerable to overfitting, and you should consider Random Forest or another ensemble method. With bagging it is possible to train different trees on different subsets of data.

There also exists incremental and online learning methods for decision trees. CART, ID3 and VFDT learner are some examples.



来源:https://stackoverflow.com/questions/50996637/updating-a-decision-tree-with-new-data

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!