xgboost

Interview Questions: XGBoost

荒凉一梦 posted on 2020-01-25 21:52:48
Table of contents: A brief introduction to XGBoost / How XGBoost differs from GBDT / Why XGBoost uses a second-order Taylor expansion / Why XGBoost can train in parallel / Why XGBoost is fast / How XGBoost prevents overfitting / How XGBoost handles missing values / Stopping conditions for growing a single tree in XGBoost / Differences between RF and GBDT / How XGBoost handles imbalanced data / Comparing LR and GBDT: in what scenarios does GBDT underperform LR / How XGBoost prunes trees / How XGBoost chooses the best split point / How XGBoost evaluates feature importance / General steps for tuning XGBoost parameters / What to do if an XGBoost model overfits / Why XGBoost is less sensitive to missing values than some other models / Differences between XGBoost and LightGBM

A brief introduction to XGBoost: we first need to talk about GBDT. GBDT is an additive model based on the boosting strategy; training uses the forward stagewise algorithm for greedy learning, and each iteration learns a CART tree to fit the residual between the predictions of the previous t-1 trees and the true values of the training samples. XGBoost applies a series of optimizations to GBDT, such as a second-order Taylor expansion of the loss function, a regularization term added to the objective, support for parallelism, and default handling of missing values. These bring huge improvements in scalability and training speed, but the core idea is largely unchanged.

How XGBoost differs from GBDT: base learner: XGBoost supports not only CART decision trees as base learners but also linear classifiers
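Since the second-order Taylor expansion comes up repeatedly in these interview questions, the per-iteration objective it refers to is worth writing out (a sketch of the standard textbook derivation, not part of the original post):

$$obj^{(t)} = \sum_{i=1}^{n} l\big(y_i,\, \hat{y}_i^{(t-1)} + f_t(x_i)\big) + \Omega(f_t) \approx \sum_{i=1}^{n} \Big[ g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^2(x_i) \Big] + \Omega(f_t) + \text{const}$$

where $g_i$ and $h_i$ are the first and second derivatives of the loss with respect to the previous round's prediction $\hat{y}_i^{(t-1)}$.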

GBDT and XGBoost

情到浓时终转凉″ posted on 2020-01-25 19:56:59
Boosting methods are, in essence, an additive model fit with the forward stagewise algorithm. The AdaBoost algorithm from the previous post can also be expressed with an additive model and the forward stagewise algorithm. A boosting method whose base learners are decision trees is called a boosting tree: for classification problems the trees are CART classification trees, and for regression problems CART regression trees.

1. Forward stagewise algorithm
Introduce the additive model $f(x) = \sum_{m=1}^{M} \beta_m b(x; \gamma_m)$, where $b(x; \gamma_m)$ is a base function with parameters $\gamma_m$ and $\beta_m$ is its coefficient. Given the training data and a loss function $L(y, f(x))$, the additive model can be learned by minimizing the loss: $\min_{\beta_m, \gamma_m} \sum_{i=1}^{N} L\big(y_i, \sum_{m=1}^{M} \beta_m b(x_i; \gamma_m)\big)$. As stated, this is a very complex optimization problem with a large number of parameters to train. The forward stagewise algorithm was proposed to solve it. Its core idea: since the additive model is a sum of several models, and in boosting the models come in a fixed order, we can optimize one model at each addition step. Each step then only needs to learn one base function and one coefficient, stepwise approaching the global optimum, with the per-step loss $\min_{\beta, \gamma} \sum_{i=1}^{N} L\big(y_i, f_{m-1}(x_i) + \beta b(x_i; \gamma)\big)$.

The algorithm proceeds as follows:
1) Initialize $f_0(x) = 0$;
2) At the m-th iteration, minimize the loss: $(\beta_m, \gamma_m) = \arg\min_{\beta, \gamma} \sum_{i=1}^{N} L\big(y_i, f_{m-1}(x_i) + \beta b(x_i; \gamma)\big)$;
3) Update the model: $f_m(x) = f_{m-1}(x) + \beta_m b(x; \gamma_m)$;
4) Obtain the final additive model $f(x) = f_M(x) = \sum_{m=1}^{M} \beta_m b(x; \gamma_m)$.

AdaBoost can also be described by the forward stagewise algorithm: the input is a dataset carrying a weight distribution, and the loss function is the exponential loss.

2. GBDT algorithm
GBDT is the Gradient Boosting Decision Tree
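To make the forward stagewise idea concrete, here is a minimal sketch of gradient boosting for squared loss, where each tree fits the residuals of the current model (illustrative code, not from the original post; assumes scikit-learn):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=200, n_features=5, random_state=0)

learning_rate = 0.1
trees = []
f = np.zeros(len(y))              # f_0(x) = 0

for m in range(100):              # M = 100 boosting rounds
    residual = y - f              # negative gradient of squared loss
    tree = DecisionTreeRegressor(max_depth=3).fit(X, residual)
    trees.append(tree)
    f += learning_rate * tree.predict(X)   # f_m = f_{m-1} + lr * b_m

def predict(X_new):
    """Final additive model: sum of all shrunken trees."""
    return sum(learning_rate * t.predict(X_new) for t in trees)
```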

python xgboost on mac install

会有一股神秘感。 posted on 2020-01-23 06:08:16
Question: I am trying to install xgboost on my Mac for Python 3.4 but I'm getting the following error after "pip3 setup.py install":

    File "<string>", line 20, in <module>
    File "/private/var/folders/_x/rkkz7tjj42g9n8lqq5r0ry000000gn/T/pip-build-2dc6bwf7/xgboost/setup.py", line 28, in <module>
      execfile(libpath_py, libpath, libpath)
    NameError: name 'execfile' is not defined

When running it with the -v option to get the verbose output the error looks like this: Command "python setup.py egg_info" failed
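The error stems from execfile(), which exists only in Python 2. A hedged sketch of the usual Python 3 equivalent of that setup.py line (libpath_py and libpath mirror the names in the traceback; the path is hypothetical):

```python
# Python 3 removed execfile(path, globals, locals); exec(compile(...)) is the
# conventional replacement.
libpath = {}
libpath_py = "xgboost/libpath.py"   # illustrative path for this sketch
with open(libpath_py) as f:
    exec(compile(f.read(), libpath_py, "exec"), libpath, libpath)
```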

XGBoost crashing kernel in jupyter notebook

∥☆過路亽.° posted on 2020-01-23 03:07:13
Question: I don't know how to make the XGBoost classifier work. I am running the code below in a Jupyter notebook, and it always generates this message: "The kernel appears to have died. It will restart automatically."

    from xgboost import XGBClassifier
    model = XGBClassifier()
    model.fit(X, y)

There is no problem with importing the XGBClassifier, but it crashes upon fitting it to my data. X is a 502 by 33 all-numeric dataframe, y is the set of 0 or 1 labels for each row. Does anyone know what could be the
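One way to narrow this down (an illustrative diagnostic, not from the original post) is to fit the same model on synthetic data of the same shape; if this also kills the kernel, the problem is the environment rather than the data:

```python
import numpy as np
from xgboost import XGBClassifier

X = np.random.rand(502, 33)             # same shape as the question's dataframe
y = np.random.randint(0, 2, size=502)   # binary labels, as in the question

model = XGBClassifier(n_estimators=10)
model.fit(X, y)
print("fit completed without crashing")
```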

Which is the loss function for multi-class classification in XGBoost?

二次信任 posted on 2020-01-22 19:52:06
Question: I'm trying to find out which loss function XGBoost uses for multi-class classification. I found in this question the loss function for logistic classification in the binary case. I had thought that for the multi-class case it might be the same as in GBM (for K classes), which can be seen here, where y_k=1 if x's label is k and 0 in any other case, and p_k(x) is the softmax function. However, I have computed the first- and second-order gradients using this loss function and the Hessian doesn't match the
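For reference, the softmax cross-entropy loss the question describes, with the standard per-class derivatives in the raw score $z_k$ (textbook results, not quoted from the post; the diagonal second derivative is the quantity usually compared against XGBoost's hessian):

$$L(y, z) = -\sum_{k=1}^{K} y_k \log p_k(x), \qquad p_k(x) = \frac{e^{z_k}}{\sum_{j=1}^{K} e^{z_j}}, \qquad \frac{\partial L}{\partial z_k} = p_k - y_k, \qquad \frac{\partial^2 L}{\partial z_k^2} = p_k(1 - p_k)$$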

A few small issues encountered when installing xgboost with Anaconda

妖精的绣舞 posted on 2020-01-22 00:29:28
If you can install the package directly with pip install or conda install in the Anaconda prompt, congratulations. I had no such luck and had to download and install it myself, and a few small details cost me extra time, so I am writing them down here.

1. Where to download the package
https://www.lfd.uci.edu/~gohlke/pythonlibs/#xgboost
Download the matching package from the address above. Tags like "cp34" and "cp35" refer to your Python version, so choose according to the version you installed; "win_amd64" means a 64-bit operating system, and likewise "win32" means 32-bit. Pick the package that satisfies both conditions.

2. Where to put the downloaded file
It usually goes in the Scripts folder under your Python installation. If you forget where that folder is, right-click "Computer" on the desktop, then Properties, Advanced system settings, Environment Variables, user variables: the path you configured when installing Anaconda is listed under path.
With the file in place, open a cmd window and run: pip install <the file name you downloaded> (for example: pip install xgboost-0.90-cp36-cp36m-win_amd64.whl)

3. What problems might come up during installation?
1. "but the file does not exist": this is caused by putting the downloaded package in the wrong location
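Once the wheel has installed, a quick import check confirms the package is usable (a trivial verification snippet, added for completeness):

```python
import xgboost
print(xgboost.__version__)  # should print the wheel's version, e.g. 0.90
```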

Ensemble algorithms: GBDT and xgboost

大兔子大兔子 posted on 2020-01-18 01:47:57
As you know, when we build a model we solve for an objective function. The objective function, also called the cost function, appears throughout machine learning and generally takes the form $obj(\theta) = L(\theta) + \Omega(\theta)$, where $L(\theta)$ is the training loss, measuring how the model performs on the training set, and $\Omega(\theta)$ is the regularization penalty, measuring the model's complexity. Training loss: $L = \sum_{i=1}^{n} l(y_i, \hat{y}_i)$. Square loss: $l(y_i, \hat{y}_i) = (y_i - \hat{y}_i)^2$. Logistic loss: $l(y_i, \hat{y}_i) = y_i \ln(1 + e^{-\hat{y}_i}) + (1 - y_i) \ln(1 + e^{\hat{y}_i})$
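The gradient/hessian pairs of such losses are exactly what XGBoost consumes at each boosting round. A minimal sketch wiring the square loss above into the custom-objective hook of xgb.train (illustrative data; treat names and parameters other than the hook itself as assumptions):

```python
import numpy as np
import xgboost as xgb

def squared_error_obj(preds, dtrain):
    """Custom objective: derivatives of l = (y - yhat)^2 w.r.t. the prediction."""
    labels = dtrain.get_label()
    grad = 2.0 * (preds - labels)        # first derivative
    hess = 2.0 * np.ones_like(preds)     # second derivative is constant
    return grad, hess

X = np.random.rand(100, 4)
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * np.random.randn(100)
dtrain = xgb.DMatrix(X, label=y)

booster = xgb.train({"max_depth": 3, "eta": 0.1}, dtrain,
                    num_boost_round=50, obj=squared_error_obj)
```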

How do I free all memory on GPU in XGBoost?

旧城冷巷雨未停 posted on 2020-01-14 22:42:41
Question: Here is my code:

    clf = xgb.XGBClassifier(
        tree_method = 'gpu_hist',
        gpu_id = 0,
        n_gpus = 4,
        random_state = 55,
        n_jobs = -1
    )
    clf.set_params(**params)
    clf.fit(X_train, y_train, **fit_params)

I've read the answers on this question and this git issue but neither worked. I tried to delete the booster in this way:

    clf._Booster.__del__()
    gc.collect()

It deletes the booster but doesn't completely free up GPU memory. I guess it's Dmatrix that is still there but I am not sure. How can I free the whole
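A common workaround (an illustrative pattern, not from the original question) is to run each GPU training job in a child process, since all of a process's GPU memory is returned to the driver when the process exits:

```python
import multiprocessing as mp

def train_job(queue):
    # All GPU allocations made here die with this child process.
    import numpy as np
    import xgboost as xgb
    X = np.random.rand(1000, 20)
    y = np.random.randint(0, 2, size=1000)
    clf = xgb.XGBClassifier(tree_method="gpu_hist", gpu_id=0)
    clf.fit(X, y)
    queue.put(float(clf.score(X, y)))   # send only small results back

if __name__ == "__main__":
    q = mp.Queue()
    p = mp.Process(target=train_job, args=(q,))
    p.start()
    p.join()                 # GPU memory is fully freed once the child exits
    print("score:", q.get())
```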

Need help installing a specific python package using pip

岁酱吖の posted on 2020-01-14 10:07:56
Question: I have seen questions related to mine, but those answers didn't work for me. I am trying to install the xgboost package, but I got this error:

    No files/directories in C:\Users\Fatemeh\AppData\Local\Temp\pip-build-57cpr7io\xgboost\pip-egg-info (from PKG-INFO)

I have tried almost all the options, such as --no-cache-dir and --no-clean, but got the same error. I would appreciate it if you can help me fix this. I tried installing from Github and tried other methods (using cmd and setup.py scripts