xgboost

XGBoost works in PyCharm but not in Jupyter Notebook

Question: I've successfully installed XGBoost on Windows with PyCharm Python, and it works there. In Jupyter Notebook, however, it does not:

    import xgboost as xgb
    ---> 12 import xgboost as xgb
    ModuleNotFoundError: No module named 'xgboost'

Yet inside Jupyter the xgboost package appears to be installed:

    > !pip install xgboost
    Requirement already satisfied: xgboost in c:\users\sifangyou\anaconda3\lib\site-packages\xgboost-0.6-py3.6.egg
    Requirement already satisfied: numpy in c:\users\sifangyou\anaconda3\lib\site-packages
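A common cause of this symptom (the excerpt is truncated before any answer) is that the Jupyter kernel runs a different Python interpreter than the one pip installed xgboost into. A minimal sketch to check and fix this from inside a notebook cell, assuming the notebook can run shell commands with !:

    import sys

    # Show which interpreter the kernel is actually using.
    print(sys.executable)

    # Install xgboost into that exact interpreter's environment,
    # rather than whichever 'pip' happens to be first on PATH.
    !{sys.executable} -m pip install xgboost

    import xgboost as xgb
    print(xgb.__version__)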

XGBoost - H2O crashed due to an illegal memory access

Question: The H2O process crashed while running a grid search with XGBoost:

    terminate called after throwing an instance of 'thrust::system::system_error'
      what():  /tmp/xgboost/plugin/updater_gpu/src/device_helpers.cuh(387): an illegal memory access was encountered

The crash came after the following INFO messages:

    08-17 06:44:46.672 10.0.1.89:54321 14426 FJ-1-3 INFO: Checking convergence with logloss metric: 0.04519170911104479 --> 0.02811784326194906 (still improving)
    08-17 06:44:46.672 10.0.1.89:54321 14426 FJ-1-3 INFO:
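The excerpt is cut off before the grid-search code itself. For orientation, a minimal sketch of the kind of GPU-backed H2O XGBoost grid search that exercises this code path; the file name, column handling, and hyperparameter grid are hypothetical:

    import h2o
    from h2o.estimators import H2OXGBoostEstimator
    from h2o.grid.grid_search import H2OGridSearch

    h2o.init()

    # Hypothetical dataset: response in the last column of train.csv.
    train = h2o.import_file("train.csv")
    x = train.columns[:-1]
    y = train.columns[-1]
    train[y] = train[y].asfactor()  # treat the response as categorical

    grid = H2OGridSearch(
        model=H2OXGBoostEstimator(ntrees=200, backend="gpu", seed=1),
        hyper_params={"max_depth": [4, 6, 8], "learn_rate": [0.05, 0.1]},
    )
    grid.train(x=x, y=y, training_frame=train)

Since device_helpers.cuh belongs to xgboost's GPU updater, setting backend="cpu" is a common way to isolate whether the GPU code path is at fault.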

How to install XGBoost on OSX with multi-threading

Question: I'm trying to install xgboost on my Mac (OSX 10.12.1) following the guide here, but I'm running into some issues.

Step 1: Obtain gcc-6.x.x with openmp support via brew install gcc --without-multilib.

Terminal:

    Ben$ brew install gcc --without-multilib
    Error: gcc-5.3.0 already installed
    To install this version, first `brew unlink gcc`
    Ben$ brew unlink gcc
    Unlinking /usr/local/Cellar/gcc/5.3.0... 1288 symlinks removed
    Ben$ brew install gcc --without-multilib
    [26 minutes later]
    ==> Summary
    🍺 /usr/local
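Once an OpenMP-enabled build is installed, one quick way to confirm that multi-threading actually works is to time the same training run with different nthread settings. A minimal sketch, assuming a working xgboost and numpy install; the synthetic dataset is purely illustrative:

    import time
    import numpy as np
    import xgboost as xgb

    # Synthetic data, large enough that threading matters.
    X = np.random.rand(100000, 50)
    y = np.random.randint(2, size=100000)
    dtrain = xgb.DMatrix(X, label=y)

    for n in (1, 4):
        params = {"objective": "binary:logistic", "nthread": n}
        start = time.time()
        xgb.train(params, dtrain, num_boost_round=20)
        print("nthread=%d: %.1fs" % (n, time.time() - start))

    # With OpenMP working, the nthread=4 run should be noticeably faster.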

xgboost xgb.dump tree coefficients

Question: I have some sample code here:

    data(agaricus.train, package='xgboost')
    train <- agaricus.train
    bst <- xgboost(data = train$data, label = train$label, max.depth = 2,
                   eta = 1, nthread = 2, nround = 2, objective = "binary:logistic")
    xgb.dump(bst, 'xgb.model.dump', with.stats = TRUE)

After building the model, I print it out as:

    booster[0]
    0:[f28<-1.00136e-05] yes=1,no=2,missing=1,gain=4000.53,cover=1628.25
    1:[f55<-1.00136e-05] yes=3,no=4,missing=3,gain=1158.21,cover=924.5
    3:leaf=1.71218,cover=812
    4
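The question uses the R interface. For reference, the Python interface produces the same dump format through Booster.get_dump(); a minimal sketch, with random data standing in for the agaricus set (which ships with the R package):

    import numpy as np
    import xgboost as xgb

    # Toy stand-in for agaricus.train.
    X = np.random.rand(500, 60)
    y = np.random.randint(2, size=500)
    dtrain = xgb.DMatrix(X, label=y)

    params = {"max_depth": 2, "eta": 1, "nthread": 2,
              "objective": "binary:logistic"}
    bst = xgb.train(params, dtrain, num_boost_round=2)

    # One text block per tree, in the same format as the R xgb.dump()
    # output quoted above (gain and cover statistics included).
    for tree in bst.get_dump(with_stats=True):
        print(tree)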

BAT Machine Learning Interview 1000-Question Series

A few notes up front:

1. All of the content in this article comes from the "BAT Machine Learning Interview 1000 Questions" series published by July Online (七月在线).
2. Italicized text marks content I added myself; if there are errors, corrections are welcome.
3. Some links in the original have gone dead, so I have added new ones (also marked in italics); please point out anything inappropriate.
4. Some answers are copied entirely from other blogs, so I only post links to those answers; this saves space and keeps the layout cleaner. Click the corresponding question to jump to it.

Finally, I have tidied up the layout of this post: formulas are written in LaTeX syntax for easier reading, and the links have been reworked to jump directly to the relevant pages, which I hope improves the reading experience. If my editing introduced any mistakes, please point them out so we can all improve together!

1. Please briefly introduce SVM.

SVM stands for support vector machine (支持向量机 in Chinese). It is a data-oriented classification algorithm whose goal is to determine a separating hyperplane that divides the different classes of data.

Extension: support vector machine learning builds models from simple to complex: the linearly separable SVM, the linear SVM, and the nonlinear SVM. When the training data are linearly separable, hard-margin maximization learns a linear classifier, the linearly separable SVM, also called the hard-margin SVM. When the training data are approximately linearly separable, soft-margin maximization likewise learns a linear classifier, the linear SVM, also called the soft-margin SVM. When the training data are not linearly separable, the kernel trick combined with soft-margin maximization learns a nonlinear SVM.
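To make the three regimes concrete, a minimal sketch using scikit-learn (my choice of library, not the article's): the C parameter controls how soft the margin is, and the kernel choice switches between the linear and nonlinear variants:

    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=200, n_features=5, random_state=0)

    # Near-hard margin: a very large C heavily penalizes margin violations.
    hard = SVC(kernel="linear", C=1e6).fit(X, y)

    # Soft margin: a moderate C tolerates some misclassified points.
    soft = SVC(kernel="linear", C=1.0).fit(X, y)

    # Nonlinear SVM: the RBF kernel trick plus a soft margin.
    nonlinear = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)

    print(hard.n_support_, soft.n_support_, nonlinear.n_support_)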

Can someone explain how these scores are derived in these XGBoost trees?

I am looking at the image below. Can someone explain how the scores are calculated? I thought it was -1 for a "no" and +1 for a "yes", but then I can't figure out how the little girl has 0.1. And that doesn't work for tree 2 either.

The values of the leaf elements (aka "scores") - +2, +0.1, -1, +0.9 and -0.9 - were devised by the XGBoost algorithm during training. In this case, the XGBoost model was trained on a dataset where little boys (+2) appear somehow "greater" than little girls (+0.1). If you knew what the response variable was, then you could probably interpret/rationalize those contributions
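To show how such leaf scores turn into a prediction: each tree contributes the score of the leaf an instance falls into, the contributions are summed, and (assuming a binary:logistic objective; with other objectives the raw sum is used directly) the sum is mapped through a sigmoid. A minimal sketch using the leaf values quoted above:

    import math

    def predict_proba(leaf_scores):
        """Sum each tree's leaf score, then map the total through a sigmoid."""
        margin = sum(leaf_scores)
        return 1.0 / (1.0 + math.exp(-margin))

    # An instance landing in the +2 leaf of tree 1 and the +0.9 leaf of tree 2:
    print(predict_proba([2.0, 0.9]))    # ~0.95

    # An instance landing in the -1 leaf of tree 1 and the -0.9 leaf of tree 2:
    print(predict_proba([-1.0, -0.9]))  # ~0.13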

Using the Anaconda3 Docker Image

Assume the local Ubuntu server already has Docker installed. Here is how to get the Anaconda3 Docker image running:

1. Search for the image
Search for the anaconda image we want:

    docker search anaconda

2. Pull the image
We decide to pull the official anaconda3 image, continuumio/anaconda3:

    docker pull continuumio/anaconda3

Note that this image is close to 1 GB, so the download takes a while.

3. Run the image, specifying a network port
Run a bash shell in the anaconda3 image, mapping a container port to the host:

    docker run -i -t -p 12345:8888 continuumio/anaconda3 /bin/bash

where:
-i: run the container in interactive mode, usually used together with -t;
-t: allocate a pseudo-terminal for the container, usually used together with -i;
-p: specify a port mapping in the format host port:container port (the specific numbers here are arbitrary).

This drops you into the anaconda3 command line.

4. Check the Python version

    python

It is currently version 3.7.3.

5. List the installed packages
There are two ways to check; either pip or conda works:

    conda list
    pip list

6. Install xgboost (or any other package)
First, the original image does not appear to include xgboost
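As a quick sanity check after step 6 (a minimal sketch, run inside the container's Python after pip install xgboost):

    # If the install landed in this interpreter, this prints a version string.
    import xgboost as xgb

    print(xgb.__version__)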

XGBoost CV and best iteration

Question: I am using XGBoost cv to find the optimal number of rounds for my model. I would be very grateful if someone could confirm (or refute) that the optimal number of rounds is computed as follows:

    estop = 40
    res = xgb.cv(params, dvisibletrain,
                 num_boost_round=1000000000,
                 nfold=5,
                 early_stopping_rounds=estop,
                 seed=SEED,
                 stratified=True)

    best_nrounds = res.shape[0] - estop
    best_nrounds = int(best_nrounds / 0.8)

That is: the total number of rounds completed is res.shape[0], so to get the optimal number of rounds, we subtract
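One caveat, offered as an assumption since the excerpt ends before any answer: in recent xgboost versions, when early stopping fires, xgb.cv returns a DataFrame already truncated at the best iteration, so len(res) itself is the optimal round count and no estop subtraction is needed. A minimal sketch under that assumption, with an illustrative synthetic dataset:

    import numpy as np
    import xgboost as xgb

    X = np.random.rand(1000, 20)
    y = np.random.randint(2, size=1000)
    dtrain = xgb.DMatrix(X, label=y)

    params = {"objective": "binary:logistic", "eta": 0.1}
    res = xgb.cv(params, dtrain,
                 num_boost_round=10000,
                 nfold=5,
                 early_stopping_rounds=40,
                 seed=0,
                 stratified=True)

    # The returned DataFrame stops at the best iteration when early
    # stopping fires, so its length is the optimal round count.
    best_nrounds = len(res)
    print(best_nrounds)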

Parallel threading with xgboost?

Question: According to its documentation, xgboost has an n_jobs parameter. However, when I attempt to set n_jobs, I get this error:

    TypeError: __init__() got an unexpected keyword argument 'n_jobs'

The same happens with some other parameters, like random_state. I assumed this might be a version issue, but it seems I have the latest version (0.6a2, installed with pip). There isn't much needed to reproduce the error:

    from xgboost import XGBClassifier

    estimator_xGBM = XGBClassifier(max_depth = 5, learning
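A likely explanation, offered as an assumption since the excerpt is cut off before any answer: n_jobs and random_state were added to the scikit-learn wrapper in xgboost 0.7 as renamings of the older nthread and seed parameters, so a 0.6a2 install only understands the old names. A minimal sketch under that assumption:

    from xgboost import XGBClassifier

    # On xgboost <= 0.6, the sklearn wrapper uses the pre-0.7 parameter names:
    # nthread (later renamed n_jobs) and seed (later renamed random_state).
    estimator_xGBM = XGBClassifier(max_depth=5, nthread=4, seed=0)

Upgrading (pip install --upgrade xgboost) would also make the newer names available.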

How to obtain a confidence interval or a measure of prediction dispersion when using xgboost for classification?

Question: How can one obtain a confidence interval, or some other measure of prediction dispersion, when using xgboost for classification? For example, if xgboost predicts that the probability of an event is 0.9, how can the confidence in that probability be obtained? Is this confidence also assumed to be heteroskedastic?

Answer 1: To produce confidence intervals for an xgboost model you should train several models (you can use bagging for this). Each model will produce a response for the test sample - all responses will form a
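A minimal sketch of the bagging approach the answer describes, using the scikit-learn wrapper; the dataset, number of models, and interval width are illustrative choices:

    import numpy as np
    from xgboost import XGBClassifier
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    X_train, y_train = X[:800], y[:800]
    X_test = X[800:]

    rng = np.random.default_rng(0)
    preds = []
    for i in range(50):
        # Bootstrap-resample the training set; fit one model per resample.
        idx = rng.integers(0, len(X_train), len(X_train))
        model = XGBClassifier(n_estimators=100, random_state=i)
        model.fit(X_train[idx], y_train[idx])
        preds.append(model.predict_proba(X_test)[:, 1])

    preds = np.vstack(preds)  # shape: (n_models, n_test_points)

    # Empirical 90% interval of the predicted probability per test point.
    lower, upper = np.percentile(preds, [5, 95], axis=0)
    print(preds.mean(axis=0)[0], (lower[0], upper[0]))

The interval width varies from point to point, which is exactly the heteroskedastic behavior the question asks about.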