lightgbm

LightGBM notes (continuously updated)

Submitted by 孤人 on 2019-12-04 19:23:45
https://github.com/Microsoft/LightGBM
https://github.com/Microsoft/LightGBM/tree/master/python-package (the official Python wrapper)
https://github.com/ArdalanM/pyLightGBM (an unofficial Python wrapper)
LightGBM is regarded as a better open-source GBDT implementation than xgboost, and it comes from Microsoft. For a detailed comparison see http://blog.csdn.net/chary8088/article/details/54316708; someone has also written a head-to-head code benchmark: https://github.com/tks0123456789/XGBoost_vs_LightGBM
For installation on Windows, see http://blog.csdn.net/testcs_dn/article/details/54375316 and http://blog.csdn.net/testcs_dn/article/details/54176824. Note that you need to install Microsoft's MPI package and compile the DLL.
Features: https://github.com/Microsoft/LightGBM/wiki/Features
Configuration and parameter tuning: https://github.com/Microsoft/LightGBM/wiki/Configuration

f1_score metric in lightgbm

Submitted by 自作多情 on 2019-12-03 20:22:20
I want to train an lgb model with a custom metric: f1_score with weighted average. I went through the advanced examples of lightgbm over here and found the implementation of a custom binary error function. I implemented a similar function to return f1_score, as shown below.

def f1_metric(preds, train_data):
    labels = train_data.get_label()
    return 'f1', f1_score(labels, preds, average='weighted'), True

I tried to train the model by passing the feval parameter as f1_metric, as shown below.

evals_results = {}
bst = lgb.train(params, dtrain, valid_sets=[dvalid], valid_names=['valid'], evals_result=evals…
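One likely culprit in this setup is that the preds handed to feval are probabilities (for the built-in binary objective), not class labels, so f1_score ends up comparing labels against continuous values. Below is a minimal sketch of a working variant, assuming a binary task, synthetic data, and the 2019-era LightGBM API in which lgb.train still accepts evals_result (newer releases moved this into callbacks):

import lightgbm as lgb
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

def f1_metric(preds, train_data):
    labels = train_data.get_label()
    # with the built-in binary objective, preds are probabilities, so convert to hard labels first
    return 'f1', f1_score(labels, np.round(preds), average='weighted'), True

X, y = make_classification(n_samples=1000, random_state=42)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.2, random_state=42)
dtrain = lgb.Dataset(X_train, label=y_train)
dvalid = lgb.Dataset(X_valid, label=y_valid, reference=dtrain)

params = {'objective': 'binary', 'metric': 'None'}   # 'None' silences the default metric so only f1 is logged
evals_results = {}
bst = lgb.train(params, dtrain, num_boost_round=50,
                valid_sets=[dvalid], valid_names=['valid'],
                feval=f1_metric, evals_result=evals_results)
print(evals_results['valid']['f1'][-1])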

Fixing the Anaconda error ModuleNotFoundError: No module named 'lightgbm'

Submitted by 安稳与你 on 2019-12-03 03:59:34
lightgbm was installed successfully with pip install lightgbm, yet import lightgbm fails with: ModuleNotFoundError: No module named 'lightgbm'
Cause: pip installed lightgbm into the local system Python environment, while Anaconda's Python lives at a different path and cannot see packages from the system environment, so lightgbm cannot be imported in an Anaconda Jupyter notebook.
Solution:
conda install -c conda-forge lightgbm
This installs the lightgbm package into Anaconda's own Python environment.
Source: CSDN. Author: Jaichg. Link: https://blog.csdn.net/Jiaach/article/details/83068416
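To confirm which interpreter a notebook is actually running, and whether that interpreter can see lightgbm, a quick generic diagnostic (not part of the original post) is:

import sys
print(sys.executable)   # path of the interpreter running this notebook
print(sys.path)         # directories searched for packages

try:
    import lightgbm
    print('lightgbm found at:', lightgbm.__file__)
except ModuleNotFoundError:
    print('lightgbm is not installed in this environment')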

Python: LightGBM cross validation. How to use lightgbm.cv for regression?

Submitted by Anonymous (unverified) on 2019-12-03 00:56:02
Question: I want to do cross validation for a LightGBM model with lgb.Dataset and use early_stopping_rounds. The following approach works without a problem with XGBoost's xgboost.cv. I prefer not to use scikit-learn's approach with GridSearchCV, because it doesn't support early stopping or lgb.Dataset.

import lightgbm as lgb
from sklearn.metrics import mean_absolute_error

dftrainLGB = lgb.Dataset(data=dftrain, label=ytrain, feature_name=list(dftrain))
params = {'objective': 'regression'}
cv_results = lgb.cv(params, dftrainLGB, num_boost_round…
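For reference, a minimal sketch of how lgb.cv is commonly called for regression with early stopping; the synthetic data, metric, and round counts are illustrative assumptions, and the keyword arguments follow the LightGBM 2.x/3.x API of the time (newer releases moved early stopping into callbacks):

import lightgbm as lgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=1000, n_features=20, random_state=42)
dtrain = lgb.Dataset(X, label=y)

params = {'objective': 'regression', 'metric': 'l1', 'learning_rate': 0.1}   # l1 == mean absolute error

cv_results = lgb.cv(
    params,
    dtrain,
    num_boost_round=1000,
    nfold=5,
    early_stopping_rounds=50,   # stop when the mean validation l1 stops improving
    stratified=False,           # stratified folds only make sense for classification labels
    seed=42,
)

# cv_results is a dict of per-round mean/stdv scores, truncated at the best iteration
print('best number of rounds:', len(cv_results['l1-mean']))
print('best CV MAE:', cv_results['l1-mean'][-1])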

Lightgbm OSError, Library not loaded

Submitted by Anonymous (unverified) on 2019-12-03 00:52:01
Question: If I simply do:

import lightgbm as lgb

I'm getting:

python script.py
Traceback (most recent call last):
  File "script.py", line 4, in <module>
    import lightgbm as lgb
  File "/usr/local/lib/python2.7/site-packages/lightgbm/__init__.py", line 8, in <module>
    from .basic import Booster, Dataset
  File "/usr/local/lib/python2.7/site-packages/lightgbm/basic.py", line 31, in <module>
    _LIB = _load_lib()
  File "/usr/local/lib/python2.7/site-packages/lightgbm/basic.py", line 26, in _load_lib
    lib = ctypes.cdll.LoadLibrary(lib_path[0])
  File "/usr/local/Cellar…

XGBoost and LightGBM parameters explained, with practical examples

Submitted by Anonymous (unverified) on 2019-12-03 00:34:01
XGBoost: xgboost.XGBClassifier
booster='gbtree': type of booster to use (gbtree, gblinear, or dart)
silent=True: whether to suppress log output during training
n_jobs=1: number of parallel threads
learning_rate=0.1: learning rate, similar to the step size in gradient descent
max_depth=3: maximum tree depth
gamma=0: minimum loss reduction required to make a further split
n_estimators=100: number of trees to fit, which can be thought of as the number of boosting rounds
min_child_weight=1: minimum weight required in a leaf node
subsample=1: subsampling ratio of the training instances (rows)
colsample_bytree=1: subsampling ratio of the features (columns) per tree
reg_alpha=0: L1 regularization coefficient
reg_lambda=1: L2 regularization coefficient
objective='binary:logistic': the learning task and the corresponding objective function
    "reg:linear" - linear regression
    "reg:logistic" - logistic regression
    "binary:logistic" - binary logistic regression, outputs probabilities
    "binary:logitraw" - binary logistic regression, outputs the score before the logistic transform
    "multi:softmax" - multiclass classification, outputs the predicted class
    "multi:softprob" - multiclass classification, outputs class probabilities
random_state=0: random seed
missing=None: how missing values are handled
max…
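To make the list concrete, here is a minimal sketch that instantiates XGBClassifier with the defaults above written out explicitly; the synthetic data and the train/test split are assumptions for illustration, and silent/missing are omitted because their handling varies across xgboost versions:

import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# the defaults listed above, written out explicitly
clf = xgb.XGBClassifier(
    booster='gbtree',
    n_estimators=100,
    learning_rate=0.1,
    max_depth=3,
    gamma=0,
    min_child_weight=1,
    subsample=1,
    colsample_bytree=1,
    reg_alpha=0,
    reg_lambda=1,
    objective='binary:logistic',
    n_jobs=1,
    random_state=0,
)
clf.fit(X_train, y_train)
print('accuracy:', accuracy_score(y_test, clf.predict(X_test)))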

The simplest way to install and verify LightGBM with GPU support

Submitted by Anonymous (unverified) on 2019-12-03 00:22:01
The following was installed and tested successfully on Ubuntu 16.04 with Python 3.6.5.

1. Install the system dependencies:
sudo apt-get install --no-install-recommends git cmake build-essential libboost-dev libboost-system-dev libboost-filesystem-dev

2. Install the Python libraries:
pip install setuptools wheel numpy scipy scikit-learn -U

3. Install LightGBM with GPU support:
sudo pip3.6 install lightgbm --install-option=--gpu --install-option="--opencl-include-dir=/usr/local/cuda/include/" --install-option="--opencl-library=/usr/local/cuda/lib64/libOpenCL.so"

4. Test. First download the benchmark data and convert the files:
git clone https://github.com/guolinke/boosting_tree_benchmarks.git
cd boosting_tree_benchmarks/data
wget "https://archive.ics.uci…
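Once the package is built, one quick way to check that the GPU build actually works is to train a tiny model with device set to gpu. This is a generic check on synthetic data, not part of the original guide, and it assumes OpenCL platform 0 / device 0:

import lightgbm as lgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=10000, n_features=50, random_state=1)
dtrain = lgb.Dataset(X, label=y)

params = {
    'objective': 'binary',
    'device': 'gpu',        # use the GPU tree learner
    'gpu_platform_id': 0,   # assumed OpenCL platform
    'gpu_device_id': 0,     # assumed OpenCL device
}

# if the GPU build is broken, this raises LightGBMError instead of training
bst = lgb.train(params, dtrain, num_boost_round=10)
print('trained', bst.num_trees(), 'trees on the GPU')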

Python machine learning case study tutorial series: the LightGBM algorithm

Submitted by Anonymous (unverified) on 2019-12-02 22:11:45
pip install lightgbm

GitHub: https://github.com/Microsoft/LightGBM
Chinese documentation: http://lightgbm.apachecn.org/cn/latest/index.html

The arrival of xgboost let data workers wave goodbye to the traditional machine learning algorithms: RF, GBM, SVM, LASSO and so on. Now Microsoft has released a new boosting framework that challenges xgboost's position. As the name suggests, lightGBM combines two ideas: light, i.e. lightweight, and GBM, the gradient boosting machine. LightGBM is a gradient boosting framework that uses tree-based learning algorithms. It is distributed and efficient, with the following advantages:
faster training,
lower memory usage,
higher accuracy,
support for parallel learning,
the ability to handle large-scale data.
The shortcomings of xgboost that it sets out to address:
Every boosting iteration has to scan the entire training data several times. Keeping the whole training set in memory limits how much data can be used; not keeping it in memory means repeatedly reading and writing the data, which costs a great deal of time.
The pre-sorted method: first, it consumes a lot of memory, since the algorithm has to store the feature values as well as the result of sorting each feature (for example the sorted indices…
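As a minimal getting-started sketch with the package installed above (synthetic data and hyperparameters are illustrative assumptions, not from the article):

import lightgbm as lgb
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=30, random_state=7)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=7)

# scikit-learn style interface; histogram-based training is the default
clf = lgb.LGBMClassifier(n_estimators=200, learning_rate=0.05, num_leaves=31)
clf.fit(X_train, y_train)
print('accuracy:', accuracy_score(y_test, clf.predict(X_test)))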

AccuAir: Winning Solution to Air Quality Prediction for KDD Cup 2018 (KDD 2019)

Submitted by 筅森魡賤 on 2019-12-02 05:05:17
Goal: predict air quality from air-quality measurements, meteorology, spatial topology, weather forecasts, station information, and time information.
Difficulties: many influencing factors; the interactions between variables are non-linear and have spatio-temporal structure; the noise is abrupt; and unknown variables also play a role.
Solution: build three models, LightGBM, a spatial-temporal gated DNN, and a Seq2Seq model, and train each on the available data; then train a linear model that merges the outputs of the three models into the final prediction.
As an aside, ensemble learning is used in all kinds of competitions and is practically a must for climbing the leaderboard.
Related work: the related-work section covers meteorological models, static learning models, and deep learning models (based on time series) for air quality prediction, and argues that the key to this task is a strategy that fuses multiple kinds of spatio-temporal information.
Proposed method: the overall model architecture is shown in the figure below.
LightGBM: the feature selector, relatively stable.
Spatial-temporal gated DNN: able to model spatio-temporal responses.
Seq2Seq model: encodes the input and decodes the output.
LightGBM is the basic baseline, the spatial-temporal gated DNN extracts spatio-temporal information, and the Seq2Seq model handles encoding and decoding and responds well to rapidly changing inputs. The paper trains LightGBM in four steps…
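The final blending step described above, a linear model over the outputs of the three base models, could look roughly like the following sketch; the predictions here are random placeholders and the paper's actual stacking procedure may differ:

import numpy as np
from sklearn.linear_model import LinearRegression

# placeholder held-out predictions from the three base models
pred_lgb = np.random.rand(500)   # LightGBM predictions
pred_dnn = np.random.rand(500)   # spatial-temporal gated DNN predictions
pred_s2s = np.random.rand(500)   # Seq2Seq predictions
y_true   = np.random.rand(500)   # observed air-quality values

# stack the three prediction vectors as features and fit a linear blender
X_blend = np.column_stack([pred_lgb, pred_dnn, pred_s2s])
blender = LinearRegression().fit(X_blend, y_true)

# the final prediction is the learned linear combination of the base models
final_pred = blender.predict(X_blend)
print('blend weights:', blender.coef_, 'intercept:', blender.intercept_)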