Question
Looking on the web I am still confused about what the linear booster gblinear precisely is, and I am not alone.
According to the documentation it only has 3 parameters: lambda, lambda_bias and alpha - maybe it should say "additional parameters".
If I understand this correctly, the linear booster does (rather standard) linear boosting with regularization.
In this context I can only make sense of the 3 parameters above and eta (the learning rate).
That's also how it is described on github.
Nevertheless I see that the tree parameters gamma, max_depth and min_child_weight also have an impact on the algorithm.
How can this be? Is there a totally clear description of the linear booster anywhere on the web?
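To make concrete what I mean by "rather standard linear boosting with regularization", here is a rough sketch (my own toy code, certainly not xgboost's actual gblinear implementation) of where eta, lambda, lambda_bias and alpha would enter such an algorithm; note that gamma, max_depth and min_child_weight have no obvious place in it:
# Toy sketch of regularized linear boosting for logistic loss.
# X is a plain numeric matrix, y a 0/1 vector; all names are illustrative only.
linear_boost_sketch <- function(X, y, nrounds = 5, eta = 0.5,
                                lambda = 1, alpha = 0, lambda_bias = 1) {
  w <- rep(0, ncol(X))  # feature weights
  b <- 0                # bias term
  for (r in seq_len(nrounds)) {
    p <- as.vector(1 / (1 + exp(-(X %*% w + b))))  # logistic predictions
    g <- p - y                                     # gradient of the log-loss
    # elastic-net style regularized gradient step, shrunk by eta
    grad_w <- as.vector(t(X) %*% g) / nrow(X) + lambda * w + alpha * sign(w)
    grad_b <- mean(g) + lambda_bias * b
    w <- w - eta * grad_w
    b <- b - eta * grad_b
  }
  list(weights = w, bias = b)
}
# e.g. fit <- linear_boost_sketch(as.matrix(train$data), train$label)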
See my examples:
library(xgboost)
data(agaricus.train, package='xgboost')
data(agaricus.test, package='xgboost')
train <- agaricus.train
test <- agaricus.test
Then the setup
set.seed(100)
model <- xgboost(data = train$data, label = train$label, nrounds = 5,
objective = "binary:logistic",
params = list(booster = "gblinear", eta = 0.5, lambda = 1, lambda_bias = 1, gamma = 2,
early_stopping_rounds = 3))
gives
> [1] train-error:0.018271 [2] train-error:0.003071
> [3] train-error:0.001075 [4] train-error:0.001075
> [5] train-error:0.000614
while gamma = 1
set.seed(100)
model <- xgboost(data = train$data, label = train$label, nrounds = 5,
objective = "binary:logistic",
params = list(booster = "gblinear", eta = 0.5, lambda = 1, lambda_bias = 1, gamma = 1,
early_stopping_rounds = 3))
leads to
> [1] train-error:0.013051 [2] train-error:0.001842
> [3] train-error:0.001075 [4] train-error:0.001075
> [5] train-error:0.001075
which takes a different optimization "path".
Similarly for max_depth:
set.seed(100)
model <- xgboost(data = train$data, label = train$label, nrounds = 5,
objective = "binary:logistic",
params = list(booster = "gblinear", eta = 0.5, lambda = 1, lambda_bias = 1, max_depth = 3,
early_stopping_rounds = 3))
> [1] train-error:0.016122 [2] train-error:0.002764
> [3] train-error:0.001075 [4] train-error:0.001075
> [5] train-error:0.000768
and
set.seed(100)
model <- xgboost(data = train$data, label = train$label, nrounds = 10,
objective = "binary:logistic",
params = list(booster = "gblinear", eta = 0.5, lambda = 1, lambda_bias = 1, max_depth = 4,
early_stopping_rounds = 3))
> [1] train-error:0.014740 [2] train-error:0.004453
> [3] train-error:0.001228 [4] train-error:0.000921
> [5] train-error:0.000614
Answer 1:
I may as well do some squats between running the gblinear, observe the results change almost every time, and claim that doing squats has an impact on the algorithm :)
In all seriousness, the algorithm that gblinear currently uses is not your "rather standard linear boosting". The thing responsible for the stochasticity is the use of lock-free ('hogwild') parallelization when updating the gradients during each iteration. Setting the seed doesn't affect anything; you only get consistently reproducible results when running single-threaded (nthread = 1). I would also advise against running it with the default nthread setting, which uses the maximum possible number of OpenMP threads, as on many systems this results in much slower speed due to thread congestion. nthread should not be set higher than the number of physical cores.
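As a quick sanity check (a sketch using the standard R interface; the parameter placement mirrors the question's style), two single-threaded runs should produce identical training errors, whereas multi-threaded runs generally will not:
library(xgboost)
data(agaricus.train, package = "xgboost")
train <- agaricus.train

# train the same gblinear model twice with a single thread
run_once <- function() {
  xgboost(data = train$data, label = train$label, nrounds = 5, verbose = 0,
          objective = "binary:logistic",
          params = list(booster = "gblinear", eta = 0.5, lambda = 1,
                        lambda_bias = 1, nthread = 1))
}

m1 <- run_once()
m2 <- run_once()
identical(m1$evaluation_log, m2$evaluation_log)  # expected TRUE when nthread = 1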
This free stochasticity might improve predictive performance in some situations. However, the pros frequently don't outweigh the cons. At some point, I will submit a pull request with an option for deterministic parallelization and an option for some additional control over feature selection at each boosting round.
For ground truth on all the available parameters that are specific to booster training, refer to the source of struct GBLinearTrainParam for gblinear and to the source of struct TrainParam for gbtree.
Source: https://stackoverflow.com/questions/42783098/xgboost-which-parameters-are-used-in-the-linear-booster-gblinear