optuna.integration.lightgbm.LightGBMTuner

class optuna.integration.lightgbm.LightGBMTuner(params, train_set, num_boost_round=1000, valid_sets=None, valid_names=None, fobj=None, feval=None, feature_name='auto', categorical_feature='auto', early_stopping_rounds=None, evals_result=None, verbose_eval='warn', learning_rates=None, keep_training_booster=False, callbacks=None, time_budget=None, sample_size=None, study=None, optuna_callbacks=None, model_dir=None, verbosity=None, show_progress_bar=True, *, optuna_seed=None)

Hyperparameter tuner for LightGBM.

It optimizes the following hyperparameters in a stepwise manner: lambda_l1, lambda_l2, num_leaves, feature_fraction, bagging_fraction, bagging_freq and min_child_samples.

You can find the details of the algorithm and benchmark results in this blog article by Kohei Ozaki, a Kaggle Grandmaster.

Arguments and keyword arguments for lightgbm.train() can be passed. The arguments specific to LightGBMTuner are listed below:

Parameters
  • time_budget (Optional[int]) – A time budget for parameter tuning in seconds.

  • study (Optional[Study]) – A Study instance to store optimization results. Each Trial instance in it has the following user attributes: elapsed_secs is the elapsed time since the optimization started. average_iteration_time is the average time per iteration to train the booster model in the trial. lgbm_params is a JSON-serialized dictionary of the LightGBM parameters used in the trial.

  • optuna_callbacks (Optional[List[Callable[[Study, FrozenTrial], None]]]) – List of Optuna callback functions that are invoked at the end of each trial. Each function must accept two parameters with the following types, in this order: Study and FrozenTrial. Note that this is not the callbacks argument of lightgbm.train().

  • model_dir (Optional[str]) – A directory in which to save boosters. By default, it is set to None and no boosters are saved. Set a shared directory (e.g., a directory on NFS) if you want to call get_best_booster() in distributed environments; otherwise, it may raise ValueError. If the directory does not exist, it will be created. The boosters are saved as {model_dir}/{trial_number}.pkl (e.g., ./boosters/0.pkl).

  • verbosity (Optional[int]) –

    A verbosity level to change Optuna’s logging level. The level is aligned to LightGBM’s verbosity.

    Warning

    Deprecated in v2.0.0. The verbosity argument will be removed in the future. Its removal is currently scheduled for v4.0.0, but this schedule is subject to change.

    Please use set_verbosity() instead.

  • show_progress_bar (bool) –

    Whether to show progress bars. To disable them, set this to False.

    Note

    Progress bars will be fragmented by logging messages of LightGBM and Optuna. Please suppress such messages to show the progress bars properly.

  • optuna_seed (Optional[int]) –

    Seed of the TPESampler random number generator, which affects sampling for num_leaves, bagging_fraction, bagging_freq, lambda_l1, and lambda_l2.

    Note

    The deterministic parameter of LightGBM makes training reproducible. Please enable it when you use this argument.

  • params (Dict[str, Any]) –

  • train_set (lgb.Dataset) –

  • num_boost_round (int) –

  • valid_sets (Optional[VALID_SET_TYPE]) –

  • valid_names (Optional[Any]) –

  • fobj (Optional[Callable[[...], Any]]) –

  • feval (Optional[Callable[[...], Any]]) –

  • feature_name (str) –

  • categorical_feature (str) –

  • early_stopping_rounds (Optional[int]) –

  • evals_result (Optional[Dict[Any, Any]]) –

  • verbose_eval (Optional[Union[bool, int, str]]) –

  • learning_rates (Optional[List[float]]) –

  • keep_training_booster (bool) –

  • callbacks (Optional[List[Callable[[...], Any]]]) –

  • sample_size (Optional[int]) –
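
A minimal sketch of how the tuner-specific arguments above might be combined. It assumes an Optuna release whose LightGBMTuner signature matches this page and an installed lightgbm package. The helper names base_params, log_trial and run_tuning, the objective and metric, and all numeric values are illustrative assumptions, not part of the API.

```python
from typing import Any, Dict, List, Tuple

# Filled by the callback below; purely illustrative.
trial_log: List[Tuple[int, Any]] = []


def base_params() -> Dict[str, Any]:
    """Fixed LightGBM parameters (assumed values for a binary task)."""
    return {
        "objective": "binary",
        "metric": "binary_logloss",
        "verbosity": -1,
        # The note on optuna_seed asks for deterministic training.
        "deterministic": True,
    }


def log_trial(study, trial) -> None:
    """Optuna callback: receives (Study, FrozenTrial) after each trial."""
    trial_log.append((trial.number, trial.value))


def run_tuning(X_train, y_train, X_valid, y_valid):
    # Imported lazily so the helpers above work without the optional packages.
    import lightgbm
    from optuna.integration.lightgbm import LightGBMTuner

    train_set = lightgbm.Dataset(X_train, label=y_train)
    valid_set = lightgbm.Dataset(X_valid, label=y_valid, reference=train_set)

    tuner = LightGBMTuner(
        base_params(),
        train_set,
        valid_sets=[valid_set],
        num_boost_round=100,
        time_budget=600,               # seconds
        optuna_callbacks=[log_trial],  # not lightgbm.train() callbacks
        model_dir="./boosters",        # saved as ./boosters/{trial_number}.pkl
        show_progress_bar=False,
        optuna_seed=42,
    )
    tuner.run()
    return tuner.best_params, tuner.best_score
```

Calling run_tuning with training and validation arrays would return the best parameters and score; trial_log then holds one (trial number, value) pair per finished trial.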

Methods

compare_validation_metrics(val_score, best_score)

get_best_booster()

Return the best booster.

higher_is_better()

run()

Perform hyperparameter tuning with the given parameters.

sample_train_set()

Make a subset of the self.train_set Dataset object.

tune_bagging([n_trials])

tune_feature_fraction([n_trials])

tune_feature_fraction_stage2([n_trials])

tune_min_data_in_leaf()

tune_num_leaves([n_trials])

tune_regularization_factors([n_trials])

Attributes

best_params

Return parameters of the best booster.

best_score

Return the score of the best booster.

property best_params: Dict[str, Any]

Return parameters of the best booster.

property best_score: float

Return the score of the best booster.

get_best_booster()

Return the best booster.

If the best booster cannot be found, ValueError is raised. To prevent this error when you resume tuning or run tuning in parallel, save boosters by specifying the model_dir argument of __init__().

Return type

Booster
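
The model_dir naming pattern described for the constructor can be sketched as follows. This is an illustration only: booster_path and load_saved_booster are hypothetical helpers, and reading the file with pickle is an assumption based on the .pkl extension. In normal use, get_best_booster() is the documented way to obtain the booster.

```python
import os
import pickle


def booster_path(model_dir: str, trial_number: int) -> str:
    """Build a path following the {model_dir}/{trial_number}.pkl pattern."""
    return os.path.join(model_dir, f"{trial_number}.pkl")


def load_saved_booster(model_dir: str, trial_number: int):
    """Load one saved object.

    Assumption: the file is a plain pickle. Only unpickle files you trust.
    """
    with open(booster_path(model_dir, trial_number), "rb") as f:
        return pickle.load(f)
```

For example, booster_path("./boosters", 0) points at the file written for trial 0 when model_dir="./boosters" was passed to the tuner.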

run()

Perform hyperparameter tuning with the given parameters.

Return type

None
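
Once run() has returned, the trials stored in the study passed to the constructor carry the user attributes described for the study argument. A small sketch of reading them, where summarize_trial is a hypothetical helper and trial is any object exposing number and user_attrs, as Optuna trials do:

```python
import json
from typing import Any, Dict


def summarize_trial(trial) -> Dict[str, Any]:
    """Collect the documented user attributes of one tuner trial."""
    attrs = trial.user_attrs
    return {
        "number": trial.number,
        "elapsed_secs": attrs["elapsed_secs"],
        "average_iteration_time": attrs["average_iteration_time"],
        # lgbm_params is stored as a JSON string, so decode it into a dict.
        "lgbm_params": json.loads(attrs["lgbm_params"]),
    }
```

With a study that was passed to the tuner, [summarize_trial(t) for t in study.trials] would give one summary per trial.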

sample_train_set()

Make a subset of the self.train_set Dataset object.

Return type

None