optuna.pruners.HyperbandPruner

class optuna.pruners.HyperbandPruner(min_resource=1, max_resource='auto', reduction_factor=3, bootstrap_count=0)[source]

Pruner using Hyperband.

SuccessiveHalving (SHA) requires the number of configurations \(n\) as a hyperparameter. For a given finite budget \(B\), each of the \(n\) configurations receives \(B \over n\) resources on average, so there is a trade-off between \(B\) and \(B \over n\): for instance, with \(B = 81\), choosing \(n = 81\) gives each configuration \(1\) unit of resource, while \(n = 3\) gives each \(27\). Hyperband attacks this trade-off by trying several different values of \(n\) for a fixed budget.

Note

If you use HyperbandPruner with TPESampler, it's recommended to set a larger n_trials or timeout to make full use of TPESampler's characteristics, because TPESampler spends some trials (by default, \(10\)) on its startup phase.

Because Hyperband runs multiple SuccessiveHalvingPruner instances and collects trials based on the current Trial's bracket ID, each bracket needs to observe more than \(10\) Trials before TPESampler can adapt its search space.

Thus, for example, if HyperbandPruner has \(4\) internal pruners (brackets), at least \(4 \times 10\) trials are consumed for startup, as the sketch below illustrates.
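A minimal sketch of such a setup (the resource settings and the trial budget below are illustrative choices, not recommendations):

import optuna

# With min_resource=1, max_resource=100, and reduction_factor=3 the
# pruner has floor(log_3(100 / 1)) + 1 = 5 brackets, so roughly
# 5 x 10 = 50 trials are spent on TPESampler's random startup alone.
study = optuna.create_study(
    sampler=optuna.samplers.TPESampler(),  # n_startup_trials defaults to 10
    pruner=optuna.pruners.HyperbandPruner(
        min_resource=1, max_resource=100, reduction_factor=3
    ),
)
# Budget well beyond the startup trials, e.g.:
# study.optimize(objective, n_trials=200)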

Note

Hyperband contains several SuccessiveHalvingPruner instances; each SuccessiveHalvingPruner is referred to as a "bracket" in the original paper. The number of brackets is an important factor controlling the early stopping behavior of Hyperband and is automatically determined from min_resource, max_resource and reduction_factor as \(\text{number of brackets} = \lfloor \log_{\eta}(\text{max\_resource} / \text{min\_resource}) \rfloor + 1\), where \(\eta\) is reduction_factor. Please set reduction_factor so that the number of brackets is not too large (about 4 to 6 in most use cases). Please see Section 3.6 of the original paper for details.
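As a quick sanity check, the bracket count can be computed directly from the three arguments; this standalone snippet simply evaluates the formula above with illustrative values:

import math


def n_hyperband_brackets(min_resource, max_resource, reduction_factor):
    # floor(log_eta(max_resource / min_resource)) + 1
    return math.floor(math.log(max_resource / min_resource, reduction_factor)) + 1


# The settings used in the example below yield 5 brackets.
print(n_hyperband_brackets(min_resource=1, max_resource=100, reduction_factor=3))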

See also

Please refer to optuna.trial.Trial.report().

Example

We maximize the validation accuracy of a classifier with the Hyperband pruning algorithm.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

import optuna

X, y = load_iris(return_X_y=True)
X_train, X_valid, y_train, y_valid = train_test_split(X, y)
classes = np.unique(y)
n_train_iter = 100


def objective(trial):
    alpha = trial.suggest_float("alpha", 0.0, 1.0)
    clf = SGDClassifier(alpha=alpha)

    for step in range(n_train_iter):
        # Train incrementally; each step is one unit of "resource".
        clf.partial_fit(X_train, y_train, classes=classes)

        # Report the intermediate validation score so the pruner can
        # decide whether to stop this trial early.
        intermediate_value = clf.score(X_valid, y_valid)
        trial.report(intermediate_value, step)

        if trial.should_prune():
            raise optuna.TrialPruned()

    return clf.score(X_valid, y_valid)


study = optuna.create_study(
    direction="maximize",
    pruner=optuna.pruners.HyperbandPruner(
        min_resource=1, max_resource=n_train_iter, reduction_factor=3
    ),
)
study.optimize(objective, n_trials=20)

Parameters
  • min_resource (int) – A parameter for specifying the minimum resource allocated to a trial, denoted as \(r\) in the paper. A smaller \(r\) gives a result faster, while a larger \(r\) gives a better guarantee of correctly judging between configurations. See SuccessiveHalvingPruner for details.

  • max_resource (Union[str, int]) –

    A parameter for specifying the maximum resource allocated to a trial. \(R\) in the paper corresponds to max_resource / min_resource. This value should match the maximum number of iteration steps (e.g., the number of epochs when training a neural network). When this argument is "auto", the maximum resource is estimated from the completed trials. The default value of this argument is "auto".

    Note

    With "auto", the maximum resource is set to the largest step reported by report() in the first completed trial (or one of the first, when trials run in parallel). No trials are pruned until the maximum resource is determined.

    Note

    If the step of the last intermediate value may change from trial to trial, please manually specify the maximum possible step as max_resource; see the sketch after this parameter list.

  • reduction_factor (int) – A parameter for specifying the reduction factor of promotable trials, denoted as \(\eta\) in the paper. See SuccessiveHalvingPruner for details.

  • bootstrap_count (int) – A parameter specifying the number of trials required in a rung before any trial can be promoted. Incompatible with max_resource="auto". See SuccessiveHalvingPruner for details.
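A hedged sketch of the situation from the max_resource note above: the number of reported steps varies per trial (here, a hypothetical search over the training length itself), so max_resource is set to the largest possible step instead of "auto". The toy objective is purely illustrative:

import optuna

MAX_EPOCHS = 100  # the largest step any trial can possibly report


def objective(trial):
    # The number of reported steps varies per trial, so "auto" could
    # lock in an underestimated maximum resource.
    n_epochs = trial.suggest_int("n_epochs", 10, MAX_EPOCHS)
    x = trial.suggest_float("x", -10.0, 10.0)
    value = 0.0
    for step in range(n_epochs):
        value = -((x - 2.0) ** 2) + step / MAX_EPOCHS  # toy intermediate value
        trial.report(value, step)
        if trial.should_prune():
            raise optuna.TrialPruned()
    return value


study = optuna.create_study(
    direction="maximize",
    pruner=optuna.pruners.HyperbandPruner(
        min_resource=1,
        max_resource=MAX_EPOCHS,  # explicit, since the last step differs per trial
        reduction_factor=3,
    ),
)
study.optimize(objective, n_trials=20)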

Methods

prune(study, trial)[source]

Judge whether the trial should be pruned based on the reported values.

Note that this method is not supposed to be called by library users. Instead, optuna.trial.Trial.report() and optuna.trial.Trial.should_prune() provide the user interface for implementing a pruning mechanism inside an objective function.

Parameters
  • study (Study) – Study object of the target study.

  • trial (FrozenTrial) – FrozenTrial object of the target trial. Take a copy before modifying this object.

Returns

A boolean value representing whether the trial should be pruned.

Return type

bool