optuna.pruners.HyperbandPruner
- class optuna.pruners.HyperbandPruner(min_resource=1, max_resource='auto', reduction_factor=3, bootstrap_count=0)[source]
Pruner using Hyperband.
SuccessiveHalving (SHA) requires the number of configurations \(n\) as a hyperparameter. For a given finite budget \(B\), each configuration receives \(B \over n\) resources on average, so there is a trade-off between \(B\) and \(B \over n\): for a fixed \(B\), a small \(n\) trains few configurations thoroughly, while a large \(n\) tries many configurations only briefly. Hyperband attacks this trade-off by trying different \(n\) values for a fixed budget.
Note
In the Hyperband paper, the counterpart of RandomSampler is used. Optuna uses TPESampler by default. The benchmark result shows that optuna.pruners.HyperbandPruner supports both samplers.
Note
If you use HyperbandPruner with TPESampler, it’s recommended to consider setting a larger n_trials or timeout to make full use of the characteristics of TPESampler because TPESampler uses some (by default, \(10\)) Trials for its startup. As Hyperband runs multiple SuccessiveHalvingPruners and collects trials based on the current Trial’s bracket ID, each bracket needs to observe more than \(10\) Trials for TPESampler to adapt its search space. Thus, for example, if HyperbandPruner has \(4\) pruners in it, at least \(4 \times 10\) trials are consumed for startup.
Note
Hyperband has several SuccessiveHalvingPruners. Each SuccessiveHalvingPruner is referred to as a “bracket” in the original paper. The number of brackets is an important factor controlling the early stopping behavior of Hyperband and is automatically determined by min_resource, max_resource and reduction_factor as \(\mathrm{The\ number\ of\ brackets} = \mathrm{floor}(\log_{\texttt{reduction}\_\texttt{factor}} (\frac{\texttt{max}\_\texttt{resource}}{\texttt{min}\_\texttt{resource}})) + 1\). Please set reduction_factor so that the number of brackets is not too large (about 4 – 6 in most use cases). Please see Section 3.6 of the original paper for details.
Example
We minimize an objective function with the Hyperband pruning algorithm.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

import optuna

X, y = load_iris(return_X_y=True)
X_train, X_valid, y_train, y_valid = train_test_split(X, y)
classes = np.unique(y)
n_train_iter = 100


def objective(trial):
    alpha = trial.suggest_float("alpha", 0.0, 1.0)
    clf = SGDClassifier(alpha=alpha)

    for step in range(n_train_iter):
        clf.partial_fit(X_train, y_train, classes=classes)

        intermediate_value = clf.score(X_valid, y_valid)
        trial.report(intermediate_value, step)

        if trial.should_prune():
            raise optuna.TrialPruned()

    return clf.score(X_valid, y_valid)


study = optuna.create_study(
    direction="maximize",
    pruner=optuna.pruners.HyperbandPruner(
        min_resource=1, max_resource=n_train_iter, reduction_factor=3
    ),
)
study.optimize(objective, n_trials=20)
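The bracket-count formula from the note above can be checked numerically for the example’s settings. The following sketch uses only the standard library; the helper name n_hyperband_brackets is ours for illustration, not part of Optuna’s API:

```python
import math


def n_hyperband_brackets(min_resource, max_resource, reduction_factor):
    """floor(log_eta(max_resource / min_resource)) + 1, per the note above."""
    return math.floor(math.log(max_resource / min_resource, reduction_factor)) + 1


# Settings from the example: min_resource=1, max_resource=100, reduction_factor=3.
brackets = n_hyperband_brackets(1, 100, 3)
print(brackets)  # -> 5

# With TPESampler's default of 10 startup trials per bracket, roughly
# brackets * 10 trials are consumed before the sampler adapts its search
# space, so n_trials=20 in the example leaves TPESampler mostly in startup.
print(brackets * 10)  # -> 50
```

This also makes concrete why the note above recommends a larger n_trials or timeout when combining HyperbandPruner with TPESampler.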
- Parameters:
min_resource (int) – A parameter for specifying the minimum resource allocated to a trial, noted as \(r\) in the paper. A smaller \(r\) will give a result faster, but a larger \(r\) will give a better guarantee of successful judging between configurations. See the details for SuccessiveHalvingPruner.
max_resource (str | int) – A parameter for specifying the maximum resource allocated to a trial. \(R\) in the paper corresponds to max_resource / min_resource. This value should match the maximum number of iteration steps (e.g., the number of epochs for neural networks). When this argument is “auto”, the maximum resource is estimated according to the completed trials. The default value of this argument is “auto”.
Note
With “auto”, the maximum resource will be the largest step reported by report() in the first completed trial (or one of the first, when trials run in parallel). No trials will be pruned until the maximum resource is determined.
Note
If the step of the last intermediate value may change with each trial, please manually specify the maximum possible step as max_resource.
reduction_factor (int) – A parameter for specifying the reduction factor of promotable trials, noted as \(\eta\) in the paper. See the details for SuccessiveHalvingPruner.
bootstrap_count (int) – Parameter specifying the number of trials required in a rung before any trial can be promoted. Incompatible with max_resource set to "auto". See the details for SuccessiveHalvingPruner.
Methods
prune(study, trial)
Judge whether the trial should be pruned based on the reported values.
- prune(study, trial)[source]
Judge whether the trial should be pruned based on the reported values.
Note that this method is not supposed to be called by library users. Instead, optuna.trial.Trial.report() and optuna.trial.Trial.should_prune() provide user interfaces to implement a pruning mechanism in an objective function.
- Parameters:
study (Study) – Study object of the target study.
trial (FrozenTrial) – FrozenTrial object of the target trial. Take a copy before modifying this object.
- Returns:
A boolean value representing whether the trial should be pruned.
- Return type:
bool