Pruners

class optuna.pruners.BasePruner
Base class for pruners.

abstract prune(study, trial)
Judge whether the trial should be pruned based on the reported values.
Note that this method is not supposed to be called by library users. Instead, optuna.trial.Trial.report() and optuna.trial.Trial.should_prune() provide user interfaces to implement a pruning mechanism in an objective function.
Parameters
study – Study object of the target study.
trial – FrozenTrial object of the target trial. Take a copy before modifying this object.
 Returns
A boolean value representing whether the trial should be pruned.


class optuna.pruners.MedianPruner(n_startup_trials=5, n_warmup_steps=0, interval_steps=1)
Pruner using the median stopping rule.
Prune if the trial’s best intermediate result is worse than the median of intermediate results of previous trials at the same step.
Example
We maximize an objective function (validation accuracy) with the median stopping rule.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

import optuna

X, y = load_iris(return_X_y=True)
X_train, X_valid, y_train, y_valid = train_test_split(X, y)
classes = np.unique(y)


def objective(trial):
    alpha = trial.suggest_uniform('alpha', 0.0, 1.0)
    clf = SGDClassifier(alpha=alpha)
    n_train_iter = 100

    for step in range(n_train_iter):
        clf.partial_fit(X_train, y_train, classes=classes)

        intermediate_value = clf.score(X_valid, y_valid)
        trial.report(intermediate_value, step)

        if trial.should_prune():
            raise optuna.TrialPruned()

    return clf.score(X_valid, y_valid)


study = optuna.create_study(
    direction='maximize',
    pruner=optuna.pruners.MedianPruner(
        n_startup_trials=5, n_warmup_steps=30, interval_steps=10
    ),
)
study.optimize(objective, n_trials=20)
 Parameters
n_startup_trials – Pruning is disabled until the given number of trials finish in the same study.
n_warmup_steps – Pruning is disabled until the trial exceeds the given number of steps.
interval_steps – Interval in number of steps between the pruning checks, offset by the warmup steps. If no value has been reported at the time of a pruning check, that particular check will be postponed until a value is reported.
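The median stopping rule described above can be sketched in plain Python. This is a simplified illustration, not Optuna's implementation: the function name `median_rule_should_prune` and its arguments are hypothetical, and the startup-trial, warmup, and interval logic is left out.

```python
from statistics import median


def median_rule_should_prune(best_so_far, others_at_step, direction="maximize"):
    """Return True if `best_so_far` is worse than the median of `others_at_step`.

    `best_so_far` is the current trial's best intermediate result and
    `others_at_step` holds the values previous trials reported at the same step.
    Hypothetical helper for illustration only.
    """
    if not others_at_step:
        # Nothing to compare against, so never prune.
        return False
    threshold = median(others_at_step)
    if direction == "maximize":
        return best_so_far < threshold
    return best_so_far > threshold


# Accuracies that three earlier trials reported at the same step.
previous = [0.70, 0.80, 0.90]
print(median_rule_should_prune(0.75, previous))  # True: 0.75 is below the median 0.80
print(median_rule_should_prune(0.85, previous))  # False
```

When minimizing, the comparison flips: a trial is considered worse when its best value is above the median.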

class optuna.pruners.NopPruner
Pruner which never prunes trials.
Example
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

import optuna

X, y = load_iris(return_X_y=True)
X_train, X_valid, y_train, y_valid = train_test_split(X, y)
classes = np.unique(y)


def objective(trial):
    alpha = trial.suggest_uniform('alpha', 0.0, 1.0)
    clf = SGDClassifier(alpha=alpha)
    n_train_iter = 100

    for step in range(n_train_iter):
        clf.partial_fit(X_train, y_train, classes=classes)

        intermediate_value = clf.score(X_valid, y_valid)
        trial.report(intermediate_value, step)

        if trial.should_prune():
            assert False, "should_prune() should always return False with this pruner."
            raise optuna.TrialPruned()

    return clf.score(X_valid, y_valid)


study = optuna.create_study(direction='maximize', pruner=optuna.pruners.NopPruner())
study.optimize(objective, n_trials=20)

class optuna.pruners.PercentilePruner(percentile, n_startup_trials=5, n_warmup_steps=0, interval_steps=1)
Pruner to keep the specified percentile of the trials.
Prune if the best intermediate value is in the bottom percentile among trials at the same step.
Example
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

import optuna

X, y = load_iris(return_X_y=True)
X_train, X_valid, y_train, y_valid = train_test_split(X, y)
classes = np.unique(y)


def objective(trial):
    alpha = trial.suggest_uniform('alpha', 0.0, 1.0)
    clf = SGDClassifier(alpha=alpha)
    n_train_iter = 100

    for step in range(n_train_iter):
        clf.partial_fit(X_train, y_train, classes=classes)

        intermediate_value = clf.score(X_valid, y_valid)
        trial.report(intermediate_value, step)

        if trial.should_prune():
            raise optuna.TrialPruned()

    return clf.score(X_valid, y_valid)


study = optuna.create_study(
    direction='maximize',
    pruner=optuna.pruners.PercentilePruner(
        25.0, n_startup_trials=5, n_warmup_steps=30, interval_steps=10
    ),
)
study.optimize(objective, n_trials=20)
 Parameters
percentile – Percentile, which must be between 0 and 100 inclusive (e.g., when given 25.0, the trials in the top 25th percentile are kept).
n_startup_trials – Pruning is disabled until the given number of trials finish in the same study.
n_warmup_steps – Pruning is disabled until the trial exceeds the given number of steps.
interval_steps – Interval in number of steps between the pruning checks, offset by the warmup steps. If no value has been reported at the time of a pruning check, that particular check will be postponed until a value is reported. Value must be at least 1.
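The percentile rule can be illustrated with a small standard-library sketch. It is not Optuna's code: the helper names are hypothetical, the linear-interpolation percentile is an assumption about how the cutoff is computed, and startup, warmup, and interval handling are omitted.

```python
def percentile(values, q):
    """Linear-interpolation percentile of `values` for q in [0, 100]."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def percentile_rule_should_prune(best_so_far, others_at_step, keep_percentile,
                                 direction="maximize"):
    """Return True if `best_so_far` falls outside the top `keep_percentile`.

    Hypothetical helper: `others_at_step` holds the values other trials
    reported at the same step.
    """
    if not others_at_step:
        return False
    if direction == "maximize":
        # Keeping the top 25% means the cutoff sits at the 75th percentile.
        cutoff = percentile(others_at_step, 100.0 - keep_percentile)
        return best_so_far < cutoff
    cutoff = percentile(others_at_step, keep_percentile)
    return best_so_far > cutoff


others = [0.5, 0.6, 0.7, 0.8, 0.9]
print(percentile_rule_should_prune(0.75, others, 25.0))  # True: cutoff is 0.8
print(percentile_rule_should_prune(0.75, others, 50.0))  # False: cutoff is 0.7
```

With `keep_percentile=50.0` the cutoff is the median, so this sketch behaves like the median stopping rule.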

class optuna.pruners.SuccessiveHalvingPruner(min_resource='auto', reduction_factor=4, min_early_stopping_rate=0)
Pruner using Asynchronous Successive Halving Algorithm.
Successive Halving is a bandit-based algorithm to identify the best one among multiple configurations. This class implements an asynchronous version of Successive Halving. Please refer to the paper of Asynchronous Successive Halving for detailed descriptions.
Note that this class does not take care of the parameter for the maximum resource, referred to as \(R\) in the paper. The maximum resource allocated to a trial is typically limited inside the objective function (e.g., step number in simple.py, EPOCH number in chainer_integration.py).
Example
We maximize an objective function (validation accuracy) with SuccessiveHalvingPruner.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

import optuna

X, y = load_iris(return_X_y=True)
X_train, X_valid, y_train, y_valid = train_test_split(X, y)
classes = np.unique(y)


def objective(trial):
    alpha = trial.suggest_uniform('alpha', 0.0, 1.0)
    clf = SGDClassifier(alpha=alpha)
    n_train_iter = 100

    for step in range(n_train_iter):
        clf.partial_fit(X_train, y_train, classes=classes)

        intermediate_value = clf.score(X_valid, y_valid)
        trial.report(intermediate_value, step)

        if trial.should_prune():
            raise optuna.TrialPruned()

    return clf.score(X_valid, y_valid)


study = optuna.create_study(
    direction='maximize',
    pruner=optuna.pruners.SuccessiveHalvingPruner(),
)
study.optimize(objective, n_trials=20)
 Parameters
min_resource –
A parameter for specifying the minimum resource allocated to a trial (in the paper this parameter is referred to as \(r\)). This parameter defaults to ‘auto’ where the value is determined based on a heuristic that looks at the number of required steps for the first trial to complete.
A trial is never pruned until it executes \(\mathsf{min}\_\mathsf{resource} \times \mathsf{reduction}\_\mathsf{factor}^{ \mathsf{min}\_\mathsf{early}\_\mathsf{stopping}\_\mathsf{rate}}\) steps (i.e., the completion point of the first rung). When the trial completes the first rung, it will be promoted to the next rung only if its value is placed in the top \({1 \over \mathsf{reduction}\_\mathsf{factor}}\) fraction of all trials that have already reached that point (otherwise it will be pruned there). If the trial wins the competition, it runs until the next completion point (i.e., \(\mathsf{min}\_\mathsf{resource} \times \mathsf{reduction}\_\mathsf{factor}^{ (\mathsf{min}\_\mathsf{early}\_\mathsf{stopping}\_\mathsf{rate} + \mathsf{rung})}\) steps) and repeats the same procedure.
Note
If the step of the last intermediate value may change with each trial, please manually set min_resource to the minimum possible step.
reduction_factor –
A parameter for specifying reduction factor of promotable trials (in the paper this parameter is referred to as \(\eta\)). At the completion point of each rung, about \({1 \over \mathsf{reduction}\_\mathsf{factor}}\) of the trials will be promoted.
min_early_stopping_rate –
A parameter for specifying the minimum early-stopping rate (in the paper this parameter is referred to as \(s\)).
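The rung completion points and the promotion rule described for these parameters can be worked through in a short sketch. The function names are hypothetical, and letting at least one trial be promoted when a rung holds fewer than reduction_factor trials is an assumption made for this illustration, not a statement about the library's behavior.

```python
def rung_completion_steps(min_resource, reduction_factor,
                          min_early_stopping_rate, n_rungs):
    """Steps at which rungs 0..n_rungs-1 complete, following the formula
    min_resource * reduction_factor ** (min_early_stopping_rate + rung)."""
    return [
        min_resource * reduction_factor ** (min_early_stopping_rate + rung)
        for rung in range(n_rungs)
    ]


def is_promoted(value, values_at_rung, reduction_factor, direction="maximize"):
    """True if `value` ranks in the top 1/reduction_factor of `values_at_rung`.

    `values_at_rung` is every value recorded at this rung, including `value`.
    Assumption: at least one trial is always promotable.
    """
    ordered = sorted(values_at_rung, reverse=(direction == "maximize"))
    n_promotable = max(1, len(ordered) // reduction_factor)
    cutoff = ordered[n_promotable - 1]
    if direction == "maximize":
        return value >= cutoff
    return value <= cutoff


print(rung_completion_steps(1, 4, 0, 4))  # [1, 4, 16, 64]

scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
print(is_promoted(0.9, scores, 4))  # True: within the top 2 of 8
print(is_promoted(0.7, scores, 4))  # False
```

With the defaults shown in the signature (reduction_factor=4, min_early_stopping_rate=0) and min_resource=1, a trial is first eligible for pruning at step 1 and then competes again at steps 4, 16, and 64.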

class optuna.pruners.HyperbandPruner(min_resource: int = 1, max_resource: Union[str, int] = 'auto', reduction_factor: int = 3, n_brackets: Optional[int] = None, min_early_stopping_rate_low: Optional[int] = None)
Pruner using Hyperband.
SuccessiveHalving (SHA) requires the number of configurations \(n\) as its hyperparameter. For a given finite budget \(B\), all the configurations have the resources of \(B \over n\) on average. As a result, there is a trade-off between \(n\) and \(B \over n\). Hyperband attacks this trade-off by trying different \(n\) values for a fixed budget.
Note
In the Hyperband paper, the counterpart of RandomSampler is used. Optuna uses TPESampler by default. The benchmark result shows that optuna.pruners.HyperbandPruner supports both samplers.
Note
If you use HyperbandPruner with TPESampler, consider setting a larger n_trials or timeout to make full use of the characteristics of TPESampler, because TPESampler uses some (by default, \(10\)) Trials for its startup.
As Hyperband runs multiple SuccessiveHalvingPruners and collects trials based on the current Trial’s bracket ID, each bracket needs to observe more than \(10\) Trials for TPESampler to adapt its search space.
Thus, for example, if HyperbandPruner has \(4\) pruners in it, at least \(4 \times 10\) trials are consumed for startup.
Note
Hyperband has several SuccessiveHalvingPruners. Each SuccessiveHalvingPruner is referred to as a “bracket” in the original paper. The number of brackets is an important factor to control the early stopping behavior of Hyperband and is automatically determined by min_resource, max_resource and reduction_factor as follows: number of brackets = floor(log_{reduction_factor}(max_resource / min_resource)) + 1. Please set reduction_factor so that the number of brackets is not too large (about 4 ~ 6 in most use cases). Please see Section 3.6 of the original paper for details.
Example
We maximize an objective function (validation accuracy) with the Hyperband pruning algorithm.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

import optuna

X, y = load_iris(return_X_y=True)
# The split is named *_valid so that it matches the names used in objective().
X_train, X_valid, y_train, y_valid = train_test_split(X, y)
classes = np.unique(y)
n_train_iter = 100


def objective(trial):
    alpha = trial.suggest_uniform('alpha', 0.0, 1.0)
    clf = SGDClassifier(alpha=alpha)

    for step in range(n_train_iter):
        clf.partial_fit(X_train, y_train, classes=classes)

        intermediate_value = clf.score(X_valid, y_valid)
        trial.report(intermediate_value, step)

        if trial.should_prune():
            raise optuna.TrialPruned()

    return clf.score(X_valid, y_valid)


study = optuna.create_study(
    direction='maximize',
    pruner=optuna.pruners.HyperbandPruner(
        min_resource=1,
        max_resource=n_train_iter,
        reduction_factor=3
    )
)
study.optimize(objective, n_trials=20)
 Parameters
min_resource – A parameter for specifying the minimum resource allocated to a trial, noted as \(r\) in the paper. A smaller \(r\) will give a result faster, but a larger \(r\) will give a better guarantee of successful judging between configurations. See the details for SuccessiveHalvingPruner.
max_resource –
A parameter for specifying the maximum resource allocated to a trial. \(R\) in the paper corresponds to max_resource / min_resource. This value represents and should match the maximum iteration steps (e.g., the number of epochs for neural networks). When this argument is “auto”, the maximum resource is estimated according to the completed trials. The default value of this argument is “auto”.
Note
With “auto”, the maximum resource will be the largest step reported by report() in the first completed trial (or one of the first, if trials run in parallel). No trials will be pruned until the maximum resource is determined.
Note
If the step of the last intermediate value may change with each trial, please manually set max_resource to the maximum possible step.
reduction_factor – A parameter for specifying reduction factor of promotable trials, noted as \(\eta\) in the paper. See the details for SuccessiveHalvingPruner.
n_brackets –
Deprecated since version 1.4.0: This argument will be removed from HyperbandPruner. The number of brackets is automatically determined based on min_resource, max_resource and reduction_factor.
The number of SuccessiveHalvingPruners (brackets). Defaults to \(4\).
min_early_stopping_rate_low –
Deprecated since version 1.4.0: This argument will be removed from HyperbandPruner.
A parameter for specifying the minimum early-stopping rate. This parameter is related to a parameter that is referred to as \(s\) and used in the Asynchronous SuccessiveHalving paper. The minimum early stopping rate for the \(i\)-th bracket is \(i + s\).
Note
Added in v1.1.0 as an experimental feature. The interface may change in newer versions without prior notice. See https://github.com/optuna/optuna/releases/tag/v1.1.0.
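The bracket-count formula quoted in the notes for this class, floor(log_{reduction_factor}(max_resource / min_resource)) + 1, can be evaluated with a small helper. The function `n_brackets` is a hypothetical name for this sketch; it counts powers with integer arithmetic to avoid floating-point rounding in the logarithm, which is a design choice made here and not necessarily how the library computes it.

```python
def n_brackets(min_resource, max_resource, reduction_factor):
    """Number of brackets: floor(log_rf(max_resource / min_resource)) + 1.

    Counts how many k >= 0 satisfy min_resource * reduction_factor**k <= max_resource,
    which equals the formula above for integer inputs.
    """
    if min_resource < 1 or max_resource < min_resource or reduction_factor < 2:
        raise ValueError("expected 1 <= min_resource <= max_resource and reduction_factor >= 2")
    count = 0
    while min_resource * reduction_factor ** count <= max_resource:
        count += 1
    return count


# Settings used in the example above: 100 training iterations, reduction_factor=3.
print(n_brackets(1, 100, 3))  # 5
print(n_brackets(1, 100, 2))  # 7
```

With min_resource=1 and max_resource=100, a reduction_factor of 3 gives 5 brackets, inside the suggested range of about 4 to 6, while a reduction_factor of 2 gives 7.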

class optuna.pruners.ThresholdPruner(lower: Optional[float] = None, upper: Optional[float] = None, n_warmup_steps: int = 0, interval_steps: int = 1)
Pruner to detect outlying metrics of the trials.
Prune if a metric exceeds the upper threshold, falls below the lower threshold, or becomes nan.
Example
from optuna import create_study
from optuna.pruners import ThresholdPruner
from optuna import TrialPruned


def objective_for_upper(trial):
    for step, y in enumerate(ys_for_upper):
        trial.report(y, step)

        if trial.should_prune():
            raise TrialPruned()
    # Return the last reported value.
    return ys_for_upper[-1]


def objective_for_lower(trial):
    for step, y in enumerate(ys_for_lower):
        trial.report(y, step)

        if trial.should_prune():
            raise TrialPruned()
    # Return the last reported value.
    return ys_for_lower[-1]


ys_for_upper = [0.0, 0.1, 0.2, 0.5, 1.2]
# The final value is negative so that it falls below lower=0.0 and triggers pruning.
ys_for_lower = [100.0, 90.0, 0.1, 0.0, -1]
n_trial_step = 5

study = create_study(pruner=ThresholdPruner(upper=1.0))
study.optimize(objective_for_upper, n_trials=10)

study = create_study(pruner=ThresholdPruner(lower=0.0))
study.optimize(objective_for_lower, n_trials=10)
Parameters
lower – A minimum value which determines whether the pruner prunes or not. If an intermediate value is smaller than lower, the trial is pruned.
upper – A maximum value which determines whether the pruner prunes or not. If an intermediate value is larger than upper, the trial is pruned.
n_warmup_steps – Pruning is disabled until the trial exceeds the given number of steps.
interval_steps – Interval in number of steps between the pruning checks, offset by the warmup steps. If no value has been reported at the time of a pruning check, that particular check will be postponed until a value is reported. Value must be at least 1.
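How the threshold check interacts with n_warmup_steps and interval_steps can be sketched as follows. Both helper names are hypothetical, treating step n_warmup_steps itself as the first check is an assumption, and the postponement of a check until a value is reported is not modeled.

```python
import math


def is_check_step(step, n_warmup_steps=0, interval_steps=1):
    """True if a pruning check is scheduled at `step` (simplified schedule)."""
    if step < n_warmup_steps:
        return False
    # Checks recur every `interval_steps`, offset by the warmup steps.
    return (step - n_warmup_steps) % interval_steps == 0


def threshold_should_prune(value, step, lower=None, upper=None,
                           n_warmup_steps=0, interval_steps=1):
    """True if `value` is nan or outside [lower, upper] at a scheduled check."""
    if not is_check_step(step, n_warmup_steps, interval_steps):
        return False
    if math.isnan(value):
        return True
    if lower is not None and value < lower:
        return True
    if upper is not None and value > upper:
        return True
    return False


ys_for_upper = [0.0, 0.1, 0.2, 0.5, 1.2]
decisions = [threshold_should_prune(y, step, upper=1.0)
             for step, y in enumerate(ys_for_upper)]
print(decisions)  # [False, False, False, False, True]
```

Under this schedule, n_warmup_steps=30 with interval_steps=10 would place checks at steps 30, 40, 50, and so on.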