optuna.samplers.TPESampler

class optuna.samplers.TPESampler(consider_prior=True, prior_weight=1.0, consider_magic_clip=True, consider_endpoints=False, n_startup_trials=10, n_ei_candidates=24, gamma=<function default_gamma>, weights=<function default_weights>, seed=None, *, multivariate=False, group=False, warn_independent_sampling=True)

Sampler using TPE (Tree-structured Parzen Estimator) algorithm.

This sampler is based on independent sampling. See also BaseSampler for more details of ‘independent sampling’.

On each trial, for each parameter, TPE fits one Gaussian Mixture Model (GMM) l(x) to the set of parameter values associated with the best objective values, and another GMM g(x) to the remaining parameter values. It chooses the parameter value x that maximizes the ratio l(x)/g(x).
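The selection rule above can be illustrated with a minimal, standard-library sketch for a single float parameter. It is not Optuna's implementation: the split rule `gamma`, the fixed kernel bandwidth, the equal component weights, and the helper names `parzen_density` and `suggest` are all assumptions made for brevity.

```python
import math
import random


def gamma(n_finished):
    # Hypothetical split rule: treat the best 10% of trials (at least one) as "good".
    return max(1, math.ceil(n_finished / 10))


def parzen_density(x, centers, bandwidth):
    # Equal-weight mixture of Gaussians centred on the observed values.
    total = 0.0
    for c in centers:
        z = (x - c) / bandwidth
        total += math.exp(-0.5 * z * z) / (bandwidth * math.sqrt(2.0 * math.pi))
    return total / len(centers)


def suggest(observations, low, high, n_candidates=24, rng=random):
    # observations: list of (parameter value, objective value); lower is better.
    if len(observations) < 2:
        # Too little history to build both densities: fall back to random sampling.
        return rng.uniform(low, high)

    ranked = sorted(observations, key=lambda pair: pair[1])
    n_good = gamma(len(ranked))
    good = [x for x, _ in ranked[:n_good]]   # values behind l(x)
    rest = [x for x, _ in ranked[n_good:]]   # values behind g(x)
    bandwidth = (high - low) / 10.0          # fixed bandwidth, an assumption

    # Draw candidates from l(x): pick a component, sample from it, clip to the domain.
    candidates = []
    for _ in range(n_candidates):
        x = rng.gauss(rng.choice(good), bandwidth)
        candidates.append(min(max(x, low), high))

    def ratio(x):
        return parzen_density(x, good, bandwidth) / parzen_density(x, rest, bandwidth)

    # Keep the candidate with the largest l(x) / g(x).
    return max(candidates, key=ratio)


rng = random.Random(0)
history = [(x, x ** 2) for x in (rng.uniform(-10, 10) for _ in range(30))]
next_x = suggest(history, -10.0, 10.0, rng=rng)
print(next_x)
```

In this toy version, `n_candidates` plays the role that n_ei_candidates plays in the sampler, and the fallback branch loosely mirrors the random sampling used during the startup trials.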

For further information about the TPE algorithm, please refer to the following papers:

Example

import optuna
from optuna.samplers import TPESampler


def objective(trial):
    x = trial.suggest_float("x", -10, 10)
    return x ** 2


study = optuna.create_study(sampler=TPESampler())
study.optimize(objective, n_trials=10)

Parameters
  • consider_prior – Enhance the stability of the Parzen estimator by imposing a Gaussian prior when True. The prior is only effective if the sampling distribution is either UniformDistribution, DiscreteUniformDistribution, LogUniformDistribution, IntUniformDistribution, or IntLogUniformDistribution.

  • prior_weight – The weight of the prior. This argument is used in UniformDistribution, DiscreteUniformDistribution, LogUniformDistribution, IntUniformDistribution, IntLogUniformDistribution, and CategoricalDistribution.

  • consider_magic_clip – Enable a heuristic to limit the smallest variances of Gaussians used in the Parzen estimator.

  • consider_endpoints – Take endpoints of domains into account when calculating variances of Gaussians in the Parzen estimator. See the original paper for details on the heuristics to calculate the variances.

  • n_startup_trials – Random sampling is used instead of the TPE algorithm until the given number of trials have finished in the same study.

  • n_ei_candidates – Number of candidate samples used to calculate the expected improvement.

  • gamma – A function that takes the number of finished trials and returns the number of trials used to form the density function l(x) for the samples with the best objective values. See the original paper for more details.

  • weights

    A function that takes the number of finished trials and returns a weight for them. See Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures for more details.

  • seed – Seed for random number generator.

  • multivariate

    If this is True, the multivariate TPE is used when suggesting parameters. The multivariate TPE is reported to outperform the independent TPE. See BOHB: Robust and Efficient Hyperparameter Optimization at Scale for more details.

    Note

    Added in v2.2.0 as an experimental feature. The interface may change in newer versions without prior notice. See https://github.com/optuna/optuna/releases/tag/v2.2.0.

  • group

    If both this and multivariate are True, the multivariate TPE with a group-decomposed search space is used when suggesting parameters. The sampling algorithm decomposes the search space based on past trials and samples from the joint distribution in each decomposed subspace. The decomposed subspaces form a partition of the whole search space. Each subspace is a maximal subset of the whole search space that satisfies the following condition: for every completed trial, the intersection of the subspace and the trial's search space is either the subspace itself or the empty set.

    The search space is decomposed based on the following recursive rules.

    • Initialize the group of the search space with the empty set. Each element of the group is a subset of the search space, represented as a dictionary of BaseDistribution objects.

    • Update the group by applying the following procedure ADD(trial) to each past trial.

    The procedure ADD(trial) is as follows:

    • Let T = trial.distributions.

    • If T is disjoint from every element of the group, add T to the group.

    • If an element S of the group is contained in T, then add T-S to the group. We recursively add T-S to the group because the intersection of T-S and some other elements of the group may not be empty.

    • If an element S of the group contains T, remove S from the group and add T and S-T to the group.

    • If the intersection of an element S of the group and T is not empty, remove S from the group and add S∩T, S-T, and T-S to the group. We recursively add T-S to the group because the intersection of T-S and some other elements of the group may not be empty.

    The elements of the group constructed recursively by the above rules are disjoint, and their union is the entire search space. We sample from the joint distribution for each element of this decomposed group.

    Sampling from the joint distribution on the subspace is realized by multivariate TPE.

    Note

    Added in v2.8.0 as an experimental feature. The interface may change in newer versions without prior notice. See https://github.com/optuna/optuna/releases/tag/v2.8.0.

    Example:

    import optuna
    
    
    def objective(trial):
        x = trial.suggest_categorical("x", ["A", "B"])
        if x == "A":
            return trial.suggest_float("y", -10, 10)
        else:
            return trial.suggest_int("z", -10, 10)
    
    
    sampler = optuna.samplers.TPESampler(multivariate=True, group=True)
    study = optuna.create_study(sampler=sampler)
    study.optimize(objective, n_trials=10)
    

  • warn_independent_sampling – If this is True and multivariate=True, a warning message is emitted when the value of a parameter is sampled by using an independent sampler. If multivariate=False, this flag has no effect.
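The ADD(trial) procedure described for the group parameter can be sketched with plain Python sets. This is an illustration only, not the library's code: each subspace is represented as a set of parameter names rather than a dictionary of BaseDistribution, and the function name `add` is hypothetical. The recursive rules collapse into one pass because the elements of the group are always disjoint.

```python
def add(group, trial_params):
    # group: list of disjoint sets of parameter names.
    # trial_params: the parameter names used by one past trial (the set T).
    remaining = set(trial_params)
    updated = []
    for subspace in group:
        overlap = subspace & remaining   # S ∩ T
        outside = subspace - remaining   # S - T
        # Replace S by its non-empty parts; a subspace disjoint from T is kept whole.
        updated.extend(part for part in (overlap, outside) if part)
        # Carry T - S forward so it is checked against the other elements.
        remaining -= subspace
    if remaining:
        # Parameters that no existing element covers form a new subspace.
        updated.append(remaining)
    return updated


group = []
for trial_params in [{"x", "y"}, {"x", "z"}, {"x", "y"}]:
    group = add(group, trial_params)

print([sorted(s) for s in group])  # [['x'], ['y'], ['z']]
```

The three trials correspond to the conditional objective in the example above: "x" is always suggested, while "y" and "z" appear only in one branch each, so each ends up in its own subspace.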

Raises

ValueError – If multivariate is False and group is True.

Methods

  • after_trial(study, trial, state, values) – Trial post-processing.

  • hyperopt_parameters() – Return the default parameters of hyperopt (v0.1.2).

  • infer_relative_search_space(study, trial) – Infer the search space that will be used by relative sampling in the target trial.

  • reseed_rng() – Reseed the sampler’s random number generator.

  • sample_independent(study, trial, param_name, …) – Sample a parameter for a given distribution.

  • sample_relative(study, trial, search_space) – Sample parameters in a given search space.

after_trial(study, trial, state, values)

Trial post-processing.

This method is called after the objective function returns and right before the trial is finished and its state is stored.

Note

Added in v2.4.0 as an experimental feature. The interface may change in newer versions without prior notice. See https://github.com/optuna/optuna/releases/tag/v2.4.0.

Parameters
  • study (optuna.study.Study) – Target study object.

  • trial (optuna.trial._frozen.FrozenTrial) – Target trial object. Take a copy before modifying this object.

  • state (optuna.trial._state.TrialState) – Resulting trial state.

  • values (Optional[Sequence[float]]) – Resulting trial values. Guaranteed to not be None if the trial succeeded.

Return type

None

static hyperopt_parameters()

Return the default parameters of hyperopt (v0.1.2).

TPESampler can be instantiated with the parameters returned by this method.

Example

Create a TPESampler instance with the default parameters of hyperopt.

import optuna
from optuna.samplers import TPESampler


def objective(trial):
    x = trial.suggest_float("x", -10, 10)
    return x ** 2


sampler = TPESampler(**TPESampler.hyperopt_parameters())
study = optuna.create_study(sampler=sampler)
study.optimize(objective, n_trials=10)

Returns

A dictionary containing the default parameters of hyperopt.

Return type

Dict[str, Any]

infer_relative_search_space(study, trial)

Infer the search space that will be used by relative sampling in the target trial.

This method is called right before the sample_relative() method, and the search space returned by this method is passed to it. The parameters not contained in the search space will be sampled by using the sample_independent() method.

Parameters
  • study (optuna.study.Study) – Target study object.

  • trial (optuna.trial._frozen.FrozenTrial) – Target trial object. Take a copy before modifying this object.

Returns

A dictionary mapping parameter names to their distributions.

Return type

Dict[str, optuna.distributions.BaseDistribution]

See also

Please refer to intersection_search_space() as an implementation of infer_relative_search_space().
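The idea behind an intersection search space can be sketched without the library: keep only the parameters that appear, with an identical distribution, in every past trial. In this illustration plain tuples stand in for distribution objects, and the function below is a simplified stand-in that shares only its name with the library function.

```python
def intersection_search_space(trial_spaces):
    # trial_spaces: one dict per past trial, mapping name -> distribution stand-in.
    common = None
    for space in trial_spaces:
        if common is None:
            common = dict(space)
            continue
        # Drop parameters that are missing or defined differently in this trial.
        common = {name: dist for name, dist in common.items() if space.get(name) == dist}
    return common or {}


past_trials = [
    {"x": ("categorical", ("A", "B")), "y": ("float", -10, 10)},
    {"x": ("categorical", ("A", "B")), "z": ("int", -10, 10)},
]
relative_space = intersection_search_space(past_trials)
print(relative_space)  # {'x': ('categorical', ('A', 'B'))}
```

Under this sketch, "y" and "z" fall outside the returned search space, so they would be left to sample_independent().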

reseed_rng()

Reseed the sampler’s random number generator.

This method is called by the Study instance if trials are executed in parallel with the option n_jobs>1. In that case, the sampler instance is replicated together with the state of its random number generator, so the replicas may suggest the same values. To prevent this, this method assigns a different seed to each random number generator.

Return type

None
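The problem that reseeding addresses can be reproduced with the standard library alone. In the sketch below, `ToySampler` is a hypothetical stand-in for a sampler that owns a random number generator; copying it duplicates the generator state, and reseeding the copy makes the two diverge.

```python
import copy
import random


class ToySampler:
    """Hypothetical sampler that owns its own random number generator."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)

    def reseed_rng(self):
        # Seed from a fresh entropy source so that copies stop sharing a state.
        self._rng.seed()

    def sample(self):
        return self._rng.uniform(-10, 10)


original = ToySampler(seed=42)
replica = copy.deepcopy(original)  # the generator state is copied as well

# Both objects produce the same first value.
same_before = original.sample() == replica.sample()

replica.reseed_rng()
# After reseeding, the next values differ (barring an astronomically unlikely collision).
differ_after = original.sample() != replica.sample()

print(same_before, differ_after)
```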

sample_independent(study, trial, param_name, param_distribution)

Sample a parameter for a given distribution.

This method is called only for the parameters not contained in the search space returned by the sample_relative() method. This method is suitable for sampling algorithms that do not use relationships between parameters, such as random sampling and TPE.

Note

Failed trials are ignored by all built-in samplers when they sample new parameters. Thus, failed trials are regarded as deleted from the samplers’ perspective.

Parameters
  • study (optuna.study.Study) – Target study object.

  • trial (optuna.trial._frozen.FrozenTrial) – Target trial object. Take a copy before modifying this object.

  • param_name (str) – Name of the sampled parameter.

  • param_distribution (optuna.distributions.BaseDistribution) – Distribution object that specifies a prior and/or scale of the sampling algorithm.

Returns

A parameter value.

Return type

Any

sample_relative(study, trial, search_space)

Sample parameters in a given search space.

This method is called once at the beginning of each trial, i.e., right before the evaluation of the objective function. This method is suitable for sampling algorithms that use relationships between parameters, such as Gaussian Process and CMA-ES.

Note

Failed trials are ignored by all built-in samplers when they sample new parameters. Thus, failed trials are regarded as deleted from the samplers’ perspective.

Parameters
  • study (optuna.study.Study) – Target study object.

  • trial (optuna.trial._frozen.FrozenTrial) – Target trial object. Take a copy before modifying this object.

  • search_space (Dict[str, optuna.distributions.BaseDistribution]) – The search space returned by infer_relative_search_space().

Returns

A dictionary containing the parameter names and the values.

Return type

Dict[str, Any]
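How the three sampling hooks fit together within one trial can be sketched with a toy sampler. Nothing below is Optuna code: `ToySampler` and `run_trial` are hypothetical, the method signatures are simplified (no study or trial arguments), and (low, high) tuples stand in for distribution objects.

```python
import random


class ToySampler:
    """Hypothetical sampler exposing simplified versions of the three hooks."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self.calls = []

    def infer_relative_search_space(self, known_space):
        self.calls.append("infer_relative_search_space")
        return dict(known_space)

    def sample_relative(self, search_space):
        self.calls.append("sample_relative")
        return {name: self._rng.uniform(low, high) for name, (low, high) in search_space.items()}

    def sample_independent(self, param_name, distribution):
        self.calls.append(f"sample_independent({param_name})")
        low, high = distribution
        return self._rng.uniform(low, high)


def run_trial(sampler, known_space, requested):
    # Once per trial, before the objective runs: infer the space, then sample from it.
    search_space = sampler.infer_relative_search_space(known_space)
    relative_params = sampler.sample_relative(search_space)

    params = {}
    for name, distribution in requested.items():
        if name in relative_params:
            params[name] = relative_params[name]
        else:
            # Parameters outside the relative search space are sampled one by one.
            params[name] = sampler.sample_independent(name, distribution)
    return params


sampler = ToySampler(seed=0)
params = run_trial(sampler, {"x": (-10, 10)}, {"x": (-10, 10), "y": (0, 1)})
print(sampler.calls)
# ['infer_relative_search_space', 'sample_relative', 'sample_independent(y)']
```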