optuna.integration.OptunaSearchCV
- class optuna.integration.OptunaSearchCV(estimator, param_distributions, cv=5, enable_pruning=False, error_score=nan, max_iter=1000, n_jobs=1, n_trials=10, random_state=None, refit=True, return_train_score=False, scoring=None, study=None, subsample=1.0, timeout=None, verbose=0, callbacks=None)
Hyperparameter search with cross-validation.
- Parameters
estimator (BaseEstimator) – Object to use to fit the data. This is assumed to implement the scikit-learn estimator interface. Either this needs to provide score, or scoring must be passed.
param_distributions (Mapping[str, BaseDistribution]) – Dictionary where keys are parameters and values are distributions. Distributions are assumed to implement the optuna distribution interface.
cv (Optional[Union[BaseCrossValidator, int]]) –
Cross-validation strategy. Possible inputs for cv are:
integer to specify the number of folds in a CV splitter,
a CV splitter,
an iterable yielding (train, validation) splits as arrays of indices.
For an integer, if estimator is a classifier and y is either binary or multiclass, sklearn.model_selection.StratifiedKFold is used. Otherwise, sklearn.model_selection.KFold is used.
enable_pruning (bool) – If True, pruning is performed in the case where the underlying estimator supports partial_fit.
error_score (Union[Number, float, str]) – Value to assign to the score if an error occurs in fitting. If ‘raise’, the error is raised. If numeric, sklearn.exceptions.FitFailedWarning is raised. This does not affect the refit step, which will always raise the error.
max_iter (int) – Maximum number of epochs. This is only used if the underlying estimator supports partial_fit.
n_jobs (int) –
Number of threading-based parallel jobs. -1 means the number is set to the CPU count.
Note
n_jobs allows parallelization using threading and may suffer from Python’s GIL. It is recommended to use process-based parallelization if func is CPU bound.
n_trials (int) – Number of trials. If None, there is no limitation on the number of trials. If timeout is also set to None, the study continues to create trials until it receives a termination signal such as Ctrl+C or SIGTERM. This trades off runtime vs quality of the solution.
random_state (Optional[Union[int, RandomState]]) – Seed of the pseudo random number generator. If int, this is the seed used by the random number generator. If numpy.random.RandomState object, this is the random number generator. If None, the global random state from numpy.random is used.
refit (bool) – If True, refit the estimator with the best found hyperparameters. The refitted estimator is made available at the best_estimator_ attribute and permits using predict directly.
return_train_score (bool) – If True, training scores will be included. Computing training scores is used to get insights on how different hyperparameter settings impact the overfitting/underfitting trade-off. However, computing training scores can be computationally expensive and is not strictly required to select the hyperparameters that yield the best generalization performance.
scoring (Optional[Union[Callable[[...], float], str]]) – String or callable to evaluate the predictions on the validation data. If None, score on the estimator is used.
study (Optional[Study]) – Study corresponds to the optimization task. If None, a new study is created.
subsample (Union[float, int]) –
Proportion of samples that are used during hyperparameter search.
If int, then draw subsample samples.
If float, then draw subsample * X.shape[0] samples.
timeout (Optional[float]) – Time limit in seconds for the search of appropriate models. If None, the study is executed without time limitation. If n_trials is also set to None, the study continues to create trials until it receives a termination signal such as Ctrl+C or SIGTERM. This trades off runtime vs quality of the solution.
verbose (int) – Verbosity level. The higher, the more messages.
callbacks (Optional[List[Callable[[Study, FrozenTrial], None]]]) –
List of callback functions that are invoked at the end of each trial. Each function must accept two parameters with the following types in this order: Study and FrozenTrial.
See also
See the tutorial of Callback for Study.optimize for how to use and implement callback functions.
- best_estimator_
Estimator that was chosen by the search. This is present only if refit is set to True.
- n_splits_
Number of cross-validation splits.
- sample_indices_
Indices of samples that are used during hyperparameter search.
- scorer_
Scorer function.
- study_
Actual study.
Examples
    import optuna
    from sklearn.datasets import load_iris
    from sklearn.svm import SVC

    clf = SVC(gamma="auto")
    param_distributions = {
        "C": optuna.distributions.FloatDistribution(1e-10, 1e10, log=True)
    }
    optuna_search = optuna.integration.OptunaSearchCV(clf, param_distributions)
    X, y = load_iris(return_X_y=True)
    optuna_search.fit(X, y)
    y_pred = optuna_search.predict(X)
Note
By following the scikit-learn convention for scorers, the direction of optimization is maximize. See https://scikit-learn.org/stable/modules/model_evaluation.html. For a minimization problem, multiply the score by -1.
Note
Added in v0.17.0 as an experimental feature. The interface may change in newer versions without prior notice. See https://github.com/optuna/optuna/releases/tag/v0.17.0.
Methods
fit(X[, y, groups]) – Run fit with all sets of parameters.
get_params([deep]) – Get parameters for this estimator.
score(X[, y]) – Return the score on the given data.
set_params(**params) – Set the parameters of this estimator.
Attributes
best_index_ – Trial number which corresponds to the best candidate parameter setting.
best_params_ – Parameters of the best trial in the Study.
best_score_ – Mean cross-validated score of the best estimator.
best_trial_ – Best trial in the Study.
classes_ – Class labels.
decision_function – Call decision_function on the best estimator.
inverse_transform – Call inverse_transform on the best estimator.
n_trials_ – Actual number of trials.
predict – Call predict on the best estimator.
predict_log_proba – Call predict_log_proba on the best estimator.
predict_proba – Call predict_proba on the best estimator.
score_samples – Call score_samples on the best estimator.
set_user_attr – Call set_user_attr on the Study.
transform – Call transform on the best estimator.
trials_ – All trials in the Study.
trials_dataframe – Call trials_dataframe on the Study.
user_attrs_ – User attributes in the Study.
- property best_index_: int
Trial number which corresponds to the best candidate parameter setting.
The returned value is equivalent to optuna_search.best_trial_.number.
- property best_trial_: FrozenTrial
Best trial in the Study.
- property decision_function: Callable[[...], Union[List[float], ndarray, Series, List[List[float]], DataFrame, spmatrix]]
Call decision_function on the best estimator.
This is available only if the underlying estimator supports decision_function and refit is set to True.
- fit(X, y=None, groups=None, **fit_params)
Run fit with all sets of parameters.
- Parameters
X (Union[List[List[float]], ndarray, DataFrame, spmatrix]) – Training data.
y (Optional[Union[List[float], ndarray, Series, List[List[float]], DataFrame, spmatrix]]) – Target variable.
groups (Optional[Union[List[float], ndarray, Series]]) – Group labels for the samples used while splitting the dataset into train/validation set.
**fit_params (Any) – Parameters passed to fit on the estimator.
- Returns
Return self.
- Return type
self
- get_params(deep=True)
Get parameters for this estimator.
- property inverse_transform: Callable[[...], Union[List[List[float]], ndarray, DataFrame, spmatrix]]
Call inverse_transform on the best estimator.
This is available only if the underlying estimator supports inverse_transform and refit is set to True.
- property predict: Callable[[...], Union[List[float], ndarray, Series, List[List[float]], DataFrame, spmatrix]]
Call predict on the best estimator.
This is available only if the underlying estimator supports predict and refit is set to True.
- property predict_log_proba: Callable[[...], Union[List[List[float]], ndarray, DataFrame, spmatrix]]
Call predict_log_proba on the best estimator.
This is available only if the underlying estimator supports predict_log_proba and refit is set to True.
- property predict_proba: Callable[[...], Union[List[List[float]], ndarray, DataFrame, spmatrix]]
Call predict_proba on the best estimator.
This is available only if the underlying estimator supports predict_proba and refit is set to True.
- property score_samples: Callable[[...], Union[List[float], ndarray, Series]]
Call score_samples on the best estimator.
This is available only if the underlying estimator supports score_samples and refit is set to True.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- property transform: Callable[[...], Union[List[List[float]], ndarray, DataFrame, spmatrix]]
Call transform on the best estimator.
This is available only if the underlying estimator supports transform and refit is set to True.
- property trials_: List[FrozenTrial]
All trials in the Study.