optuna.integration.OptunaSearchCV
- class optuna.integration.OptunaSearchCV(estimator, param_distributions, cv=5, enable_pruning=False, error_score=nan, max_iter=1000, n_jobs=1, n_trials=10, random_state=None, refit=True, return_train_score=False, scoring=None, study=None, subsample=1.0, timeout=None, verbose=0)[source]
Hyperparameter search with cross-validation.
- Parameters
estimator (BaseEstimator) – Object to use to fit the data. This is assumed to implement the scikit-learn estimator interface. Either this needs to provide score, or scoring must be passed.
param_distributions (Mapping[str, optuna.distributions.BaseDistribution]) – Dictionary where keys are parameters and values are distributions. Distributions are assumed to implement the optuna distribution interface.
cv (Optional[Union[BaseCrossValidator, int]]) – Cross-validation strategy. Possible inputs for cv are:
an integer to specify the number of folds in a CV splitter,
a CV splitter,
an iterable yielding (train, validation) splits as arrays of indices.
For an integer, if estimator is a classifier and y is either binary or multiclass, sklearn.model_selection.StratifiedKFold is used. Otherwise, sklearn.model_selection.KFold is used.
enable_pruning (bool) – If True, pruning is performed in the case where the underlying estimator supports partial_fit.
error_score (Union[numbers.Number, float, str]) – Value to assign to the score if an error occurs in fitting. If ‘raise’, the error is raised. If numeric, sklearn.exceptions.FitFailedWarning is raised. This does not affect the refit step, which will always raise the error.
max_iter (int) – Maximum number of epochs. This is only used if the underlying estimator supports partial_fit.
n_jobs (int) – Number of threading-based parallel jobs. -1 means the number is set to the CPU count.
Note: n_jobs allows parallelization using threading and may suffer from Python’s GIL. It is recommended to use process-based parallelization if func is CPU bound.
Warning: Deprecated in v2.7.0. This feature will be removed in the future. It is recommended to use process-based parallelization. The removal of this feature is currently scheduled for v4.0.0, but this schedule is subject to change. See https://github.com/optuna/optuna/releases/tag/v2.7.0.
n_trials (int) – Number of trials. If None, there is no limitation on the number of trials. If timeout is also set to None, the study continues to create trials until it receives a termination signal such as Ctrl+C or SIGTERM. This trades off runtime vs quality of the solution.
random_state (Optional[Union[int, numpy.random.mtrand.RandomState]]) – Seed of the pseudo random number generator. If int, this is the seed used by the random number generator. If a numpy.random.RandomState object, this is the random number generator. If None, the global random state from numpy.random is used.
refit (bool) – If True, refit the estimator with the best found hyperparameters. The refitted estimator is made available at the best_estimator_ attribute and permits using predict directly.
return_train_score (bool) – If True, training scores will be included. Computing training scores is used to get insights on how different hyperparameter settings impact the overfitting/underfitting trade-off. However, computing training scores can be computationally expensive and is not strictly required to select the hyperparameters that yield the best generalization performance.
scoring (Optional[Union[Callable[[...], float], str]]) – String or callable to evaluate the predictions on the validation data. If None, score on the estimator is used.
study (Optional[optuna.study.Study]) – Study corresponds to the optimization task. If None, a new study is created.
subsample (Union[float, int]) – Proportion of samples that are used during hyperparameter search.
If int, then draw subsample samples.
If float, then draw subsample * X.shape[0] samples.
timeout (Optional[float]) – Time limit in seconds for the search of appropriate models. If None, the study is executed without time limitation. If n_trials is also set to None, the study continues to create trials until it receives a termination signal such as Ctrl+C or SIGTERM. This trades off runtime vs quality of the solution.
verbose (int) – Verbosity level. The higher, the more messages.
- Return type
None
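The two subsample modes described above can be illustrated with a small standalone sketch. The helper name resolve_subsample_indices is hypothetical and not part of Optuna. Truncating the float case with int() and drawing without replacement are assumptions about how the proportion is turned into a set of row indices.

```python
import random
from typing import List, Union


def resolve_subsample_indices(
    subsample: Union[int, float], n_samples: int, seed: int = 0
) -> List[int]:
    """Hypothetical helper mimicking the documented subsample semantics."""
    if isinstance(subsample, float):
        # Float: interpreted as a proportion of the rows (truncation is an assumption).
        max_samples = int(subsample * n_samples)
    else:
        # Int: interpreted as an absolute number of rows.
        max_samples = subsample
    if max_samples >= n_samples:
        # Nothing to subsample: keep every row.
        return list(range(n_samples))
    rng = random.Random(seed)
    # Draw distinct row indices and return them in ascending order.
    return sorted(rng.sample(range(n_samples), max_samples))


half = resolve_subsample_indices(0.5, 150)
fixed = resolve_subsample_indices(100, 150)
everything = resolve_subsample_indices(1.0, 150)
print(len(half), len(fixed), len(everything))  # prints: 75 100 150
```

With the default subsample=1.0 every row is kept, so a smaller value is mainly useful for speeding up the search on large datasets.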
- best_estimator_
Estimator that was chosen by the search. This is present only if refit is set to True.
- n_splits_
Number of cross-validation splits.
- sample_indices_
Indices of samples that are used during hyperparameter search.
- scorer_
Scorer function.
- study_
Actual study.
Example
import optuna
from sklearn.datasets import load_iris
from sklearn.svm import SVC

# Search the regularization strength C of an SVC on a log scale.
clf = SVC(gamma="auto")
param_distributions = {"C": optuna.distributions.LogUniformDistribution(1e-10, 1e10)}
optuna_search = optuna.integration.OptunaSearchCV(clf, param_distributions)

# Run the search, then predict with the refitted best estimator.
X, y = load_iris(return_X_y=True)
optuna_search.fit(X, y)
y_pred = optuna_search.predict(X)
Note
Added in v0.17.0 as an experimental feature. The interface may change in newer versions without prior notice. See https://github.com/optuna/optuna/releases/tag/v0.17.0.
Methods
fit(X[, y, groups]) – Run fit with all sets of parameters.
get_params([deep]) – Get parameters for this estimator.
score(X[, y]) – Return the score on the given data.
set_params(**params) – Set the parameters of this estimator.
Attributes
best_index_ – Index which corresponds to the best candidate parameter setting.
best_params_ – Parameters of the best trial in the Study.
best_score_ – Mean cross-validated score of the best estimator.
best_trial_ – Best trial in the Study.
classes_ – Class labels.
decision_function – Call decision_function on the best estimator.
inverse_transform – Call inverse_transform on the best estimator.
n_trials_ – Actual number of trials.
predict – Call predict on the best estimator.
predict_log_proba – Call predict_log_proba on the best estimator.
predict_proba – Call predict_proba on the best estimator.
score_samples – Call score_samples on the best estimator.
set_user_attr – Call set_user_attr on the Study.
transform – Call transform on the best estimator.
trials_ – All trials in the Study.
trials_dataframe – Call trials_dataframe on the Study.
user_attrs_ – User attributes in the Study.
- property best_trial_: optuna.trial._frozen.FrozenTrial
Best trial in the Study.
- property decision_function: Callable[[...], Union[List[float], numpy.ndarray, pandas.core.series.Series, List[List[float]], pandas.core.frame.DataFrame, scipy.sparse.base.spmatrix]]
Call decision_function on the best estimator.
This is available only if the underlying estimator supports decision_function and refit is set to True.
- fit(X, y=None, groups=None, **fit_params)[source]
Run fit with all sets of parameters.
- Parameters
X (Union[List[List[float]], numpy.ndarray, pandas.core.frame.DataFrame, scipy.sparse.base.spmatrix]) – Training data.
y (Optional[Union[List[float], numpy.ndarray, pandas.core.series.Series, List[List[float]], pandas.core.frame.DataFrame, scipy.sparse.base.spmatrix]]) – Target variable.
groups (Optional[Union[List[float], numpy.ndarray, pandas.core.series.Series]]) – Group labels for the samples used while splitting the dataset into train/validation set.
**fit_params (Any) – Parameters passed to fit on the estimator.
- Returns
Return self.
- Return type
self
- get_params(deep=True)
Get parameters for this estimator.
- property inverse_transform: Callable[[...], Union[List[List[float]], numpy.ndarray, pandas.core.frame.DataFrame, scipy.sparse.base.spmatrix]]
Call inverse_transform on the best estimator.
This is available only if the underlying estimator supports inverse_transform and refit is set to True.
- property predict: Callable[[...], Union[List[float], numpy.ndarray, pandas.core.series.Series, List[List[float]], pandas.core.frame.DataFrame, scipy.sparse.base.spmatrix]]
Call predict on the best estimator.
This is available only if the underlying estimator supports predict and refit is set to True.
- property predict_log_proba: Callable[[...], Union[List[List[float]], numpy.ndarray, pandas.core.frame.DataFrame, scipy.sparse.base.spmatrix]]
Call predict_log_proba on the best estimator.
This is available only if the underlying estimator supports predict_log_proba and refit is set to True.
- property predict_proba: Callable[[...], Union[List[List[float]], numpy.ndarray, pandas.core.frame.DataFrame, scipy.sparse.base.spmatrix]]
Call predict_proba on the best estimator.
This is available only if the underlying estimator supports predict_proba and refit is set to True.
- property score_samples: Callable[[...], Union[List[float], numpy.ndarray, pandas.core.series.Series]]
Call score_samples on the best estimator.
This is available only if the underlying estimator supports score_samples and refit is set to True.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
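The <component>__<parameter> naming can be seen on a plain scikit-learn Pipeline. This is a minimal sketch that does not involve OptunaSearchCV itself; the step names "scaler" and "svc" are arbitrary labels chosen for illustration.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# A nested object: each step is addressed by the name given here.
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC(gamma="auto"))])

# <component>__<parameter> reaches into the named step and updates it.
returned = pipe.set_params(svc__C=10.0, scaler__with_mean=False)

print(returned is pipe)                      # set_params returns the estimator
print(pipe.named_steps["svc"].C)             # value set on the SVC step
print(pipe.get_params()["scaler__with_mean"])
```

The same keys returned by get_params(deep=True) are the ones accepted by set_params, so inspecting get_params is a quick way to discover valid names.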
- property transform: Callable[[...], Union[List[List[float]], numpy.ndarray, pandas.core.frame.DataFrame, scipy.sparse.base.spmatrix]]
Call transform on the best estimator.
This is available only if the underlying estimator supports transform and refit is set to True.
- property trials_: List[optuna.trial._frozen.FrozenTrial]
All trials in the Study.