Easy Parallelization
It’s straightforward to parallelize optuna.study.Study.optimize()
.
If you want to manually execute Optuna optimization:
start an RDB server (this example uses MySQL)
create a study with --storage
argument
share the study among multiple nodes and processes
Of course, you can use Kubernetes as in the kubernetes examples.
To just see how parallel optimization works in Optuna, check the below video.
Create a Study
You can create a study using optuna create-study
command.
Alternatively, in Python script you can use optuna.create_study()
.
$ mysql -u root -e "CREATE DATABASE IF NOT EXISTS example"
$ optuna create-study --study-name "distributed-example" --storage "mysql://root@localhost/example"
[I 2020-07-21 13:43:39,642] A new study created with name: distributed-example
Then, write an optimization script. Let’s assume that foo.py
contains the following code.
import optuna
def objective(trial):
x = trial.suggest_float("x", -10, 10)
return (x - 2) ** 2
if __name__ == "__main__":
study = optuna.load_study(
study_name="distributed-example", storage="mysql://root@localhost/example"
)
study.optimize(objective, n_trials=100)
Share the Study among Multiple Nodes and Processes
Finally, run the shared study from multiple processes.
For example, run Process 1
in a terminal, and do Process 2
in another one.
They get parameter suggestions based on shared trials’ history.
Process 1:
$ python foo.py
[I 2020-07-21 13:45:02,973] Trial 0 finished with value: 45.35553104173011 and parameters: {'x': 8.73465151598285}. Best is trial 0 with value: 45.35553104173011.
[I 2020-07-21 13:45:04,013] Trial 2 finished with value: 4.6002397305938905 and parameters: {'x': 4.144816945707463}. Best is trial 1 with value: 0.028194513284051464.
...
Process 2 (the same command as process 1):
$ python foo.py
[I 2020-07-21 13:45:03,748] Trial 1 finished with value: 0.028194513284051464 and parameters: {'x': 1.8320877810162361}. Best is trial 1 with value: 0.028194513284051464.
[I 2020-07-21 13:45:05,783] Trial 3 finished with value: 24.45966755098074 and parameters: {'x': 6.945671597566982}. Best is trial 1 with value: 0.028194513284051464.
...
Note
n_trials
is the number of trials each process will run, not the total number of trials across all processes. For example, the script given above runs 100 trials for each process, 100 trials * 2 processes = 200 trials. optuna.study.MaxTrialsCallback
can ensure how many times trials will be performed across all processes.
Note
We do not recommend SQLite for distributed optimizations at scale because it may cause deadlocks and serious performance issues. Please consider to use another database engine like PostgreSQL or MySQL.
Total running time of the script: (0 minutes 0.000 seconds)
Gallery generated by Sphinx-Gallery