Note
Go to the end to download the full example code.
Ablation Study Becomes Easy with BruteForceSampler
When conducting machine learning experiments, you often need to systematically evaluate all
combinations of techniques, hyperparameters, or components—a process known as an
ablation study.
A common approach is to write shell scripts with nested loops or use SLURM array jobs,
but these quickly become unwieldy when the search space has conditional structure
(e.g., “optimizer Adam has beta1 and beta2, but SGD has momentum”).
Frameworks like Hydra support multirun sweeps with config overrides,
but handling conditional parameters requires additional boilerplate.
Optuna’s BruteForceSampler solves this naturally: it exhaustively
enumerates all parameter combinations, including conditional (hierarchical) search spaces defined
via the define-by-run API. Combined with JournalStorage
(or alternatively RDBStorage with SQLite), it works seamlessly on HPC
clusters with shared filesystems, providing crash recovery and distributed execution out of the
box. Note that we focus only on JournalStorage in this tutorial
for simplicity. Please, however, note that if your environment makes trials error-prone,
RDBStorage might be a better option as detailed in Retrying Failed and Stale Trials.
This tutorial walks through four scenarios:
import optuna
Basic Ablation Study
Suppose you want to compare three optimizers and two learning rate schedules.
Define an objective function using suggest_categorical and suggest_float
(with a finite step), and let BruteForceSampler try every
combination.
def objective(trial: optuna.Trial) -> float:
optimizer = trial.suggest_categorical("optimizer", ["Adam", "SGD", "RMSprop"])
# If you would like to work on the log scale, you could also do like:
# lr = trial.suggest_categorical("lr", [10**(x/2) for x in range(-12, -5)])
# or lr = 10**trial.suggest_float("lr_exponent", -6, -3, step=0.5)
# You can store additional information via `trial.set_user_attr`.
lr = trial.suggest_float("lr", 0.001, 0.01, step=0.001)
lr_schedule = trial.suggest_categorical("lr_schedule", ["constant", "cosine"])
# In a real experiment, you would train a model here and return the metric.
# For demonstration, we use a mock score.
mock_scores = {"Adam": 0.9, "SGD": 0.85, "RMSprop": 0.88}
score = mock_scores[optimizer] + lr * 10 + (0.01 if lr_schedule == "cosine" else 0.0)
return score
sampler = optuna.samplers.BruteForceSampler()
study = optuna.create_study(direction="maximize", sampler=sampler)
study.optimize(objective)
/home/docs/checkouts/readthedocs.org/user_builds/optuna/checkouts/latest/tutorial/20_recipes/014_ablation_study_by_optuna.py:66: ExperimentalWarning: BruteForceSampler is experimental (supported from v3.1.0). The interface can change in the future.
sampler = optuna.samplers.BruteForceSampler()
The study automatically stops once every combination has been evaluated. You can inspect results with a DataFrame:
print(f"Total trials: {len(study.trials)}")
for trial in study.trials[:5]:
print(f" Trial {trial.number}: {trial.params} -> {trial.value}")
Total trials: 60
Trial 0: {'optimizer': 'RMSprop', 'lr': 0.009, 'lr_schedule': 'constant'} -> 0.97
Trial 1: {'optimizer': 'RMSprop', 'lr': 0.005, 'lr_schedule': 'cosine'} -> 0.9400000000000001
Trial 2: {'optimizer': 'RMSprop', 'lr': 0.005, 'lr_schedule': 'constant'} -> 0.93
Trial 3: {'optimizer': 'RMSprop', 'lr': 0.009, 'lr_schedule': 'cosine'} -> 0.98
Trial 4: {'optimizer': 'SGD', 'lr': 0.008, 'lr_schedule': 'constant'} -> 0.9299999999999999
Note
BruteForceSampler requires all continuous parameters to have
a finite step. Using suggest_float without step will raise an error because
the search space would be infinite.
Conditional (Hierarchical) Search Space
A key advantage of BruteForceSampler over simple grid search
approaches is its support for conditional search spaces via Optuna’s define-by-run API.
For example, different optimizers may have different hyperparameters:
def objective_conditional(trial: optuna.Trial) -> float:
optimizer = trial.suggest_categorical("optimizer", ["Adam", "SGD"])
if optimizer == "Adam":
beta1 = trial.suggest_categorical("beta1", [0.9, 0.95])
beta2 = trial.suggest_categorical("beta2", [0.999, 0.9999])
config = f"Adam(beta1={beta1}, beta2={beta2})"
else:
momentum = trial.suggest_categorical("momentum", [0.0, 0.9, 0.99])
nesterov = trial.suggest_categorical("nesterov", [True, False])
config = f"SGD(momentum={momentum}, nesterov={nesterov})"
lr = trial.suggest_float("lr", 0.001, 0.01, step=0.001)
# Replace this with your actual training and evaluation code.
mock_score = hash(config) % 100 / 100 + lr
return mock_score
study_conditional = optuna.create_study(
direction="maximize", sampler=optuna.samplers.BruteForceSampler()
)
study_conditional.optimize(objective_conditional)
/home/docs/checkouts/readthedocs.org/user_builds/optuna/checkouts/latest/tutorial/20_recipes/014_ablation_study_by_optuna.py:116: ExperimentalWarning: BruteForceSampler is experimental (supported from v3.1.0). The interface can change in the future.
direction="maximize", sampler=optuna.samplers.BruteForceSampler()
The sampler explores all valid paths through the conditional search space: 4 combinations for Adam (2 beta1 x 2 beta2) and 6 for SGD (3 momentum x 2 nesterov), each combined with 10 learning rate values, totaling (4 + 6) x 10 = 100 trials.
print(f"Total trials: {len(study_conditional.trials)}")
Total trials: 100
With shell scripts or array jobs, you would need to manually enumerate these conditional branches and compute the correct array indices. The define-by-run API makes this trivial.
Distributed Ablation Study on HPC Clusters
On an HPC cluster with a shared filesystem (e.g., NFS or Lustre),
JournalStorage enables multiple worker processes to collaborate
on the same ablation study without setting up a database server.
The following snippet shows a self-contained optimization script. Save it as a Python file and launch it from multiple nodes or SLURM array jobs—each process will pick up unevaluated combinations automatically.
import optuna
def objective(trial: optuna.Trial) -> float:
optimizer = trial.suggest_categorical("optimizer", ["Adam", "SGD"])
lr = trial.suggest_float("lr", 0.001, 0.01, step=0.001)
# Train your model and return the metric.
...
# Use a file path on the shared filesystem.
storage = optuna.storages.JournalStorage(
optuna.storages.journal.JournalFileBackend("/shared/nfs/ablation_journal.log"),
)
study = optuna.create_study(
study_name="my-ablation",
storage=storage,
direction="maximize",
sampler=optuna.samplers.BruteForceSampler(),
load_if_exists=True, # All workers join the same study.
)
study.optimize(objective)
With a SLURM job array, the submission script would be:
#!/bin/bash
#SBATCH --job-name=ablation
#SBATCH --array=0-7
#SBATCH --ntasks=1
python run_ablation.py
Each array task runs the same script. JournalStorage coordinates
the work: each worker picks up unevaluated parameter combinations, and the study
automatically stops once every combination has been evaluated.
Since JournalStorage replays its log file on startup, it handles
crash recovery naturally. If a worker fails, simply relaunch it—it will skip already
completed trials and resume from where it left off.
Tip
Use load_if_exists=True in optuna.create_study() so that all workers join the
same study instead of raising an error when the study already exists.
Retrying Failed and Stale Trials
BruteForceSampler treats FAIL trials as visited and will
not re-sample them. This means that if a trial raises an exception or returns an
infeasible value (e.g., NaN), that parameter combination is permanently skipped.
Similarly, if a worker process is killed or hangs, its trial remains in the RUNNING
state and blocks that parameter combination from being picked up by other workers.
JournalStorage does not support heartbeats, so it cannot
detect stale trials automatically. If you need automatic retry of failed or stale trials
in a distributed setting, use RDBStorage with the heartbeat
mechanism and RetryFailedTrialCallback:
import optuna
from optuna.storages import RetryFailedTrialCallback
def objective(trial: optuna.Trial) -> float:
optimizer = trial.suggest_categorical("optimizer", ["Adam", "SGD"])
lr = trial.suggest_float("lr", 0.001, 0.01, step=0.001)
# Train your model and return the metric.
...
storage = optuna.storages.RDBStorage(
url="sqlite:////shared/nfs/ablation.db",
heartbeat_interval=60, # Record heartbeat every 60 seconds.
grace_period=120, # Mark trials as FAIL if no heartbeat for 120 seconds.
failed_trial_callback=RetryFailedTrialCallback(max_retry=3),
)
study = optuna.create_study(
study_name="my-ablation",
storage=storage,
direction="maximize",
sampler=optuna.samplers.BruteForceSampler(),
load_if_exists=True,
)
study.optimize(objective)
With this setup, when a worker’s heartbeat stops (e.g., the process is killed or hangs),
RDBStorage automatically marks the stale trial as FAIL
after grace_period seconds. RetryFailedTrialCallback then
re-enqueues it so that another worker can retry the same parameter combination.
Note
If you are using JournalStorage and a trial hangs, you need
to manually re-enqueue the stuck trials after killing the hung process:
from datetime import datetime
# If a trial is running longer than a day, we cut it.
# Please adopt the duration depending on your application.
grace_period = 3600*24
for trial in study.trials:
if trial.state == optuna.trial.TrialState.RUNNING:
if (datetime.now() - trial.datetime_start).total_seconds() > grace_period:
study.tell(trial, state=optuna.trial.TrialState.FAIL)
study.enqueue_trial(trial.params)
See also
BruteForceSamplerfor API details.(File-based) Journal Storage for more on
JournalStorage.RetryFailedTrialCallbackfor automatic retry of failed trials.
Total running time of the script: (0 minutes 0.286 seconds)