Optuna is an open-source Python library for hyperparameter tuning that can be scaled horizontally across multiple compute resources. Optuna also integrates with MLflow for model and trial tracking and monitoring.
Use the following commands to install Optuna and its integration module.
%pip install optuna
%pip install optuna-integration # Integration with MLflow
Here are the steps in an Optuna workflow:
- Define an objective function to optimize. Within the objective function, define the hyperparameter search space.
- Create an Optuna Study object, and run the tuning algorithm by calling the optimize function of the Study object.
Below is a minimal example from the Optuna documentation.
- Define the objective function objective, and call the suggest_float function to define the search space for the parameter x.
- Create a Study, and optimize the objective function with 100 trials, that is, 100 calls of the objective function with different values of x.
- Get the best parameters of the Study.
import optuna

def objective(trial):
    # Search space: x is sampled from the interval [-10, 10].
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2

study = optuna.create_study()
study.optimize(objective, n_trials=100)
best_params = study.best_params
You can distribute Optuna trials across multiple machines in an Azure Databricks cluster with the Joblib Apache Spark backend.
import joblib
from joblibspark import register_spark
register_spark() # register Spark backend for Joblib
with joblib.parallel_backend("spark", n_jobs=-1):
    study.optimize(objective, n_trials=100)
To track hyperparameters and metrics of all the Optuna trials, use the MLflowCallback of the Optuna Integration modules when you call the optimize function.
import mlflow
from optuna.integration.mlflow import MLflowCallback

mlflow_callback = MLflowCallback(
    tracking_uri="databricks",
    metric_name="accuracy",
    create_experiment=False,
    mlflow_kwargs={
        # experiment_id must already hold the ID of an existing MLflow experiment.
        "experiment_id": experiment_id
    }
)

study.optimize(objective, n_trials=100, callbacks=[mlflow_callback])
This notebook provides an example of using Optuna to select a scikit-learn model and a set of hyperparameters for the Iris dataset.
On top of a single-machine Optuna workflow, the notebook showcases how to:
- Parallelize Optuna trials across multiple machines via Joblib
- Track trial runs with MLflow