Access the MLflow tracking server from outside Azure Databricks

You may wish to log to the MLflow tracking server from your own applications or from the MLflow CLI.

This article describes the required configuration steps. Start by installing MLflow and configuring your credentials (Step 0). You can then configure an application (Step 1a), the MLflow CLI (Step 1b), or both.

For information on how to launch and log to an open source tracking server, see the open source documentation.

Step 0: Configure your environment

If you don’t have an Azure Databricks workspace, you can try Databricks for free.

To configure your environment to access your Azure Databricks hosted MLflow tracking server:

  1. Install MLflow using pip install mlflow.
  2. Configure authentication. Do one of the following:
    • Generate a REST API token and create a credentials file using databricks configure --token.

    • Specify credentials via environment variables:

      # Configure MLflow to communicate with a Databricks-hosted tracking server
      export MLFLOW_TRACKING_URI=databricks
      # Specify the workspace hostname and token
      export DATABRICKS_HOST="..."
      export DATABRICKS_TOKEN="..."
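
If you use the environment-variable option, it can help to confirm that all three variables are visible to your process before you start logging. The following is a minimal sketch, not part of MLflow or the Databricks CLI; the names REQUIRED_VARS and missing_credentials are hypothetical helpers introduced here for illustration.

```python
import os
from typing import List, Mapping

# The three variables exported in the snippet above.
REQUIRED_VARS = ("MLFLOW_TRACKING_URI", "DATABRICKS_HOST", "DATABRICKS_TOKEN")


def missing_credentials(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All Databricks credentials variables are set.")
```

This check does not apply if you authenticate with a credentials file instead, because in that case DATABRICKS_HOST and DATABRICKS_TOKEN do not need to be set.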
      

Step 1a: Configure MLflow applications

Configure MLflow applications to log to Azure Databricks by setting the tracking URI to databricks, or to databricks://<profileName> if you specified a profile name via --profile while creating your credentials file. For example, you can achieve this by setting the MLFLOW_TRACKING_URI environment variable to “databricks”.
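
As an alternative to the environment variable, a Python application can set the tracking URI in code with mlflow.set_tracking_uri. The sketch below assumes MLflow is installed and credentials are configured as in Step 0; the helper names databricks_tracking_uri and log_example_run, the parameter alpha, and the experiment path are illustrative assumptions, not part of the original instructions.

```python
from typing import Optional


def databricks_tracking_uri(profile: Optional[str] = None) -> str:
    """Return "databricks", or "databricks://<profile>" when a profile is named."""
    return f"databricks://{profile}" if profile else "databricks"


def log_example_run(profile: Optional[str] = None) -> None:
    """Point MLflow at the Databricks-hosted server and log one parameter."""
    # Imported here so the URI helper above can be used without MLflow installed.
    import mlflow

    mlflow.set_tracking_uri(databricks_tracking_uri(profile))
    # Hypothetical experiment path; replace <your-username> with your Databricks username.
    mlflow.set_experiment("/Users/<your-username>/my-experiment")
    with mlflow.start_run():
        mlflow.log_param("alpha", 0.5)
```

Calling log_example_run() uses the default credentials, while log_example_run("my-profile") would use a profile named my-profile, assuming one was created with --profile.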

Step 1b: Configure the MLflow CLI

Configure the MLflow CLI to communicate with an Azure Databricks tracking server by setting the MLFLOW_TRACKING_URI environment variable. For example, to create an experiment using the CLI with the tracking URI databricks, run:

# Replace <your-username> with your Databricks username
export MLFLOW_TRACKING_URI=databricks
mlflow experiments create -n /Users/<your-username>/my-experiment