MLflow Model Serving on Azure Databricks

Important

This feature is in Public Preview.

MLflow Model Serving allows you to host machine learning models from the Model Registry as REST endpoints that are updated automatically based on the availability of model versions and their stages.

When you enable model serving for a given registered model, Azure Databricks automatically creates a unique single-node cluster for the model and deploys all non-archived versions of the model on that cluster. Azure Databricks restarts the cluster if any error occurs, and terminates the cluster when you disable model serving for the model. Model serving automatically syncs with the Model Registry and deploys any new registered model versions. Deployed model versions can be queried with standard REST API requests. Azure Databricks authenticates requests to the model using its standard authentication.

While this service is in preview, Databricks recommends its use for low-throughput and non-critical applications. Target throughput is 20 qps and target availability is 99.5%, although neither is guaranteed. Additionally, there is a payload size limit of 16 MB per request.
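Because requests over the 16 MB limit are rejected, a client-side size check before sending can save a round trip. A minimal sketch (the helper name is illustrative; only the 16 MB figure comes from the text above):

```python
import json

MAX_PAYLOAD_BYTES = 16 * 1024 * 1024  # 16 MB request limit noted above

def payload_within_limit(records):
    """Serialize the request body the same way the client would and
    check it against the 16 MB cap before sending."""
    body = json.dumps(records).encode("utf-8")
    return len(body) <= MAX_PAYLOAD_BYTES

rows = [{"sepal_length": 5.1, "sepal_width": 3.5,
         "petal_length": 1.4, "petal_width": 0.2}]
print(payload_within_limit(rows))  # True for a small batch
```

For larger datasets, a batch that fails this check can be split into smaller chunks and scored in multiple requests.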

Each model version is deployed using MLflow model deployment and runs in a Conda environment specified by its dependencies.

Note

The cluster is maintained as long as serving is enabled, even if no active model version exists. To terminate the serving cluster, disable model serving for the registered model.

Requirements

MLflow Model Serving is available for Python MLflow models. All model dependencies must be declared in the conda environment.

Enable and disable model serving

You enable a model for serving from its registered model page.

  1. Click the Serving tab. If the model is not already enabled for serving, the Enable Serving button appears.
  2. Click Enable Serving. The Serving tab appears with the Status shown as Pending. After a few minutes, the Status changes to Ready.

To disable a model for serving, click Stop.

Model serving from Model Registry

You enable serving of a registered model in the Model Registry UI.

Enable serving

Model version URIs

Each deployed model version is assigned one or more unique URIs. At minimum, each model version is assigned a URI constructed as follows:

<databricks-instance>/model/<registered-model-name>/<model-version>/invocations

For example, to call version 1 of a model registered as iris-classifier, use this URI:

https://<databricks-instance>/model/iris-classifier/1/invocations

You can also call a model version by its stage. For example, if version 1 is in the Production stage, it can also be scored using this URI:

https://<databricks-instance>/model/iris-classifier/Production/invocations

The list of available model URIs appears at the top of the Model Versions tab on the serving page.
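The URI pattern above is simple enough to capture in a small helper when scripting against many models. A sketch (the helper name is illustrative; <databricks-instance> is a placeholder exactly as in the examples above):

```python
def model_invocation_uri(instance, model_name, version_or_stage):
    """Build the invocation URI for a model version (e.g. 1) or stage
    (e.g. "Production"), following the pattern shown above."""
    return f"https://{instance}/model/{model_name}/{version_or_stage}/invocations"

# Version 1 of iris-classifier, and whichever version is in Production:
v1_uri = model_invocation_uri("<databricks-instance>", "iris-classifier", 1)
prod_uri = model_invocation_uri("<databricks-instance>", "iris-classifier", "Production")
print(v1_uri)
print(prod_uri)
```

Scoring by stage keeps clients stable across version promotions, since the URI does not change when a new version enters Production.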

Manage served versions

All active (non-archived) model versions are deployed, and you can query them using the URIs. Azure Databricks automatically deploys new model versions when they are registered, and automatically removes old versions when they are archived.

Note

All deployed versions of a registered model share the same cluster.

Manage model access rights

Model access rights are inherited from the Model Registry. Enabling or disabling the serving feature requires 'manage' permission on the registered model. Anyone with read rights can score any of the deployed versions.

Score deployed model versions

To score a deployed model, you can use the UI or send a REST API request to the model URI.

Score via UI

This is the easiest and fastest way to test the model. You can insert the model input data in JSON format and click Send Request. If the model has been logged with an input example, click Load Example to load the input example.

Score via REST API request

You can send a scoring request through the REST API using standard Databricks authentication. The examples below demonstrate authentication using a personal access token.

Given a MODEL_VERSION_URI like https://<databricks-instance>/model/iris-classifier/Production/invocations (where <databricks-instance> is the name of your Databricks instance) and a Databricks REST API token called DATABRICKS_API_TOKEN, here are some example snippets of how to query a served model:

Bash

curl -u token:$DATABRICKS_API_TOKEN $MODEL_VERSION_URI \
  -H 'Content-Type: application/json; format=pandas-records' \
  -d '[
    {
      "sepal_length": 5.1,
      "sepal_width": 3.5,
      "petal_length": 1.4,
      "petal_width": 0.2
    }
  ]'

Python

import requests

def score_model(model_uri, databricks_token, data):
  headers = {
    "Authorization": f"Bearer {databricks_token}",
    "Content-Type": "application/json; format=pandas-records",
  }
  # Accept either a list of records or a pandas DataFrame.
  data_json = data if isinstance(data, list) else data.to_dict(orient="records")
  response = requests.post(model_uri, headers=headers, json=data_json)
  if response.status_code != 200:
    raise Exception(f"Request failed with status {response.status_code}, {response.text}")
  return response.json()

data = [{
  "sepal_length": 5.1,
  "sepal_width": 3.5,
  "petal_length": 1.4,
  "petal_width": 0.2
}]
score_model(MODEL_VERSION_URI, DATABRICKS_API_TOKEN, data)

# can also score DataFrames
import pandas as pd
score_model(MODEL_VERSION_URI, DATABRICKS_API_TOKEN, pd.DataFrame(data))

Power BI

You can score a dataset in Power BI Desktop using the following steps:

  1. Open the dataset you want to score.

  2. Go to Transform Data.

  3. Right-click in the left panel and select Create New Query.

  4. Go to View > Advanced Editor.

  5. Replace the query body with the code snippet below, after filling in an appropriate DATABRICKS_API_TOKEN and MODEL_VERSION_URI.

    (dataset as table ) as table =>
    let
      call_predict = (dataset as table ) as list =>
      let
        apiToken = DATABRICKS_API_TOKEN,
        modelUri = MODEL_VERSION_URI,
        responseList = Json.Document(Web.Contents(modelUri,
          [
            Headers = [
              #"Content-Type" = "application/json; format=pandas-records",
              #"Authorization" = Text.Format("Bearer #{0}", {apiToken})
            ],
            Content = Json.FromValue(dataset)
          ]
        ))
      in
        responseList,
      predictionList = List.Combine(List.Transform(Table.Split(dataset, 256), (x) => call_predict(x))),
      predictionsTable = Table.FromList(predictionList, (x) => {x}, {"Prediction"}),
      datasetWithPrediction = Table.Join(
        Table.AddIndexColumn(predictionsTable, "index"), "index",
        Table.AddIndexColumn(dataset, "index"), "index")
    in
      datasetWithPrediction
    
  6. Name the query with your desired model name.

  7. Open the advanced query editor for your dataset and apply the model function.

For more information about input data formats accepted by the server (for example, the pandas split-oriented format), see the MLflow documentation.
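For a sense of how the two formats differ: the records format used in the snippets above serializes one dict per row, while the split-oriented format lists columns and data separately. A quick comparison (assuming pandas is installed; the exact request envelope expected by the server is defined in the MLflow docs):

```python
import pandas as pd

df = pd.DataFrame([{"sepal_length": 5.1, "sepal_width": 3.5,
                    "petal_length": 1.4, "petal_width": 0.2}])

# format=pandas-records: a list with one dict per row
records = df.to_dict(orient="records")

# pandas split-oriented: column names and row data listed separately
split = {"columns": df.columns.tolist(), "data": df.values.tolist()}

print(records)
print(split)
```

The split format avoids repeating column names on every row, which keeps large batches smaller on the wire.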

Monitor served models

The serving page displays status indicators for the serving cluster as well as individual model versions. In addition, you can use the following to obtain further information:

  • To inspect the state of the serving cluster, use the Model Events tab, which displays a list of all serving events for this model.
  • To inspect the state of a single model version, use the Logs or Version Events tabs on the Model Versions tab.

Version status

Model events