Deploy a model using a custom Docker base image

APPLIES TO: Basic edition, Enterprise edition

Learn how to use a custom Docker base image when deploying trained models with Azure Machine Learning.

When you deploy a trained model to a web service or IoT Edge device, a package is created which contains a web server to handle incoming requests.

Azure Machine Learning provides a default Docker base image so you don't have to worry about creating one. You can also use Azure Machine Learning environments to select a specific base image, or use a custom one that you provide.

A base image is used as the starting point when an image is created for a deployment. It provides the underlying operating system and components. The deployment process then adds additional components, such as your model, conda environment, and other assets, to the image before deploying it.

Typically, you create a custom base image when you want to use Docker to manage your dependencies, maintain tighter control over component versions, or save time during deployment. For example, you might want to standardize on a specific version of Python, Conda, or other components. You might also want to install software required by your model, where the installation process takes a long time. Installing the software when creating the base image means that you don't have to install it for each deployment.

Important

When you deploy a model, you cannot override core components such as the web server or IoT Edge components. These components provide a known working environment that is tested and supported by Microsoft.

Warning

Microsoft may not be able to help troubleshoot problems caused by a custom image. If you encounter problems, you may be asked to use the default image or one of the images Microsoft provides to see if the problem is specific to your image.

This document is broken into two sections:

  • Create a custom base image: Provides information to admins and DevOps on creating a custom image and configuring authentication to an Azure Container Registry using the Azure CLI and Machine Learning CLI.
  • Deploy a model using a custom base image: Provides information to Data Scientists and DevOps / ML Engineers on using custom images when deploying a trained model from the Python SDK or ML CLI.

Prerequisites

Create a custom base image

The information in this section assumes that you are using an Azure Container Registry to store Docker images. Use the following checklist when planning to create custom images for Azure Machine Learning:

  • Will you use the Azure Container Registry created for the Azure Machine Learning workspace, or a standalone Azure Container Registry?

    When using images stored in the container registry for the workspace, you do not need to authenticate to the registry. Authentication is handled by the workspace.

    Warning

    The Azure Container Registry for your workspace is created the first time you train or deploy a model using the workspace. If you've created a new workspace, but not trained or created a model, no Azure Container Registry will exist for the workspace.

    For information on retrieving the name of the Azure Container Registry for your workspace, see the Get container registry information section of this article.

    When using images stored in a standalone container registry, you will need to configure a service principal that has at least read access. You then provide the service principal ID (username) and password to anyone that uses images from the registry. The exception is if you make the container registry publicly accessible.

    For information on creating a private Azure Container Registry, see Create a private container registry.

    For information on using service principals with Azure Container Registry, see Azure Container Registry authentication with service principals.

  • Azure Container Registry and image information: Provide the image name to anyone that needs to use it. For example, an image named myimage, stored in a registry named myregistry, is referenced as myregistry.azurecr.io/myimage when using the image for model deployment.

  • Image requirements: Azure Machine Learning only supports Docker images that provide the following software:

    • Ubuntu 16.04 or greater.
    • Conda 4.5.# or greater.
    • Python 3.5.# or 3.6.#.
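
The image requirements in the checklist can be expressed as a small pre-flight check. The following is a minimal sketch, not part of Azure Machine Learning: the function names parse_version and check_base_image are hypothetical, and the version strings are assumed to be ones you have read from the image yourself.

```python
def parse_version(text):
    """Turn a dotted version string such as '4.5.12' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def check_base_image(ubuntu, conda, python):
    """Return a list of unmet requirements; an empty list means all are met."""
    problems = []
    # Ubuntu 16.04 or greater
    if parse_version(ubuntu)[:2] < (16, 4):
        problems.append("Ubuntu must be 16.04 or greater")
    # Conda 4.5.# or greater
    if parse_version(conda)[:2] < (4, 5):
        problems.append("Conda must be 4.5.# or greater")
    # Python 3.5.# or 3.6.#
    if parse_version(python)[:2] not in {(3, 5), (3, 6)}:
        problems.append("Python must be 3.5.# or 3.6.#")
    return problems


print(check_base_image("16.04", "4.5.12", "3.6.2"))
print(check_base_image("14.04", "4.5.12", "3.7.0"))
```

The first call reports no problems; the second reports the Ubuntu and Python versions as unsupported.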

Get container registry information

In this section, learn how to get the name of the Azure Container Registry for your Azure Machine Learning workspace.

Warning

The Azure Container Registry for your workspace is created the first time you train or deploy a model using the workspace. If you've created a new workspace, but not trained or created a model, no Azure Container Registry will exist for the workspace.

If you've already trained or deployed models using Azure Machine Learning, a container registry was created for your workspace. To find the name of this container registry, use the following steps:

  1. Open a new shell or command prompt and use the following command to authenticate to your Azure subscription:

    az login
    

    Follow the prompts to authenticate to the subscription.

  2. Use the following command to list the container registry for the workspace. Replace <myworkspace> with your Azure Machine Learning workspace name. Replace <resourcegroup> with the Azure resource group that contains your workspace:

    az ml workspace show -w <myworkspace> -g <resourcegroup> --query containerRegistry
    

    Tip

    If you get an error message stating that the ml extension isn't installed, use the following command to install it:

    az extension add -n azure-cli-ml
    

    The information returned is similar to the following text:

    /subscriptions/<subscription_id>/resourceGroups/<resource_group>/providers/Microsoft.ContainerRegistry/registries/<registry_name>
    

    The <registry_name> value is the name of the Azure Container Registry for your workspace.
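
If you script these steps, the registry name can be pulled out of the returned resource ID. The following is a minimal sketch that assumes the ID has the shape shown above; the function name registry_name_from_resource_id and the sample ID are hypothetical.

```python
def registry_name_from_resource_id(resource_id):
    """Extract <registry_name> from an Azure Container Registry resource ID."""
    # The CLI may print the value as a quoted JSON string, so strip quotes too.
    cleaned = resource_id.strip().strip('"').rstrip("/")
    parts = cleaned.split("/")
    lowered = [part.lower() for part in parts]
    if "registries" not in lowered:
        raise ValueError("not a container registry resource ID: " + resource_id)
    index = lowered.index("registries")
    if index + 1 >= len(parts) or not parts[index + 1]:
        raise ValueError("resource ID has no registry name: " + resource_id)
    return parts[index + 1]


sample_id = (
    "/subscriptions/0000/resourceGroups/mygroup/providers/"
    "Microsoft.ContainerRegistry/registries/myregistry"
)
name = registry_name_from_resource_id(sample_id)
# Images in this registry are then referenced under <name>.azurecr.io
print(name, name + ".azurecr.io")
```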

Build a custom base image

The steps in this section walk through creating a custom Docker image in your Azure Container Registry.

  1. Create a new text file named Dockerfile, and use the following text as the contents:

    FROM ubuntu:16.04
    
    ARG CONDA_VERSION=4.5.12
    ARG PYTHON_VERSION=3.6
    
    ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
    ENV PATH /opt/miniconda/bin:$PATH
    
    RUN apt-get update --fix-missing && \
        apt-get install -y wget bzip2 && \
        apt-get clean && \
        rm -rf /var/lib/apt/lists/*
    
    RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-${CONDA_VERSION}-Linux-x86_64.sh -O ~/miniconda.sh && \
        /bin/bash ~/miniconda.sh -b -p /opt/miniconda && \
        rm ~/miniconda.sh && \
        /opt/miniconda/bin/conda clean -tipsy
    
    RUN conda install -y conda=${CONDA_VERSION} python=${PYTHON_VERSION} && \
        conda clean -aqy && \
        rm -rf /opt/miniconda/pkgs && \
        find / -type d -name __pycache__ -prune -exec rm -rf {} \;
    
  2. From a shell or command prompt, use the following command to authenticate to the Azure Container Registry. Replace <registry_name> with the name of the container registry you want to store the image in:

    az acr login --name <registry_name>
    
  3. To upload the Dockerfile and build it, use the following command. Replace <registry_name> with the name of the container registry you want to store the image in:

    az acr build --image myimage:v1 --registry <registry_name> --file Dockerfile .
    

    Tip

    In this example, a tag of :v1 is applied to the image. If no tag is provided, a tag of :latest is applied.

    During the build process, information is streamed back to the command line. If the build is successful, you receive a message similar to the following text:

    Run ID: cda was successful after 2m56s
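
If the build is driven from a script rather than typed by hand, the same az acr build command can be assembled in Python. This is a minimal sketch: the helper name acr_build_command and its default values are assumptions taken from the example above, and the command is only printed here, not executed.

```python
import shlex


def acr_build_command(registry_name, image="myimage", tag="v1",
                      dockerfile="Dockerfile", context="."):
    """Build the argument list for the az acr build command shown above."""
    # Without an explicit tag, the registry applies :latest.
    image_reference = f"{image}:{tag}" if tag else image
    return [
        "az", "acr", "build",
        "--image", image_reference,
        "--registry", registry_name,
        "--file", dockerfile,
        context,
    ]


command = acr_build_command("myregistry")
print(shlex.join(command))
# To actually run the build: subprocess.run(command, check=True)
```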
    

For more information on building images with an Azure Container Registry, see Build and run a container image using Azure Container Registry Tasks.

For more information on uploading existing images to an Azure Container Registry, see Push your first image to a private Docker container registry.

Use a custom base image

To use a custom image, you need the following information:

  • The image name. For example, mcr.microsoft.com/azureml/o16n-sample-user-base/ubuntu-miniconda is the path to a basic Docker image provided by Microsoft.

    Important

    For custom images that you've created, be sure to include any tags that were used with the image. For example, if your image was created with a specific tag such as :v1, include that tag in the image name. If you did not use a specific tag when creating the image, a tag of :latest was applied.

  • If the image is in a private repository, you need the following information:

    • The registry address. For example, myregistry.azurecr.io.
    • The username and password of a service principal that has read access to the registry.

    If you do not have this information, speak to the administrator for the Azure Container Registry that contains your image.
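
The tagging rule above (an image without an explicit tag resolves to :latest) can be made explicit when assembling a full image reference. The following is a minimal sketch; the helper names split_image_name and full_image_reference are hypothetical and not part of any SDK.

```python
def split_image_name(image):
    """Split an image name into (repository, tag), defaulting the tag to 'latest'."""
    # Only look for a tag in the last path segment, so a registry port
    # such as localhost:5000/myimage is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" in last_segment:
        repository, tag = image.rsplit(":", 1)
        return repository, tag
    return image, "latest"


def full_image_reference(registry_address, image):
    """Combine a registry address and image name into one tagged reference."""
    repository, tag = split_image_name(image)
    return f"{registry_address}/{repository}:{tag}"


print(full_image_reference("myregistry.azurecr.io", "myimage"))
print(full_image_reference("myregistry.azurecr.io", "myimage:v1"))
```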

Publicly available base images

Microsoft provides several Docker images on a publicly accessible repository, which can be used with the steps in this section:

Image | Description
mcr.microsoft.com/azureml/o16n-sample-user-base/ubuntu-miniconda | Basic image for Azure Machine Learning
mcr.microsoft.com/azureml/onnxruntime:latest | Contains ONNX Runtime for CPU inferencing
mcr.microsoft.com/azureml/onnxruntime:latest-cuda | Contains the ONNX Runtime and CUDA for GPU
mcr.microsoft.com/azureml/onnxruntime:latest-tensorrt | Contains ONNX Runtime and TensorRT for GPU
mcr.microsoft.com/azureml/onnxruntime:latest-openvino-vadm | Contains ONNX Runtime and OpenVINO for Intel Vision Accelerator Design based on Movidius™ MyriadX VPUs
mcr.microsoft.com/azureml/onnxruntime:latest-openvino-myriad | Contains ONNX Runtime and OpenVINO for Intel Movidius™ USB sticks

For more information about the ONNX Runtime base images, see the ONNX Runtime dockerfile section in the GitHub repo.
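
When deployment scripts target different hardware, the ONNX Runtime images listed above can be kept in a lookup table. This is a minimal sketch: the image paths come from the list above, while the dictionary keys ("cpu", "cuda", and so on) and the function name onnxruntime_image are hypothetical labels chosen for illustration.

```python
ONNXRUNTIME_REPOSITORY = "mcr.microsoft.com/azureml/onnxruntime"

# Hypothetical short labels mapped to the published image tags.
ONNXRUNTIME_TAGS = {
    "cpu": "latest",
    "cuda": "latest-cuda",
    "tensorrt": "latest-tensorrt",
    "openvino-vadm": "latest-openvino-vadm",
    "openvino-myriad": "latest-openvino-myriad",
}


def onnxruntime_image(target):
    """Return the public ONNX Runtime base image for a target label."""
    try:
        tag = ONNXRUNTIME_TAGS[target]
    except KeyError:
        known = ", ".join(sorted(ONNXRUNTIME_TAGS))
        raise ValueError(f"unknown target {target!r}; expected one of: {known}")
    return f"{ONNXRUNTIME_REPOSITORY}:{tag}"


print(onnxruntime_image("cuda"))
```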

Tip

Since these images are publicly available, you do not need to provide an address, username or password when using them.

For more information, see Azure Machine Learning containers.

Tip

If your model is trained on Azure Machine Learning Compute, using version 1.0.22 or greater of the Azure Machine Learning SDK, an image is created during training. To discover the name of this image, use run.properties["AzureML.DerivedImageName"]. The following example demonstrates how to use this image:

# Use an image built during training with SDK 1.0.22 or greater
image_config.base_image = run.properties["AzureML.DerivedImageName"]
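
The property lookup above fails with a KeyError if the run did not produce a derived image. The following is a minimal sketch of a more defensive lookup. It assumes the property is simply absent in that case, and it uses plain dictionaries to stand in for run.properties; the function name derived_image_name is hypothetical.

```python
DEFAULT_BASE_IMAGE = "mcr.microsoft.com/azureml/o16n-sample-user-base/ubuntu-miniconda"


def derived_image_name(run_properties, fallback_image=DEFAULT_BASE_IMAGE):
    """Return the image built during training, or a fallback if none is recorded."""
    # run_properties is expected to behave like a dict, as run.properties does.
    return run_properties.get("AzureML.DerivedImageName") or fallback_image


# Plain dicts standing in for run.properties (hypothetical values).
trained_on_compute = {"AzureML.DerivedImageName": "azureml/azureml_abc123"}
trained_elsewhere = {}

print(derived_image_name(trained_on_compute))
print(derived_image_name(trained_elsewhere))
```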

Use an image with the Azure Machine Learning SDK

To use an image stored in the Azure Container Registry for your workspace, or a container registry that is publicly accessible, set the following Environment attributes:

  • docker.enabled=True
  • docker.base_image: Set to the registry and path to the image.

from azureml.core.environment import Environment
# Create the environment
myenv = Environment(name="myenv")
# Enable Docker and reference an image
myenv.docker.enabled = True
myenv.docker.base_image = "mcr.microsoft.com/azureml/o16n-sample-user-base/ubuntu-miniconda"

To use an image from a private container registry that is not in your workspace, you must use docker.base_image_registry to specify the address of the repository and a user name and password:

# Set the container registry information
myenv.docker.base_image_registry.address = "myregistry.azurecr.io"
myenv.docker.base_image_registry.username = "username"
myenv.docker.base_image_registry.password = "password"

myenv.inferencing_stack_version = "latest"  # This will install the inference specific apt packages.

# Define the packages needed by the model and scripts
from azureml.core.conda_dependencies import CondaDependencies
conda_dep = CondaDependencies()
# you must list azureml-defaults as a pip dependency
conda_dep.add_pip_package("azureml-defaults")
myenv.python.conda_dependencies=conda_dep

You must add azureml-defaults with version >= 1.0.45 as a pip dependency. This package contains the functionality needed to host the model as a web service. You must also set the inferencing_stack_version property on the environment to "latest", which installs the apt packages needed by the web service.
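
The private-registry example above places the service principal password directly in code. One alternative is to read the values from environment variables, as in this minimal sketch. The variable names ACR_ADDRESS, ACR_USERNAME, and ACR_PASSWORD and the function name registry_settings_from_env are assumptions made for illustration; they are not defined by Azure Machine Learning.

```python
import os

# Hypothetical environment variable names for each registry setting.
REGISTRY_VARIABLES = {
    "address": "ACR_ADDRESS",
    "username": "ACR_USERNAME",
    "password": "ACR_PASSWORD",
}


def registry_settings_from_env(environ=os.environ):
    """Read the registry address, username, and password from the environment."""
    missing = [name for name in REGISTRY_VARIABLES.values() if not environ.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {field: environ[name] for field, name in REGISTRY_VARIABLES.items()}


# A plain dict stands in for os.environ in this example.
settings = registry_settings_from_env({
    "ACR_ADDRESS": "myregistry.azurecr.io",
    "ACR_USERNAME": "username",
    "ACR_PASSWORD": "password",
})
print(settings["address"])
```

The returned values can then be assigned to myenv.docker.base_image_registry.address, .username, and .password in place of the literal strings.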

After defining the environment, use it with an InferenceConfig object to define the inference environment in which the model and web service will run.

from azureml.core.model import InferenceConfig
# Use environment in InferenceConfig
inference_config = InferenceConfig(entry_script="score.py",
                                   environment=myenv)

At this point, you can continue with deployment. For example, the following code snippet would deploy a web service locally using the inference configuration and custom image:

from azureml.core.model import Model
from azureml.core.webservice import LocalWebservice

# ws (your Workspace) and model (a registered Model) are assumed to be defined earlier
deployment_config = LocalWebservice.deploy_configuration(port=8890)
service = Model.deploy(ws, "myservice", [model], inference_config, deployment_config)
service.wait_for_deployment(show_output=True)
print(service.state)

For more information on deployment, see Deploy models with Azure Machine Learning.

For more information on customizing your Python environment, see Create and manage environments for training and deployment.

Use an image with the Machine Learning CLI

Important

Currently the Machine Learning CLI can use images from the Azure Container Registry for your workspace or publicly accessible repositories. It cannot use images from standalone private registries.

Before deploying a model using the Machine Learning CLI, create an environment that uses the custom image. Then create an inference configuration file that references the environment. You can also define the environment directly in the inference configuration file. The following JSON document demonstrates how to reference an image in a public container registry. In this example, the environment is defined inline:

{
    "entryScript": "score.py",
    "environment": {
        "docker": {
            "arguments": [],
            "baseDockerfile": null,
            "baseImage": "mcr.microsoft.com/azureml/o16n-sample-user-base/ubuntu-miniconda",
            "enabled": false,
            "sharedVolumes": true,
            "shmSize": null
        },
        "environmentVariables": {
            "EXAMPLE_ENV_VAR": "EXAMPLE_VALUE"
        },
        "name": "my-deploy-env",
        "python": {
            "baseCondaEnvironment": null,
            "condaDependencies": {
                "channels": [
                    "conda-forge"
                ],
                "dependencies": [
                    "python=3.6.2",
                    {
                        "pip": [
                            "azureml-defaults",
                            "azureml-telemetry",
                            "scikit-learn",
                            "inference-schema[numpy-support]"
                        ]
                    }
                ],
                "name": "project_environment"
            },
            "condaDependenciesFile": null,
            "interpreterPath": "python",
            "userManagedDependencies": false
        },
        "version": "1"
    }
}

This file is used with the az ml model deploy command. The --ic parameter is used to specify the inference configuration file.

az ml model deploy -n myservice -m mymodel:1 --ic inferenceconfig.json --dc deploymentconfig.json --ct akscomputetarget
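
Before running the deploy command, it can help to confirm that the inference configuration names an entry script and a base image, and lists azureml-defaults as a pip dependency. The following is a minimal sketch of such a check. It is not part of the CLI, the function name check_inference_config is hypothetical, and the sample document is a trimmed version of the JSON above.

```python
import json
import re


def check_inference_config(config):
    """Return a list of problems found in an inline inference configuration."""
    problems = []
    if not config.get("entryScript"):
        problems.append("entryScript is missing")
    environment = config.get("environment") or {}
    if not (environment.get("docker") or {}).get("baseImage"):
        problems.append("environment.docker.baseImage is missing")
    conda = (environment.get("python") or {}).get("condaDependencies") or {}
    pip_packages = []
    for entry in conda.get("dependencies", []):
        # pip packages are nested in a {"pip": [...]} entry
        if isinstance(entry, dict):
            pip_packages.extend(entry.get("pip", []))
    # Compare package names only, ignoring extras and version specifiers.
    names = {re.split(r"[\[=<>!~ ]", package, 1)[0] for package in pip_packages}
    if "azureml-defaults" not in names:
        problems.append("azureml-defaults is not listed as a pip dependency")
    return problems


sample = json.loads("""
{
    "entryScript": "score.py",
    "environment": {
        "docker": {"baseImage": "mcr.microsoft.com/azureml/o16n-sample-user-base/ubuntu-miniconda"},
        "python": {"condaDependencies": {"dependencies": [
            "python=3.6.2",
            {"pip": ["azureml-defaults", "scikit-learn"]}
        ]}}
    }
}
""")
print(check_inference_config(sample))
```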

For more information on deploying a model using the ML CLI, see the "model registration, profiling, and deployment" section of the CLI extension for Azure Machine Learning article.

Next steps