Deploy a model using a custom Docker base image

Learn how to use a custom Docker base image when deploying trained models with Azure Machine Learning.

Azure Machine Learning uses a default Docker base image if none is specified. You can find the specific Docker image used with azureml.core.runconfig.DEFAULT_CPU_IMAGE. You can also use Azure Machine Learning environments to select a specific base image, or use a custom one that you provide.
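As a minimal sketch of looking up the default image, the snippet below reads the DEFAULT_CPU_IMAGE constant mentioned above. The helper name default_cpu_image is hypothetical, and the fallback for a missing SDK is an assumption added so that the snippet also runs where azureml-core is not installed.

```python
def default_cpu_image():
    """Return the default CPU base image name, or None if the SDK is missing."""
    try:
        # DEFAULT_CPU_IMAGE is the constant named in the paragraph above.
        from azureml.core.runconfig import DEFAULT_CPU_IMAGE
    except ImportError:
        # Assumption: treat a missing azureml-core package as "unknown".
        return None
    return DEFAULT_CPU_IMAGE


image = default_cpu_image()
print(image if image is not None else "azureml-core is not installed")
```

The value printed depends on the installed SDK version, so no particular image name should be relied on.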

A base image is used as the starting point when an image is created for a deployment. It provides the underlying operating system and components. The deployment process then adds additional components, such as your model, conda environment, and other assets, to the image.

Typically, you create a custom base image when you want to use Docker to manage your dependencies, maintain tighter control over component versions, or save time during deployment. You might also want to install software required by your model, where the installation process takes a long time. Installing the software when creating the base image means that you don't have to install it for each deployment.

Important

When you deploy a model, you cannot override core components such as the web server or IoT Edge components. These components provide a known working environment that is tested and supported by Microsoft.

Warning

Microsoft may not be able to help troubleshoot problems caused by a custom image. If you encounter problems, you may be asked to use the default image or one of the images Microsoft provides to see if the problem is specific to your image.

This document is broken into two sections:

  • Create a custom base image: Provides information to admins and DevOps on creating a custom image and configuring authentication to an Azure Container Registry using the Azure CLI and Machine Learning CLI.
  • Deploy a model using a custom base image: Provides information to data scientists and DevOps/ML engineers on using custom images when deploying a trained model from the Python SDK or ML CLI.

Prerequisites

Create a custom base image

The information in this section assumes that you are using an Azure Container Registry to store Docker images. Use the following checklist when planning to create custom images for Azure Machine Learning:

  • Will you use the Azure Container Registry created for the Azure Machine Learning workspace, or a standalone Azure Container Registry?

    When using images stored in the container registry for the workspace, you do not need to authenticate to the registry. Authentication is handled by the workspace.

    Warning

    The Azure Container Registry for your workspace is created the first time you train or deploy a model using the workspace. If you've created a new workspace, but not trained or created a model, no Azure Container Registry will exist for the workspace.

    When using images stored in a standalone container registry, you will need to configure a service principal that has at least read access. You then provide the service principal ID (username) and password to anyone that uses images from the registry. The exception is if you make the container registry publicly accessible.

    For information on creating a private Azure Container Registry, see Create a private container registry.

    For information on using service principals with Azure Container Registry, see Azure Container Registry authentication with service principals.

  • Azure Container Registry and image information: Provide the image name to anyone that needs to use it. For example, an image named myimage, stored in a registry named myregistry, is referenced as myregistry.azurecr.io/myimage when using the image for model deployment.
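The naming rule in the last checklist item can be sketched as a small helper. The function name image_reference is hypothetical; it only joins the registry name, the azurecr.io suffix, the image name, and an optional tag, as in the myregistry.azurecr.io/myimage example.

```python
def image_reference(registry_name, image_name, tag=None):
    """Build the reference for an image stored in an Azure Container Registry."""
    reference = f"{registry_name}.azurecr.io/{image_name}"
    if tag:
        # A tag such as "v1" is appended after a colon.
        reference = f"{reference}:{tag}"
    return reference


print(image_reference("myregistry", "myimage"))
print(image_reference("myregistry", "myimage", tag="v1"))
```

Sharing the full reference, including any tag, avoids ambiguity about which image a deployment should use.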

Image requirements

Azure Machine Learning only supports Docker images that provide the following software:

  • Ubuntu 16.04 or greater.
  • Conda 4.5.# or greater.
  • Python 3.5+.

To use Datasets, install the libfuse-dev package. Also make sure to install any user space packages you may need.
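One way to check a candidate image against these minimum versions is sketched below. It assumes you have already collected the version strings yourself (for example, by running commands inside the image). The names MINIMUMS, parse_version, and unmet_requirements are hypothetical and not part of any Azure tool.

```python
# Minimum versions taken from the list above.
MINIMUMS = {"ubuntu": "16.04", "conda": "4.5", "python": "3.5"}


def parse_version(text):
    """Turn a dotted version such as '4.7.12' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def unmet_requirements(found):
    """Return the components that are missing or older than the minimum."""
    unmet = []
    for name, minimum in MINIMUMS.items():
        version = found.get(name)
        if version is None or parse_version(version) < parse_version(minimum):
            unmet.append(name)
    return unmet


# Versions here match the sample Dockerfile defaults later in this article.
print(unmet_requirements({"ubuntu": "16.04", "conda": "4.7.12", "python": "3.7"}))
```

An empty list means all three minimums are met. The check does not cover libfuse-dev or other user space packages.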

Azure ML maintains a set of CPU and GPU base images published to Microsoft Container Registry that you can optionally use (or reference) instead of creating your own custom image. To see the Dockerfiles for those images, refer to the Azure/AzureML-Containers GitHub repository.

For GPU images, Azure ML currently offers both cuda9 and cuda10 base images. The major dependencies installed in these base images are:

Dependencies | IntelMPI CPU | OpenMPI CPU | IntelMPI GPU | OpenMPI GPU
miniconda | ==4.5.11 | ==4.5.11 | ==4.5.11 | ==4.5.11
mpi | intelmpi==2018.3.222 | openmpi==3.1.2 | intelmpi==2018.3.222 | openmpi==3.1.2
cuda | - | - | 9.0/10.0 | 9.0/10.0/10.1
cudnn | - | - | 7.4/7.5 | 7.4/7.5
nccl | - | - | 2.4 | 2.4
git | 2.7.4 | 2.7.4 | 2.7.4 | 2.7.4

The CPU images are built from ubuntu16.04. The GPU images for cuda9 are built from nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04. The GPU images for cuda10 are built from nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04.

Get container registry information

In this section, learn how to get the name of the Azure Container Registry for your Azure Machine Learning workspace.

Warning

The Azure Container Registry for your workspace is created the first time you train or deploy a model using the workspace. If you've created a new workspace, but not trained or created a model, no Azure Container Registry will exist for the workspace.

If you've already trained or deployed models using Azure Machine Learning, a container registry was created for your workspace. To find the name of this container registry, use the following steps:

  1. Open a new shell or command prompt and use the following command to authenticate to your Azure subscription:

    az login
    

    Follow the prompts to authenticate to the subscription.

    Tip

    After logging in, you see a list of subscriptions associated with your Azure account. The subscription information with isDefault: true is the currently activated subscription for Azure CLI commands. This subscription must be the same one that contains your Azure Machine Learning workspace. You can find the subscription ID from the Azure portal by visiting the overview page for your workspace. You can also use the SDK to get the subscription ID from the workspace object. For example, Workspace.from_config().subscription_id.

    To select another subscription, use the az account set -s <subscription name or ID> command and specify the subscription name or ID to switch to. For more information about subscription selection, see Use multiple Azure Subscriptions.

  2. Use the following command to list the container registry for the workspace. Replace <myworkspace> with your Azure Machine Learning workspace name. Replace <resourcegroup> with the Azure resource group that contains your workspace:

    az ml workspace show -w <myworkspace> -g <resourcegroup> --query containerRegistry
    

    Tip

    If you get an error message stating that the ml extension isn't installed, use the following command to install it:

    az extension add -n azure-cli-ml
    

    The information returned is similar to the following text:

    /subscriptions/<subscription_id>/resourceGroups/<resource_group>/providers/Microsoft.ContainerRegistry/registries/<registry_name>
    

    The <registry_name> value is the name of the Azure Container Registry for your workspace.
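If you script the steps above, the registry name can be pulled out of the returned resource ID with a few lines of Python. This is a sketch only. The function name registry_name_from_resource_id is hypothetical, and the sample ID uses placeholder values in the format shown in step 2.

```python
def registry_name_from_resource_id(resource_id):
    """Extract the registry name from an Azure Container Registry resource ID."""
    # The CLI may print the ID as a quoted JSON string, so strip quotes as well.
    parts = resource_id.strip().strip('"').strip("/").split("/")
    lowered = [part.lower() for part in parts]
    if "registries" not in lowered:
        raise ValueError("not a container registry resource ID")
    index = lowered.index("registries")
    if index + 1 >= len(parts):
        raise ValueError("resource ID has no registry name")
    # The registry name is the segment that follows "registries".
    return parts[index + 1]


sample = (
    "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/myresourcegroup"
    "/providers/Microsoft.ContainerRegistry/registries/myregistry"
)
print(registry_name_from_resource_id(sample))
```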

Build a custom base image

The steps in this section walk through creating a custom Docker image in your Azure Container Registry. For sample Dockerfiles, see the Azure/AzureML-Containers GitHub repo.

  1. Create a new text file named Dockerfile, and use the following text as the contents:

    FROM ubuntu:16.04
    
    ARG CONDA_VERSION=4.7.12
    ARG PYTHON_VERSION=3.7
    ARG AZUREML_SDK_VERSION=1.13.0
    ARG INFERENCE_SCHEMA_VERSION=1.1.0
    
    ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
    ENV PATH /opt/miniconda/bin:$PATH
    ENV DEBIAN_FRONTEND=noninteractive
    
    RUN apt-get update --fix-missing && \
        apt-get install -y wget bzip2 && \
        apt-get install -y fuse && \
        apt-get clean -y && \
        rm -rf /var/lib/apt/lists/*
    
    RUN useradd --create-home dockeruser
    WORKDIR /home/dockeruser
    USER dockeruser
    
    RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-${CONDA_VERSION}-Linux-x86_64.sh -O ~/miniconda.sh && \
        /bin/bash ~/miniconda.sh -b -p ~/miniconda && \
        rm ~/miniconda.sh && \
        ~/miniconda/bin/conda clean -tipsy
    ENV PATH="/home/dockeruser/miniconda/bin/:${PATH}"
    
    RUN conda install -y conda=${CONDA_VERSION} python=${PYTHON_VERSION} && \
        pip install azureml-defaults==${AZUREML_SDK_VERSION} inference-schema==${INFERENCE_SCHEMA_VERSION} &&\
        conda clean -aqy && \
        rm -rf ~/miniconda/pkgs && \
        find ~/miniconda/ -type d -name __pycache__ -prune -exec rm -rf {} \;
    
  2. From a shell or command prompt, use the following command to authenticate to the Azure Container Registry. Replace <registry_name> with the name of the container registry you want to store the image in:

    az acr login --name <registry_name>
    
  3. To upload the Dockerfile and build it, use the following command. Replace <registry_name> with the name of the container registry you want to store the image in:

    az acr build --image myimage:v1 --registry <registry_name> --file Dockerfile .
    

    Tip

    In this example, a tag of :v1 is applied to the image. If no tag is provided, a tag of :latest is applied.

    During the build process, information is streamed back to the command line. If the build is successful, you receive a message similar to the following text:

    Run ID: cda was successful after 2m56s
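When the build in step 3 is driven from a script, the command can be assembled as an argument list before it is executed. The sketch below only builds and prints the command, using the same flags as the step above. The helper name acr_build_command and its default values are illustrative assumptions.

```python
import shlex


def acr_build_command(registry_name, image="myimage:v1", dockerfile="Dockerfile", context="."):
    """Assemble the az acr build command from step 3 as an argument list."""
    return [
        "az", "acr", "build",
        "--image", image,
        "--registry", registry_name,
        "--file", dockerfile,
        context,
    ]


command = acr_build_command("myregistry")
# Print the command for review; shlex.join requires Python 3.8 or later.
print(shlex.join(command))
# To execute it, pass the list to subprocess.run(command, check=True).
```

Passing a list rather than a single string avoids shell quoting problems if an image name or path contains unusual characters.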
    

For more information on building images with an Azure Container Registry, see Build and run a container image using Azure Container Registry Tasks.

For more information on uploading existing images to an Azure Container Registry, see Push your first image to a private Docker container registry.

Use a custom base image

To use a custom image, you need the following information:

  • The image name. For example, mcr.microsoft.com/azureml/o16n-sample-user-base/ubuntu-miniconda:latest is the path to a simple Docker image provided by Microsoft.

    Important

    For custom images that you've created, be sure to include any tags that were used with the image, for example :v1 if the image was created with that tag. If you did not use a specific tag when creating the image, a tag of :latest was applied.

  • If the image is in a private repository, you need the following information:

    • The registry address. For example, myregistry.azurecr.io.
    • A service principal username and password that has read access to the registry.

    If you do not have this information, speak to the administrator for the Azure Container Registry that contains your image.

Publicly available base images

Microsoft provides several Docker images on a publicly accessible repository, which can be used with the steps in this section:

Image | Description
mcr.microsoft.com/azureml/o16n-sample-user-base/ubuntu-miniconda | Core image for Azure Machine Learning
mcr.microsoft.com/azureml/onnxruntime:latest | Contains ONNX Runtime for CPU inferencing
mcr.microsoft.com/azureml/onnxruntime:latest-cuda | Contains the ONNX Runtime and CUDA for GPU
mcr.microsoft.com/azureml/onnxruntime:latest-tensorrt | Contains ONNX Runtime and TensorRT for GPU
mcr.microsoft.com/azureml/onnxruntime:latest-openvino-vadm | Contains ONNX Runtime and OpenVINO for Intel Vision Accelerator Design based on Movidius™ MyriadX VPUs
mcr.microsoft.com/azureml/onnxruntime:latest-openvino-myriad | Contains ONNX Runtime and OpenVINO for Intel Movidius™ USB sticks

For more information about the ONNX Runtime base images, see the ONNX Runtime dockerfile section in the GitHub repo.

Tip

Since these images are publicly available, you do not need to provide an address, username, or password when using them.

For more information, see the Azure Machine Learning containers repository on GitHub.

Use an image with the Azure Machine Learning SDK

To use an image stored in the Azure Container Registry for your workspace, or in a container registry that is publicly accessible, set the following Environment attributes:

  • docker.enabled=True
  • docker.base_image: Set to the registry and path to the image.

from azureml.core.environment import Environment
# Create the environment
myenv = Environment(name="myenv")
# Enable Docker and reference an image
myenv.docker.enabled = True
myenv.docker.base_image = "mcr.microsoft.com/azureml/o16n-sample-user-base/ubuntu-miniconda:latest"

To use an image from a private container registry that is not in your workspace, you must use docker.base_image_registry to specify the address of the repository and a user name and password:

# Set the container registry information
myenv.docker.base_image_registry.address = "myregistry.azurecr.io"
myenv.docker.base_image_registry.username = "username"
myenv.docker.base_image_registry.password = "password"

myenv.inferencing_stack_version = "latest"  # This will install the inference specific apt packages.

# Define the packages needed by the model and scripts
from azureml.core.conda_dependencies import CondaDependencies
conda_dep = CondaDependencies()
# you must list azureml-defaults as a pip dependency
conda_dep.add_pip_package("azureml-defaults")
myenv.python.conda_dependencies=conda_dep

You must add azureml-defaults with version >= 1.0.45 as a pip dependency. This package contains the functionality needed to host the model as a web service. You must also set the inferencing_stack_version property on the environment to "latest", which installs the specific apt packages needed by the web service.
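The registry example above hardcodes the service principal username and password. A common alternative is to read them from environment variables so that secrets stay out of source files. The following is a minimal sketch of that idea. The variable names ACR_ADDRESS, ACR_USERNAME, and ACR_PASSWORD and the helper registry_credentials are hypothetical, not something the SDK defines.

```python
import os

# Hypothetical environment variable names; choose whatever fits your setup.
REQUIRED = {
    "address": "ACR_ADDRESS",
    "username": "ACR_USERNAME",
    "password": "ACR_PASSWORD",
}


def registry_credentials(environ=None):
    """Collect container registry settings from environment variables."""
    environ = os.environ if environ is None else environ
    missing = [var for var in REQUIRED.values() if not environ.get(var)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {field: environ[var] for field, var in REQUIRED.items()}


# Demonstration with an explicit mapping instead of the real environment.
creds = registry_credentials({
    "ACR_ADDRESS": "myregistry.azurecr.io",
    "ACR_USERNAME": "username",
    "ACR_PASSWORD": "password",
})
print(creds["address"])

# With the environment from this section, the values would be assigned as:
# myenv.docker.base_image_registry.address = creds["address"]
# myenv.docker.base_image_registry.username = creds["username"]
# myenv.docker.base_image_registry.password = creds["password"]
```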

After defining the environment, use it with an InferenceConfig object to define the inference environment in which the model and web service will run.

from azureml.core.model import InferenceConfig
# Use environment in InferenceConfig
inference_config = InferenceConfig(entry_script="score.py",
                                   environment=myenv)

At this point, you can continue with deployment. For example, the following code snippet would deploy a web service locally using the inference configuration and custom image:

from azureml.core.model import Model
from azureml.core.webservice import LocalWebservice

# ws (the Workspace) and model (a registered Model) are assumed to be defined earlier
deployment_config = LocalWebservice.deploy_configuration(port=8890)
service = Model.deploy(ws, "myservice", [model], inference_config, deployment_config)
service.wait_for_deployment(show_output=True)
print(service.state)

For more information on deployment, see Deploy models with Azure Machine Learning.

For more information on customizing your Python environment, see Create and manage environments for training and deployment.

Use an image with the Machine Learning CLI

Important

Currently, the Machine Learning CLI can use images from the Azure Container Registry for your workspace or from publicly accessible repositories. It cannot use images from standalone private registries.

Before deploying a model using the Machine Learning CLI, create an environment that uses the custom image. Then create an inference configuration file that references the environment. You can also define the environment directly in the inference configuration file. The following JSON document demonstrates how to reference an image in a public container registry. In this example, the environment is defined inline:

{
    "entryScript": "score.py",
    "environment": {
        "docker": {
            "arguments": [],
            "baseDockerfile": null,
            "baseImage": "mcr.microsoft.com/azureml/o16n-sample-user-base/ubuntu-miniconda:latest",
            "enabled": false,
            "sharedVolumes": true,
            "shmSize": null
        },
        "environmentVariables": {
            "EXAMPLE_ENV_VAR": "EXAMPLE_VALUE"
        },
        "name": "my-deploy-env",
        "python": {
            "baseCondaEnvironment": null,
            "condaDependencies": {
                "channels": [
                    "conda-forge"
                ],
                "dependencies": [
                    "python=3.6.2",
                    {
                        "pip": [
                            "azureml-defaults",
                            "azureml-telemetry",
                            "scikit-learn",
                            "inference-schema[numpy-support]"
                        ]
                    }
                ],
                "name": "project_environment"
            },
            "condaDependenciesFile": null,
            "interpreterPath": "python",
            "userManagedDependencies": false
        },
        "version": "1"
    }
}

This file is used with the az ml model deploy command. The --ic parameter is used to specify the inference configuration file.

az ml model deploy -n myservice -m mymodel:1 --ic inferenceconfig.json --dc deploymentconfig.json --ct akscomputetarget
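If several services share the same layout, the inference configuration file can be generated rather than edited by hand. The sketch below rebuilds the JSON document shown earlier in this section with Python's json module. The function name build_inference_config and its parameters are hypothetical, and all other values simply mirror that document.

```python
import json


def build_inference_config(entry_script, base_image, pip_packages, env_name="my-deploy-env"):
    """Return a dict with the same structure as the inline JSON example."""
    return {
        "entryScript": entry_script,
        "environment": {
            "docker": {
                "arguments": [],
                "baseDockerfile": None,
                "baseImage": base_image,
                "enabled": False,  # mirrors the JSON document above
                "sharedVolumes": True,
                "shmSize": None,
            },
            "environmentVariables": {"EXAMPLE_ENV_VAR": "EXAMPLE_VALUE"},
            "name": env_name,
            "python": {
                "baseCondaEnvironment": None,
                "condaDependencies": {
                    "channels": ["conda-forge"],
                    "dependencies": ["python=3.6.2", {"pip": list(pip_packages)}],
                    "name": "project_environment",
                },
                "condaDependenciesFile": None,
                "interpreterPath": "python",
                "userManagedDependencies": False,
            },
            "version": "1",
        },
    }


config = build_inference_config(
    "score.py",
    "mcr.microsoft.com/azureml/o16n-sample-user-base/ubuntu-miniconda:latest",
    ["azureml-defaults", "azureml-telemetry", "scikit-learn", "inference-schema[numpy-support]"],
)
# Save this output as inferenceconfig.json to use it with the --ic parameter.
print(json.dumps(config, indent=4))
```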

For more information on deploying a model using the ML CLI, see the "model registration, profiling, and deployment" section of the CLI extension for Azure Machine Learning article.

Next steps