Deploy a model to an Azure Kubernetes Service cluster

APPLIES TO: Basic edition, Enterprise edition

Learn how to use Azure Machine Learning to deploy a model as a web service on Azure Kubernetes Service (AKS). Azure Kubernetes Service is good for high-scale production deployments. Use Azure Kubernetes Service if you need one or more of the following capabilities:

  • Fast response time.
  • Autoscaling of the deployed service.
  • Hardware acceleration options such as GPUs and field-programmable gate arrays (FPGAs).

Important

Cluster scaling is not provided through the Azure Machine Learning SDK. For more information on scaling the nodes in an AKS cluster, see Scale the node count in an AKS cluster.

When deploying to Azure Kubernetes Service, you deploy to an AKS cluster that is connected to your workspace. There are two ways to connect an AKS cluster to your workspace:

  • Create the AKS cluster using the Azure Machine Learning SDK, the Machine Learning CLI, or Azure Machine Learning studio. This process automatically connects the cluster to the workspace.
  • Attach an existing AKS cluster to your Azure Machine Learning workspace. A cluster can be attached using the Azure Machine Learning SDK, the Machine Learning CLI, or Azure Machine Learning studio.

Important

The creation or attachment process is a one-time task. Once an AKS cluster is connected to the workspace, you can use it for deployments. You can detach or delete the AKS cluster if you no longer need it. Once it is detached or deleted, you can no longer deploy to the cluster.

Prerequisites

Create a new AKS cluster

Time estimate: Approximately 20 minutes.

Creating or attaching an AKS cluster is a one-time process for your workspace. You can reuse this cluster for multiple deployments. If you delete the cluster or the resource group that contains it, you must create a new cluster the next time you need to deploy. You can have multiple AKS clusters attached to your workspace.

Tip

If you want to secure your AKS cluster using an Azure Virtual Network, you must create the virtual network first. For more information, see Secure experimentation and inference with Azure Virtual Network.

If you want to create an AKS cluster for development, validation, and testing instead of production, you can set the cluster purpose to dev test.

Warning

If you set cluster_purpose = AksCompute.ClusterPurpose.DEV_TEST, the cluster that is created is not suitable for production-level traffic and may increase inference times. Dev/test clusters also do not guarantee fault tolerance. We recommend at least 2 virtual CPUs for dev/test clusters.

The following examples demonstrate how to create a new AKS cluster using the SDK and CLI:

Using the SDK

from azureml.core.compute import AksCompute, ComputeTarget

# Use the default configuration (you can also provide parameters to customize this).
# For example, to create a dev/test cluster, use:
# prov_config = AksCompute.provisioning_configuration(cluster_purpose = AksCompute.ClusterPurpose.DEV_TEST)
prov_config = AksCompute.provisioning_configuration()

aks_name = 'myaks'
# Create the cluster
aks_target = ComputeTarget.create(workspace = ws,
                                    name = aks_name,
                                    provisioning_configuration = prov_config)

# Wait for the create process to complete
aks_target.wait_for_completion(show_output = True)

Important

For provisioning_configuration(), if you pick custom values for agent_count and vm_size, and cluster_purpose is not DEV_TEST, then you need to make sure that agent_count multiplied by the number of virtual CPUs provided by the chosen vm_size is greater than or equal to 12. For example, if you use a vm_size of "Standard_D3_v2", which has 4 virtual CPUs, then you should pick an agent_count of 3 or greater.
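The 12 virtual CPU rule above can be checked before provisioning. The following is a minimal sketch in plain Python, not part of the SDK: the helper names and the lookup table are hypothetical, and only the Standard_D3_v2 entry (4 virtual CPUs) comes from the text above.

```python
# Hypothetical lookup of virtual CPUs per VM size.
# Only the Standard_D3_v2 entry (4 vCPUs) is taken from the text above.
VCPUS_PER_VM_SIZE = {"Standard_D3_v2": 4}

# Minimum total vCPUs when cluster_purpose is not DEV_TEST.
MIN_PROD_VCPUS = 12


def min_agent_count(vm_size, min_vcpus=MIN_PROD_VCPUS):
    """Return the smallest agent_count whose total vCPUs reach min_vcpus."""
    vcpus = VCPUS_PER_VM_SIZE[vm_size]
    # Ceiling division using integer arithmetic only.
    return -(-min_vcpus // vcpus)


def meets_minimum(agent_count, vm_size, min_vcpus=MIN_PROD_VCPUS):
    """Return True if agent_count nodes of vm_size provide enough vCPUs."""
    return agent_count * VCPUS_PER_VM_SIZE[vm_size] >= min_vcpus


print(min_agent_count("Standard_D3_v2"))
```

A check like this could run before calling AksCompute.provisioning_configuration() with custom values, so that an undersized cluster is caught before the roughly 20 minute provisioning step.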

The Azure Machine Learning SDK does not support scaling an AKS cluster. To scale the nodes in the cluster, use the UI for your AKS cluster in Azure Machine Learning studio. You can only change the node count, not the VM size of the cluster.

For more information on the classes, methods, and parameters used in this example, see the following reference documents:

Using the CLI

az ml computetarget create aks -n myaks

For more information, see the az ml computetarget create aks reference.

Attach an existing AKS cluster

Time estimate: Approximately 5 minutes.

If you already have an AKS cluster in your Azure subscription, and it is version 1.17 or lower, you can use it to deploy your image.

Tip

The existing AKS cluster can be in a different Azure region than your Azure Machine Learning workspace.

If you want to secure your AKS cluster using an Azure Virtual Network, you must create the virtual network first. For more information, see Secure experimentation and inference with Azure Virtual Network.

When attaching an AKS cluster to a workspace, you can define how you will use the cluster by setting the cluster_purpose parameter.

If you do not set the cluster_purpose parameter, or set cluster_purpose = AksCompute.ClusterPurpose.FAST_PROD, then the cluster must have at least 12 virtual CPUs available.

If you set cluster_purpose = AksCompute.ClusterPurpose.DEV_TEST, then the cluster does not need to have 12 virtual CPUs. We recommend at least 2 virtual CPUs for dev/test. However, a cluster that is configured for dev/test is not suitable for production-level traffic and may increase inference times. Dev/test clusters also do not guarantee fault tolerance.

Warning

Do not create multiple, simultaneous attachments to the same AKS cluster from your workspace, for example by attaching one AKS cluster to a workspace under two different names. Each new attachment breaks the previously existing attachments.

If you want to re-attach an AKS cluster, for example to change SSL or another cluster configuration setting, you must first remove the existing attachment by using AksCompute.detach().

For more information on creating an AKS cluster using the Azure CLI or portal, see the following articles:

The following examples demonstrate how to attach an existing AKS cluster to your workspace:

Using the SDK

from azureml.core.compute import AksCompute, ComputeTarget
# Set the resource group that contains the AKS cluster and the cluster name
resource_group = 'myresourcegroup'
cluster_name = 'myexistingcluster'

# Attach the cluster to your workspace. If the cluster has fewer than 12 virtual CPUs, use the following instead:
# attach_config = AksCompute.attach_configuration(resource_group = resource_group,
#                                         cluster_name = cluster_name,
#                                         cluster_purpose = AksCompute.ClusterPurpose.DEV_TEST)
attach_config = AksCompute.attach_configuration(resource_group = resource_group,
                                         cluster_name = cluster_name)
aks_target = ComputeTarget.attach(ws, 'myaks', attach_config)

For more information on the classes, methods, and parameters used in this example, see the following reference documents:

Using the CLI

To attach an existing cluster using the CLI, you need the resource ID of the existing cluster. To get this value, use the following command. Replace myexistingcluster with the name of your AKS cluster. Replace myresourcegroup with the resource group that contains the cluster:

az aks show -n myexistingcluster -g myresourcegroup --query id

This command returns a value similar to the following text:

/subscriptions/{GUID}/resourcegroups/{myresourcegroup}/providers/Microsoft.ContainerService/managedClusters/{myexistingcluster}
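If the resource ID is handled in a script, it can help to split it into its parts and confirm that it points at the expected cluster before attaching. The following is a minimal sketch; parse_aks_resource_id is a hypothetical helper, not part of the Azure CLI or SDK, and it assumes the ID follows the layout shown above.

```python
def parse_aks_resource_id(resource_id):
    """Split an AKS resource ID into subscription, resource group, and cluster name.

    Assumes the /subscriptions/.../resourcegroups/.../managedClusters/... layout
    shown above. Surrounding whitespace and JSON quotes are tolerated.
    """
    parts = resource_id.strip().strip('"').strip("/").split("/")
    # Segments alternate between a key and its value; keys are lower-cased
    # so that "resourcegroups" and "resourceGroups" are treated alike.
    pairs = dict(zip((key.lower() for key in parts[0::2]), parts[1::2]))
    return {
        "subscription": pairs["subscriptions"],
        "resource_group": pairs["resourcegroups"],
        "cluster_name": pairs["managedclusters"],
    }


# Placeholder subscription ID used purely for illustration.
example_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourcegroups/myresourcegroup"
    "/providers/Microsoft.ContainerService/managedClusters/myexistingcluster"
)
print(parse_aks_resource_id(example_id))
```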

To attach the existing cluster to your workspace, use the following command. Replace aksresourceid with the value returned by the previous command. Replace myresourcegroup with the resource group that contains your workspace. Replace myworkspace with your workspace name.

az ml computetarget attach aks -n myaks -i aksresourceid -g myresourcegroup -w myworkspace

For more information, see the az ml computetarget attach aks reference.

Deploy to AKS

To deploy a model to Azure Kubernetes Service, create a deployment configuration that describes the compute resources needed, for example, the number of cores and the amount of memory. You also need an inference configuration, which describes the environment needed to host the model and web service. For more information on creating the inference configuration, see How and where to deploy models.

Using the SDK

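from azureml.core.compute import AksCompute
from azureml.core.webservice import AksWebservice, Webservice
from azureml.core.model import Model

# Retrieve the AKS compute target that is connected to the workspace.
# ws, model, and inference_config are assumed to be defined earlier.
aks_target = AksCompute(ws, "myaks")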
# If deploying to a cluster configured for dev/test, ensure that it was created with enough
# cores and memory to handle this deployment configuration. Note that memory is also used by
# things such as dependencies and AML components.
deployment_config = AksWebservice.deploy_configuration(cpu_cores = 1, memory_gb = 1)
service = Model.deploy(ws, "myservice", [model], inference_config, deployment_config, aks_target)
service.wait_for_deployment(show_output = True)
print(service.state)
print(service.get_logs())

For more information on the classes, methods, and parameters used in this example, see the following reference documents:

Using the CLI

To deploy using the CLI, use the following command. Replace myaks with the name of the AKS compute target. Replace mymodel:1 with the name and version of the registered model. Replace myservice with the name to give this service:

az ml model deploy -ct myaks -m mymodel:1 -n myservice -ic inferenceconfig.json -dc deploymentconfig.json

The entries in the deploymentconfig.json document map to the parameters for AksWebservice.deploy_configuration. The following table describes the mapping between the entities in the JSON document and the parameters for the method. Indented entities are nested inside the entity listed above them:

JSON entity | Method parameter | Description
--- | --- | ---
computeType | NA | The compute target. For AKS, the value must be aks.
autoScaler | NA | Contains configuration elements for autoscale. See the autoscaler table.
  autoscaleEnabled | autoscale_enabled | Whether to enable autoscaling for the web service. If numReplicas = 0, True; otherwise, False.
  minReplicas | autoscale_min_replicas | The minimum number of containers to use when autoscaling this web service. Default, 1.
  maxReplicas | autoscale_max_replicas | The maximum number of containers to use when autoscaling this web service. Default, 10.
  refreshPeriodInSeconds | autoscale_refresh_seconds | How often the autoscaler attempts to scale this web service. Default, 1.
  targetUtilization | autoscale_target_utilization | The target utilization (in percent out of 100) that the autoscaler should attempt to maintain for this web service. Default, 70.
dataCollection | NA | Contains configuration elements for data collection.
  storageEnabled | collect_model_data | Whether to enable model data collection for the web service. Default, False.
authEnabled | auth_enabled | Whether to enable key authentication for the web service. tokenAuthEnabled and authEnabled cannot both be True. Default, True.
tokenAuthEnabled | token_auth_enabled | Whether to enable token authentication for the web service. tokenAuthEnabled and authEnabled cannot both be True. Default, False.
containerResourceRequirements | NA | Container for the CPU and memory entities.
  cpu | cpu_cores | The number of CPU cores to allocate for this web service. Default, 0.1.
  memoryInGB | memory_gb | The amount of memory (in GB) to allocate for this web service. Default, 0.5.
appInsightsEnabled | enable_app_insights | Whether to enable Application Insights logging for the web service. Default, False.
scoringTimeoutMs | scoring_timeout_ms | A timeout (in milliseconds) to enforce for scoring calls to the web service. Default, 60000.
maxConcurrentRequestsPerContainer | replica_max_concurrent_requests | The maximum number of concurrent requests per container for this web service. Default, 1.
maxQueueWaitMs | max_request_wait_time | The maximum time a request will stay in the queue (in milliseconds) before a 503 error is returned. Default, 500.
numReplicas | num_replicas | The number of containers to allocate for this web service. No default value. If this parameter is not set, the autoscaler is enabled by default.
keys | NA | Contains configuration elements for keys.
  primaryKey | primary_key | A primary auth key to use for this web service.
  secondaryKey | secondary_key | A secondary auth key to use for this web service.
gpuCores | gpu_cores | The number of GPU cores to allocate for this web service. Default, 1. Only supports whole number values.
livenessProbeRequirements | NA | Contains configuration elements for liveness probe requirements.
  periodSeconds | period_seconds | How often (in seconds) to perform the liveness probe. Default, 10. Minimum value is 1.
  initialDelaySeconds | initial_delay_seconds | Number of seconds after the container has started before liveness probes are initiated. Default, 310.
  timeoutSeconds | timeout_seconds | Number of seconds after which the liveness probe times out. Default, 2. Minimum value is 1.
  successThreshold | success_threshold | Minimum consecutive successes for the liveness probe to be considered successful after having failed. Default, 1. Minimum value is 1.
  failureThreshold | failure_threshold | When a Pod starts and the liveness probe fails, Kubernetes will try failureThreshold times before giving up. Default, 3. Minimum value is 1.
namespace | namespace | The Kubernetes namespace that the web service is deployed into. Up to 63 lowercase alphanumeric ('a'-'z', '0'-'9') and hyphen ('-') characters. The first and last characters can't be hyphens.
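The namespace rule in the last row of the table can be checked locally before a deployment is submitted. The following is a minimal sketch using Python's standard re module; is_valid_namespace is a hypothetical helper and only encodes the constraints stated in the table.

```python
import re

# One leading alphanumeric character, then optionally up to 61 characters that
# may include hyphens followed by one trailing alphanumeric character.
# This allows 1 to 63 characters and forbids a leading or trailing hyphen.
_NAMESPACE_RE = re.compile(r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?")


def is_valid_namespace(name):
    """Return True if name satisfies the namespace constraints from the table."""
    return _NAMESPACE_RE.fullmatch(name) is not None


for candidate in ["endpointnamespace", "my-namespace", "-leading", "Upper"]:
    print(candidate, is_valid_namespace(candidate))
```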

The following JSON is an example deployment configuration for use with the CLI:

{
    "computeType": "aks",
    "autoScaler":
    {
        "autoscaleEnabled": true,
        "minReplicas": 1,
        "maxReplicas": 3,
        "refreshPeriodInSeconds": 1,
        "targetUtilization": 70
    },
    "dataCollection":
    {
        "storageEnabled": true
    },
    "authEnabled": true,
    "containerResourceRequirements":
    {
        "cpu": 0.5,
        "memoryInGB": 1.0
    }
}
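A configuration like the one above can also be produced from a script, which makes it easier to enforce the constraints listed in the table. The following is a minimal sketch using only Python's standard json module; build_deployment_config is a hypothetical helper, and the entity names are the ones from the table and example above.

```python
import json


def build_deployment_config(cpu=0.5, memory_gb=1.0, min_replicas=1, max_replicas=3,
                            auth_enabled=True, token_auth_enabled=False):
    """Build a dict mirroring the deploymentconfig.json example above."""
    # The table states that key and token authentication cannot both be enabled.
    if auth_enabled and token_auth_enabled:
        raise ValueError("authEnabled and tokenAuthEnabled cannot both be True")
    if min_replicas > max_replicas:
        raise ValueError("minReplicas cannot exceed maxReplicas")
    return {
        "computeType": "aks",
        "autoScaler": {
            "autoscaleEnabled": True,
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "refreshPeriodInSeconds": 1,
            "targetUtilization": 70,
        },
        "dataCollection": {"storageEnabled": True},
        "authEnabled": auth_enabled,
        "tokenAuthEnabled": token_auth_enabled,
        "containerResourceRequirements": {"cpu": cpu, "memoryInGB": memory_gb},
    }


config = build_deployment_config()
config_text = json.dumps(config, indent=4)
print(config_text)
```

Writing config_text to a file named deploymentconfig.json would give a document that can be passed to the -dc parameter of the deploy command shown earlier.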

For more information, see the az ml model deploy reference.

Using VS Code

For information on using VS Code, see deploy to AKS via the VS Code extension.

Important

Deploying through VS Code requires the AKS cluster to be created or attached to your workspace in advance.

Deploy models to AKS using controlled rollout (preview)

Analyze and promote model versions in a controlled fashion using endpoints. You can deploy up to six versions behind a single endpoint. Endpoints provide the following capabilities:

  • Configure the percentage of scoring traffic sent to each endpoint version. For example, route 20% of the traffic to endpoint version 'test' and 80% to 'production'.

    Note

    If you do not account for 100% of the traffic, any remaining percentage is routed to the default endpoint version. For example, if you configure endpoint version 'test' to get 10% of the traffic, and 'prod' for 30%, the remaining 60% is sent to the default endpoint version.

    The first endpoint version created is automatically configured as the default. You can change this by setting is_default=True when creating or updating an endpoint version.

  • Tag an endpoint version as either control or treatment. For example, the current production endpoint version might be the control, while potential new models are deployed as treatment versions. After evaluating the performance of the treatment versions, if one outperforms the current control, it might be promoted to the new production/control.

    Note

    You can only have one control. You can have multiple treatments.

You can enable Application Insights to view operational metrics of endpoints and deployed versions.
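The routing rule for unassigned traffic can be made concrete with a small calculation. The following is a minimal sketch in plain Python, not an SDK call: effective_traffic is a hypothetical helper that takes the configured percentage per version and the name of the default version, and returns the share each version effectively receives.

```python
def effective_traffic(configured, default_version):
    """Return the effective traffic share per version.

    configured maps version names to their configured percentages.
    Any percentage not assigned to a version is added to default_version.
    """
    total = sum(configured.values())
    if total > 100:
        raise ValueError("configured traffic exceeds 100%")
    result = dict(configured)
    result[default_version] = result.get(default_version, 0) + (100 - total)
    return result


# 'test' gets 10%, 'prod' gets 30%, and the remaining 60% goes to the default version.
print(effective_traffic({"test": 10, "prod": 30, "default": 0}, "default"))
# A single version configured for 20% is also the default, so it receives everything.
print(effective_traffic({"versiona": 20}, "versiona"))
```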

Create an endpoint

Once you are ready to deploy your models, create a scoring endpoint and deploy your first version. The following example shows how to deploy and create the endpoint using the SDK. The first deployment is defined as the default version, which means that any traffic percentage not assigned to a specific version goes to the default version.

Tip

In the following example, the configuration sets the initial endpoint version to handle 20% of the traffic. Since this is the first endpoint version, it's also the default version. And since there are no other versions to take the other 80% of traffic, that traffic is routed to the default as well. Until other versions that take a percentage of traffic are deployed, this one effectively receives 100% of the traffic.

import azureml.core
from azureml.core.webservice import AksEndpoint
from azureml.core.compute import AksCompute
from azureml.core.compute import ComputeTarget
from azureml.core.model import Model

# select a created compute
compute = ComputeTarget(ws, 'myaks')
namespace_name = "endpointnamespace"
# define the endpoint and version name
endpoint_name = "mynewendpoint"
version_name = "versiona"
# create the deployment config and define the scoring traffic percentage for the first deployment
endpoint_deployment_config = AksEndpoint.deploy_configuration(cpu_cores = 0.1, memory_gb = 0.2,
                                                              enable_app_insights = True,
                                                              tags = {'scikitlearn':'demo'},
                                                              description = "testing versions",
                                                              version_name = version_name,
                                                              traffic_percentile = 20)
# deploy the model and endpoint
endpoint = Model.deploy(ws, endpoint_name, [model], inference_config, endpoint_deployment_config, compute)
# Wait for the process to complete
endpoint.wait_for_deployment(True)

Update and add versions to an endpoint

Add another version to your endpoint and configure the percentage of scoring traffic going to the version. There are two types of versions: control and treatment. There can be multiple treatment versions to help compare against a single control version.

Tip

The second version, created by the following code snippet, accepts 10% of traffic. The first version is configured for 20%, so only 30% of the traffic is configured for specific versions. The remaining 70% is sent to the first endpoint version, because it is also the default version.

from azureml.core.webservice import AksEndpoint

# add another model deployment to the same endpoint as above
version_name_add = "versionb"
endpoint.create_version(version_name = version_name_add,
                       inference_config=inference_config,
                       models=[model],
                       tags = {'modelVersion':'b'},
                       description = "my second version",
                       traffic_percentile = 10)
endpoint.wait_for_deployment(True)

Update existing versions or delete them in an endpoint. You can change the version's default type, control type, and traffic percentage. In the following example, the second version increases its traffic to 40% and becomes the default.

Tip

After the following code snippet runs, the second version is the default. It is now configured for 40%, while the original version is still configured for 20%. This means that 40% of traffic is not accounted for by version configurations. The leftover traffic is routed to the second version, because it is now the default. It effectively receives 80% of the traffic.

from azureml.core.webservice import AksEndpoint

# update the version's scoring traffic percentage and if it is a default or control type
endpoint.update_version(version_name=endpoint.versions["versionb"].name,
                       description="my second version update",
                       traffic_percentile=40,
                       is_default=True,
                       is_control_version_type=True)
# Wait for the process to complete before deleting
endpoint.wait_for_deployment(True)
# delete a version in an endpoint
endpoint.delete_version(version_name="versionb")

Web service authentication

When deploying to Azure Kubernetes Service, key-based authentication is enabled by default. You can also enable token-based authentication. Token-based authentication requires clients to use an Azure Active Directory account to request an authentication token, which is used to make requests to the deployed service.

To disable authentication, set the auth_enabled=False parameter when creating the deployment configuration. The following example disables authentication using the SDK:

deployment_config = AksWebservice.deploy_configuration(cpu_cores=1, memory_gb=1, auth_enabled=False)

For information on authenticating from a client application, see Consume an Azure Machine Learning model deployed as a web service.

Authentication with keys

If key authentication is enabled, you can use the get_keys method to retrieve a primary and secondary authentication key:

primary, secondary = service.get_keys()
print(primary)

Important

If you need to regenerate a key, use service.regen_key.
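Once a key has been retrieved, a client typically sends it with each scoring request. The following is a minimal sketch of preparing such a request, under two assumptions that are not stated in this article: that the service expects the key in an Authorization header using the Bearer scheme, and that the entry script accepts a JSON body with a "data" field. build_scoring_request is a hypothetical helper; check the client article referenced above for the exact format your service expects.

```python
import json


def build_scoring_request(auth_value, data):
    """Return (headers, body) for a scoring call.

    Assumption: the key or token is sent as a Bearer value in the
    Authorization header, and the entry script reads a "data" field.
    """
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + auth_value,
    }
    body = json.dumps({"data": data})
    return headers, body


# Placeholder key and input used purely for illustration.
headers, body = build_scoring_request("my-primary-key", [[1, 2, 3, 4]])
print(headers["Authorization"])
print(body)
```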

Authentication with tokens

To enable token authentication, set the token_auth_enabled=True parameter when you are creating or updating a deployment. The following example enables token authentication using the SDK:

deployment_config = AksWebservice.deploy_configuration(cpu_cores=1, memory_gb=1, token_auth_enabled=True)

If token authentication is enabled, you can use the get_token method to retrieve a JWT token and that token's expiration time:

token, refresh_by = service.get_token()
print(token)

Important

You will need to request a new token after the token's refresh_by time.

Microsoft strongly recommends that you create your Azure Machine Learning workspace in the same region as your Azure Kubernetes Service cluster. To authenticate with a token, the web service makes a call to the region in which your Azure Machine Learning workspace is created. If your workspace's region is unavailable, you will not be able to fetch a token for your web service, even if your cluster is in a different region than your workspace. This effectively results in token-based authentication being unavailable until your workspace's region is available again. In addition, the greater the distance between your cluster's region and your workspace's region, the longer it takes to fetch a token.
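Because a token must be replaced after its refresh_by time, a client can cache the token and fetch a new one only when needed. The following is a minimal sketch: TokenCache is a hypothetical class, not part of the SDK. It assumes the callable it is given returns a (token, refresh_by) pair like the get_token example above, and that refresh_by is a datetime comparable with the clock passed in as now.

```python
from datetime import datetime, timedelta, timezone


class TokenCache:
    """Cache a token and fetch a new one once its refresh_by time has passed."""

    def __init__(self, fetch_token, now=None):
        # fetch_token: callable returning (token, refresh_by), for example service.get_token
        # now: callable returning the current time; assumed comparable with refresh_by
        self._fetch_token = fetch_token
        self._now = now or (lambda: datetime.now(timezone.utc))
        self._token = None
        self._refresh_by = None

    def get(self):
        """Return a cached token, refreshing it first if it is missing or stale."""
        if self._token is None or self._now() >= self._refresh_by:
            self._token, self._refresh_by = self._fetch_token()
        return self._token


# Stand-in for service.get_token so the sketch runs without a deployed service.
def fake_get_token():
    return "example-token", datetime.now(timezone.utc) + timedelta(hours=1)


cache = TokenCache(fake_get_token)
print(cache.get())
```

With a deployed service, the cache would be created as TokenCache(service.get_token). If refresh_by turns out to be a naive datetime, pass a matching now callable such as datetime.now or datetime.utcnow.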

Update the web service

To update a web service, use the update method. You can update the web service to use a new model, a new entry script, or new dependencies that can be specified in an inference configuration. For more information, see the documentation for Webservice.update.

Important

When you create a new version of a model, you must manually update each service that you want to use it.

You cannot use the SDK to update a web service published from the Azure Machine Learning designer.

Using the SDK

The following code shows how to use the SDK to update the model, environment, and entry script for a web service:

from azureml.core import Environment
from azureml.core.webservice import Webservice
from azureml.core.model import Model, InferenceConfig

# Register new model.
new_model = Model.register(model_path="outputs/sklearn_mnist_model.pkl",
                           model_name="sklearn_mnist",
                           tags={"key": "0.1"},
                           description="test",
                           workspace=ws)

# Use version 3 of the environment.
deploy_env = Environment.get(workspace=ws,name="myenv",version="3")
inference_config = InferenceConfig(entry_script="score.py",
                                   environment=deploy_env)

service_name = 'myservice'
# Retrieve existing service.
service = Webservice(name=service_name, workspace=ws)



# Update to new model(s).
service.update(models=[new_model], inference_config=inference_config)
print(service.state)
print(service.get_logs())

Using the CLI

You can also update a web service by using the ML CLI. The following example demonstrates registering a new model and then updating a web service to use the new model:

az ml model register -n sklearn_mnist  --asset-path outputs/sklearn_mnist_model.pkl  --experiment-name myexperiment --output-metadata-file modelinfo.json
az ml service update -n myservice --model-metadata-file modelinfo.json

Tip

In this example, a JSON document is used to pass the model information from the registration command into the update command.

To update the service to use a new entry script or environment, create an inference configuration file and specify it with the ic parameter.

For more information, see the az ml service update documentation.

Next steps