Application Lifecycle Management in Azure Machine Learning Studio (classic)

APPLIES TO: Machine Learning Studio (classic). Does not apply to Azure Machine Learning.

Azure Machine Learning Studio (classic) is a tool for developing machine learning experiments that are operationalized in the Azure cloud platform. It combines the roles of an IDE such as Visual Studio and a scalable cloud service in a single platform. You can incorporate standard Application Lifecycle Management (ALM) practices, from versioning various assets to automated execution and deployment, into Azure Machine Learning Studio (classic). This article discusses some of the options and approaches.

Note

This article has been updated to use the new Azure PowerShell Az module. You can still use the AzureRM module, which will continue to receive bug fixes until at least December 2020. To learn more about the new Az module and AzureRM compatibility, see Introducing the new Azure PowerShell Az module. For Az module installation instructions, see Install Azure PowerShell.

Versioning experiments

There are two recommended ways to version your experiments. You can either rely on the built-in run history, or export the experiment in JSON format and manage it externally. Each approach has its pros and cons.

Experiment snapshots using Run History

In the execution model of Azure Machine Learning Studio (classic), an immutable snapshot of the experiment is submitted to the job scheduler whenever you click Run in the experiment editor. To view the list of snapshots, click Run History on the command bar in the experiment editor view.

Figure: Run History button

You can then open a snapshot in Locked mode by clicking the name of the experiment in the entry for the time the experiment was submitted to run and the snapshot was taken. Notice that only the first item in the list, which represents the current experiment, is in an Editable state. Also notice that each snapshot can have a different Status, including Finished (Partial run), Failed, Failed (Partial run), or Draft.

Figure: Run History list

After a snapshot is opened, you can save it as a new experiment and then modify it. If the snapshot contains assets such as trained models, transforms, or datasets that have since been updated, the snapshot retains its references to the versions that existed when the snapshot was taken. If you save the locked snapshot as a new experiment, Azure Machine Learning Studio (classic) detects the newer versions of these assets and automatically updates them in the new experiment.

If you delete the experiment, all snapshots of that experiment are deleted.

Export/import experiment in JSON format

The run history snapshots keep an immutable version of the experiment in Azure Machine Learning Studio (classic) every time it is submitted to run. You can also save a local copy of the experiment, check it in to your preferred source control system (such as Team Foundation Server), and later re-create the experiment from that local file. The Azure Machine Learning PowerShell cmdlets Export-AmlExperimentGraph and Import-AmlExperimentGraph perform the export and import.

The JSON file is a textual representation of the experiment graph, which might include references to assets in the workspace such as a dataset or a trained model. It doesn't contain a serialized version of the asset. If you import the JSON document back into a workspace, the referenced assets must already exist there with the same asset IDs that the experiment references. Otherwise you cannot access the imported experiment.
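
When exported experiment graphs are kept in source control, it can help to commit a new file only when the graph actually changed. The sketch below is one possible way to check that, and it is not part of the Studio (classic) tooling. It treats the exported file as opaque JSON, makes no assumption about its schema, and compares canonical fingerprints so that key order and whitespace differences are ignored. The function names `graph_fingerprint` and `has_changed` are hypothetical.

```python
import hashlib
import json


def graph_fingerprint(json_text: str) -> str:
    """Return a SHA-256 hex digest of the JSON document in canonical form.

    Canonical form here means sorted keys and compact separators, so two
    exports that differ only in key order or whitespace hash identically.
    """
    document = json.loads(json_text)
    canonical = json.dumps(document, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_changed(previous_json: str, current_json: str) -> bool:
    """Report whether two exported graphs differ in content."""
    return graph_fingerprint(previous_json) != graph_fingerprint(current_json)
```

A build script could call `has_changed` on the last committed export and the fresh export, and skip the commit when it returns `False`. List order is still significant in this comparison, because JSON arrays are ordered.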

Versioning trained models

A trained model in Azure Machine Learning Studio (classic) is serialized into a format known as an iLearner file (.iLearner) and is stored in the Azure Blob storage account associated with the workspace. One way to get a copy of the iLearner file is through the retraining API. A separate article explains how the retraining API works. The high-level steps are:

  1. Set up your training experiment.
  2. Add a web service output port to the Train Model module, or to whichever module produces the trained model, such as Tune Model Hyperparameters or Create R Model.
  3. Run your training experiment and then deploy it as a model training web service.
  4. Call the BES endpoint of the training web service, and specify the desired iLearner file name and the Blob storage account location where it will be stored.
  5. Retrieve the produced iLearner file after the BES call finishes.
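
Step 4 can be sketched as building the request body that tells the batch execution service (BES) where to write the iLearner file. This is a minimal sketch that only constructs the payload and does not send it. The field names `GlobalParameters`, `Outputs`, `ConnectionString`, and `RelativeLocation` are assumptions modeled on typical BES sample code, and the output name depends on how the web service output port was named in your experiment, so check the API help page of your own training web service before relying on them. The function name `build_bes_request` is hypothetical.

```python
import json


def build_bes_request(output_name, connection_string, relative_location,
                      global_parameters=None):
    """Build a BES job payload that directs one web service output to Blob storage.

    output_name       -- name of the web service output port (assumption: set
                         by you in the training experiment)
    connection_string -- storage account connection string
    relative_location -- "<container>/<file name>.ilearner" inside that account
    """
    return {
        "GlobalParameters": dict(global_parameters or {}),
        "Outputs": {
            output_name: {
                "ConnectionString": connection_string,
                "RelativeLocation": relative_location,
            }
        },
    }


if __name__ == "__main__":
    payload = build_bes_request(
        output_name="output1",  # hypothetical port name
        connection_string="<storage-connection-string>",  # placeholder
        relative_location="models/churn_v001.ilearner",  # hypothetical path
    )
    print(json.dumps(payload, indent=2))
```

Keeping payload construction separate from the HTTP call makes it easy to log or review exactly which file name and storage location a retraining run will write to.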

Another way to retrieve the iLearner file is through the PowerShell cmdlet Download-AmlExperimentNodeOutput. This might be easier if you just want a copy of the iLearner file and don't need to retrain the model programmatically.

After you have the iLearner file containing the trained model, you can apply your own versioning strategy. The strategy can be as simple as using a prefix or postfix as a naming convention and leaving the iLearner file in Blob storage, or it can involve copying or importing the file into your version control system.
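
As an illustration of the postfix naming convention, the following sketch builds a blob name from a base name, a version number, and a UTC training timestamp. The particular pattern (`_vNNN_<timestamp>`) and the function name `versioned_blob_name` are assumptions chosen for this example; any consistent scheme would serve the same purpose.

```python
from datetime import datetime, timezone


def versioned_blob_name(base_name, version, trained_at, extension=".iLearner"):
    """Return a blob name such as 'churn_v007_20200102T030405Z.iLearner'.

    trained_at must be a timezone-aware datetime; it is converted to UTC so
    that names sort chronologically regardless of where the run happened.
    """
    if trained_at.tzinfo is None:
        raise ValueError("trained_at must be timezone-aware")
    stamp = trained_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{base_name}_v{version:03d}_{stamp}{extension}"
```

Zero-padding the version number keeps lexical and numeric ordering aligned for up to 999 versions, which makes listings in Blob storage easier to scan.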

The saved iLearner file can then be used for scoring through deployed web services.

Versioning web services

You can deploy two types of web services from an Azure Machine Learning Studio (classic) experiment. The classic web service is tightly coupled with the experiment as well as the workspace. The new web service uses the Azure Resource Manager framework and is no longer coupled with the original experiment or the workspace.

Classic web service

To version a classic web service, you can take advantage of the web service endpoint construct. Here is a typical flow:

  1. From your predictive experiment, you deploy a new classic web service, which contains a default endpoint.
  2. You create a new endpoint named ep2, which exposes the current version of the experiment and trained model.
  3. You go back and update your predictive experiment and trained model.
  4. You redeploy the predictive experiment, which updates the default endpoint but does not alter ep2.
  5. You create an additional endpoint named ep3, which exposes the new version of the experiment and trained model.
  6. You go back to step 3 if needed.

Over time, you might have many endpoints created in the same web service. Each endpoint represents a point-in-time copy of the experiment, containing the point-in-time version of the trained model. You can then use external logic to determine which endpoint to call, which effectively means selecting a version of the trained model for the scoring run.
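
The external logic mentioned above can be as small as a lookup table from a model version label to an endpoint name. The sketch below assumes hypothetical version labels ("v1", "v2") mapped to the endpoints ep2 and ep3 from the flow above, and assumes the default endpoint is addressed by the name "default". It only picks the endpoint name and leaves the scoring call itself to your client code.

```python
# Hypothetical mapping maintained outside the web service, for example in
# configuration: model version label -> endpoint that exposes that version.
ENDPOINT_BY_MODEL_VERSION = {
    "v1": "ep2",
    "v2": "ep3",
}


def select_endpoint(requested_version, endpoints=None, fallback="default"):
    """Return the endpoint name to call for the requested model version.

    None means "whatever is currently deployed", which maps to the default
    endpoint. An unknown version is an error rather than a silent fallback,
    so callers never score against a model they did not ask for.
    """
    if endpoints is None:
        endpoints = ENDPOINT_BY_MODEL_VERSION
    if requested_version is None:
        return fallback
    if requested_version not in endpoints:
        raise ValueError(f"No endpoint registered for model version {requested_version!r}")
    return endpoints[requested_version]
```

Raising on an unknown version is a deliberate choice here: pinned consumers should fail loudly when their version is retired instead of silently receiving predictions from a newer model.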

You can also create many identical web service endpoints and then patch a different version of the iLearner file into each endpoint to achieve a similar effect. A separate article explains in more detail how to accomplish that.

New web service

If you create a new Azure Resource Manager-based web service, the endpoint construct is no longer available. Instead, you can generate web service definition (WSD) files in JSON format, either from your predictive experiment by using the Export-AmlWebServiceDefinitionFromExperiment PowerShell cmdlet, or from a deployed Resource Manager-based web service by using the Export-AzMlWebservice PowerShell cmdlet.

After you have exported the WSD file and placed it under version control, you can also deploy the WSD as a new web service in a different web service plan in a different Azure region. Make sure you supply the proper storage account configuration as well as the new web service plan ID. To patch in a different iLearner file, modify the WSD file to update the location reference of the trained model, and then deploy it as a new web service.
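
Updating the trained model's location reference in a WSD file can be sketched as a small JSON edit. The layout assumed below, `properties.assets.<asset name>.locationInfo.uri`, is an assumption for illustration and should be checked against a WSD file exported from your own service. The function name `patch_trained_model_uri` and the asset names in the usage example are hypothetical.

```python
import copy
import json


def patch_trained_model_uri(wsd, asset_name, new_uri):
    """Return a copy of the WSD with one asset's location URI replaced.

    Assumes (unverified) that assets live under wsd["properties"]["assets"]
    and that each asset has a "locationInfo" object with a "uri" field.
    The input dictionary is left unmodified.
    """
    patched = copy.deepcopy(wsd)
    assets = patched["properties"]["assets"]
    if asset_name not in assets:
        raise KeyError(f"Asset {asset_name!r} not found in web service definition")
    assets[asset_name]["locationInfo"]["uri"] = new_uri
    return patched


if __name__ == "__main__":
    # Hypothetical, heavily trimmed WSD for demonstration only.
    wsd = {
        "properties": {
            "assets": {
                "asset3": {
                    "name": "Trained model",
                    "locationInfo": {"uri": "https://example.invalid/old.ilearner"},
                }
            }
        }
    }
    updated = patch_trained_model_uri(
        wsd, "asset3", "https://example.invalid/new.ilearner"
    )
    print(json.dumps(updated, indent=2))
```

Returning a patched copy rather than editing in place keeps the original definition available, so both versions can be written to separate files and committed side by side.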

Automate experiment execution and deployment

An important aspect of ALM is the ability to automate the execution and deployment of the application. In Azure Machine Learning Studio (classic), you can accomplish this by using the PowerShell module. The following end-to-end steps illustrate a standard automated ALM execution and deployment process that uses the Azure Machine Learning Studio (classic) PowerShell module. Each step is linked to one or more PowerShell cmdlets that you can use to accomplish that step.

  1. Upload a dataset.
  2. Copy a training experiment into the workspace from another workspace or from the Gallery, or import an exported experiment from local disk.
  3. Update the dataset in the training experiment.
  4. Run the training experiment.
  5. Promote the trained model.
  6. Copy a predictive experiment into the workspace.
  7. Update the trained model in the predictive experiment.
  8. Run the predictive experiment.
  9. Deploy a web service from the predictive experiment.
  10. Test the web service RRS or BES endpoint.
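
The steps above form a strictly ordered pipeline in which each step depends on the previous one succeeding. The sketch below shows only that control flow: it runs named steps in order and stops at the first failure, reporting which steps had already completed. The step actions are placeholders. In a real pipeline each action would invoke the corresponding PowerShell cmdlet, which this sketch deliberately does not attempt to model. The name `run_pipeline` is hypothetical.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_pipeline(steps: List[Step]) -> List[str]:
    """Run steps in order and return the names of the completed steps.

    If a step raises, later steps are not run and a RuntimeError is raised
    that names the failed step and lists the steps completed before it.
    """
    completed: List[str] = []
    for name, action in steps:
        try:
            action()
        except Exception as error:
            raise RuntimeError(
                f"Step {name!r} failed; completed before failure: {completed}"
            ) from error
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Placeholder actions standing in for the real automation calls.
    steps = [
        ("upload dataset", lambda: None),
        ("run training experiment", lambda: None),
        ("deploy web service", lambda: None),
    ]
    print(run_pipeline(steps))
```

Stopping at the first failure matters in this flow because a later step, such as deploying the web service, should never run against a trained model that an earlier step failed to produce.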

Next steps