Retrain and deploy a machine learning model

APPLIES TO: Machine Learning Studio (classic). Does not apply to Azure Machine Learning.

Retraining is one way to keep machine learning models accurate and based on the most relevant data available. This article shows how to retrain a machine learning model and deploy it as a new web service in Studio (classic). If you want to retrain a classic web service instead, see the how-to article on retraining a classic web service.

This article assumes that you already have a predictive web service deployed. If you don't, first learn how to deploy a Studio (classic) web service.

You'll follow these steps to retrain and deploy a new machine learning web service:

  1. Deploy a retraining web service.
  2. Train a new model by using your retraining web service.
  3. Update your existing predictive experiment to use the new model.

Note

This article has been updated to use the new Azure PowerShell Az module. You can still use the AzureRM module, which will continue to receive bug fixes until at least December 2020. To learn more about the new Az module and AzureRM compatibility, see Introducing the new Azure PowerShell Az module. For Az module installation instructions, see Install Azure PowerShell.

Deploy the retraining web service

A retraining web service lets you retrain your model with a new set of parameters, such as new data, and save the result for later use. When you connect a Web Service Output module to a Train Model module, the training experiment outputs a new model for you to use.

Use the following steps to deploy a retraining web service:

  1. Connect a Web Service Input module to your data input. Typically, you want the input data to be processed in the same way as your original training data.

  2. Connect a Web Service Output module to the output of your Train Model module.

  3. If you have an Evaluate Model module, you can connect a Web Service Output module to it to output the evaluation results.

  4. Run your experiment.

    After you run the experiment, the resulting workflow should look similar to the following image:

    (Image: resulting workflow)

    Now, deploy the training experiment as a retraining web service that outputs a trained model and model evaluation results.

  5. At the bottom of the experiment canvas, click Set Up Web Service.

  6. Select Deploy Web Service [New]. The Azure Machine Learning Web Services portal opens to the Deploy Web Service page.

  7. Type a name for your web service and choose a payment plan.

  8. Select Deploy.

Retrain the model

This example uses C# to create the retraining application. You can also use the Python or R sample code to accomplish this task.

Use the following steps to call the retraining APIs:

  1. Create a C# console application in Visual Studio: New > Project > Visual C# > Windows Classic Desktop > Console App (.NET Framework).
  2. Sign in to the Machine Learning Web Services portal.
  3. Click the web service that you're working with.
  4. Click Consume.
  5. At the bottom of the Consume page, in the Sample Code section, click Batch.
  6. Copy the sample C# code for batch execution and paste it into the Program.cs file. Make sure that the namespace remains intact.

Add the NuGet package Microsoft.AspNet.WebApi.Client, as specified in the comments. To add the reference to Microsoft.WindowsAzure.Storage.dll, you might need to install the client library for Azure Storage services.

The following screenshot shows the Consume page in the Azure Machine Learning Web Services portal.

(Screenshot: Consume page)

Update the apiKey declaration

Locate the apiKey declaration:

const string apiKey = "abc123"; // Replace this with the API key for the web service

In the Basic consumption info section of the Consume page, locate the primary key, and copy it into the apiKey declaration.
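For readers following the Python route instead of C#, here is a minimal sketch of how the key is typically attached to a request: as a bearer token in the Authorization header. The helper name build_auth_headers and the placeholder check are assumptions made for illustration; they are not part of the generated sample code.

```python
PLACEHOLDER_API_KEY = "abc123"  # placeholder value shown in the sample declaration


def build_auth_headers(api_key: str) -> dict:
    """Return HTTP headers that carry the web service key as a bearer token.

    Hypothetical helper: it rejects an empty key or the unchanged placeholder
    so that a forgotten substitution fails early instead of at request time.
    """
    key = api_key.strip()
    if not key or key == PLACEHOLDER_API_KEY:
        raise ValueError("replace the placeholder with the primary key from the Consume page")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {key}",
    }


if __name__ == "__main__":
    # "my-primary-key" is a made-up value used only for this demonstration.
    print(build_auth_headers("my-primary-key"))
```

Failing fast on the placeholder is a design choice of this sketch; the portal-generated code does not perform such a check.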

Update the Azure Storage information

The BES sample code uploads a file from a local drive (for example, "C:\temp\CensusInput.csv") to Azure Storage, processes it, and writes the results back to Azure Storage.

  1. Sign in to the Azure portal.
  2. In the left navigation column, click More services, search for Storage accounts, and select it.
  3. From the list of storage accounts, select one to store the retrained model.
  4. In the left navigation column, click Access keys.
  5. Copy and save the Primary Access Key.
  6. In the left navigation column, click Blobs.
  7. Select an existing container, or create a new one, and save its name.

Locate the StorageAccountName, StorageAccountKey, and StorageContainerName declarations, and update them with the values that you saved from the portal.

const string StorageAccountName = "mystorageacct"; // Replace this with your Azure storage account name
const string StorageAccountKey = "a_storage_account_key"; // Replace this with your Azure Storage key
const string StorageContainerName = "mycontainer"; // Replace this with your Azure Storage container name
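The account name and key are combined into a storage connection string (the storageConnectionString value that the request payload refers to). The sketch below shows the standard connection string layout in Python. The function name and the endpoint_suffix parameter are assumptions added for illustration; the suffix matters only if your account lives in a sovereign cloud, for example core.chinacloudapi.cn.

```python
def build_storage_connection_string(account_name: str,
                                    account_key: str,
                                    endpoint_suffix: str = "core.windows.net") -> str:
    """Return an Azure Storage connection string for the given account.

    Hypothetical helper; the field names follow the standard connection
    string format (DefaultEndpointsProtocol, AccountName, AccountKey,
    EndpointSuffix).
    """
    if not account_name or not account_key:
        raise ValueError("both the account name and the account key are required")
    return (
        "DefaultEndpointsProtocol=https;"
        f"AccountName={account_name};"
        f"AccountKey={account_key};"
        f"EndpointSuffix={endpoint_suffix}"
    )


if __name__ == "__main__":
    # Placeholder values copied from the declarations above.
    print(build_storage_connection_string("mystorageacct", "a_storage_account_key"))
```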

You must also ensure that the input file is available at the location that you specify in the code.

Specify the output location

When you specify the output location in the Request Payload, the file named in RelativeLocation must have the .ilearner extension.

Outputs = new Dictionary<string, AzureBlobDataReference>() {
    {
        "output1",
        new AzureBlobDataReference()
        {
            ConnectionString = storageConnectionString,
            RelativeLocation = string.Format("{0}/output1results.ilearner", StorageContainerName) /*Replace this with the location you want to use for your output file and a valid file extension (usually .csv for scoring results or .ilearner for trained models)*/
        }
    },
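The same output entry can be sketched in Python as a plain dictionary. This is an illustration only: the helper make_model_output is hypothetical, and the dictionary keys simply mirror the ConnectionString and RelativeLocation properties in the C# excerpt above. The helper refuses any blob name that does not end in .ilearner, which is the rule described in this section.

```python
def make_model_output(connection_string: str, container: str, blob_name: str) -> dict:
    """Build one output entry for a trained model.

    Hypothetical helper: enforces the .ilearner extension required for
    trained-model outputs and joins container and blob name with "/".
    """
    if not blob_name.lower().endswith(".ilearner"):
        raise ValueError("trained-model outputs must use the .ilearner extension")
    return {
        "ConnectionString": connection_string,
        "RelativeLocation": f"{container}/{blob_name}",
    }


if __name__ == "__main__":
    # "<connection string>" stands in for your real storage connection string.
    outputs = {
        "output1": make_model_output("<connection string>", "mycontainer",
                                     "output1results.ilearner"),
    }
    print(outputs)
```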

Here is an example of retraining output:

(Image: retraining output)

Evaluate the retraining results

When you run the application, the output includes the URL and the shared access signature (SAS) token that you need to access the evaluation results.

To see the performance results of the retrained model, combine the BaseLocation, RelativeLocation, and SasBlobToken from the output results for output2, and paste the complete URL into the browser address bar.

Examine the results to determine whether the newly trained model performs better than the existing one.

Save the BaseLocation, RelativeLocation, and SasBlobToken from the output results.
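Combining the three values is plain string concatenation. The sketch below shows one way to do it in Python. The function name and the sample values are made up for illustration, and the sketch tolerates a missing or extra slash and a token with or without a leading question mark, because the exact shape of the values in your output may differ.

```python
def build_result_url(base_location: str, relative_location: str, sas_blob_token: str) -> str:
    """Join BaseLocation, RelativeLocation, and SasBlobToken into one URL.

    Hypothetical helper: normalizes the slash between the base and relative
    parts and ensures exactly one "?" precedes the SAS token.
    """
    base = base_location.rstrip("/")
    relative = relative_location.lstrip("/")
    token = sas_blob_token.lstrip("?")
    return f"{base}/{relative}?{token}"


if __name__ == "__main__":
    # All three values are invented placeholders, not real output.
    print(build_result_url(
        "https://mystorageacct.blob.core.windows.net/",
        "mycontainer/output2results.csv",
        "?sv=example&sig=example",
    ))
```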

Update the predictive experiment

Sign in to Azure Resource Manager

First, sign in to your Azure account from within the PowerShell environment by using the Connect-AzAccount cmdlet.

Get the Web Service Definition object

Next, get the Web Service Definition object by calling the Get-AzMlWebService cmdlet.

$wsd = Get-AzMlWebService -Name 'RetrainSamplePre.2016.8.17.0.3.51.237' -ResourceGroupName 'Default-MachineLearning-SouthCentralUS'

To determine the resource group name of an existing web service, run the Get-AzMlWebService cmdlet without any parameters to display the web services in your subscription. Locate the web service, and then look at its web service ID. The name of the resource group is the fourth element in the ID, just after the resourceGroups element. In the following example, the resource group name is Default-MachineLearning-SouthCentralUS.

Properties : Microsoft.Azure.Management.MachineLearning.WebServices.Models.WebServicePropertiesForGraph
Id : /subscriptions/<subscription ID>/resourceGroups/Default-MachineLearning-SouthCentralUS/providers/Microsoft.MachineLearning/webServices/RetrainSamplePre.2016.8.17.0.3.51.237
Name : RetrainSamplePre.2016.8.17.0.3.51.237
Location : South Central US
Type : Microsoft.MachineLearning/webServices
Tags : {}

Alternatively, to determine the resource group name of an existing web service, sign in to the Azure Machine Learning Web Services portal. Select the web service. The resource group name is the fifth element of the URL of the web service, just after the resourceGroups element. In the following example, the resource group name is Default-MachineLearning-SouthCentralUS.

https://services.azureml.net/subscriptions/<subscription ID>/resourceGroups/Default-MachineLearning-SouthCentralUS/providers/Microsoft.MachineLearning/webServices/RetrainSamplePre.2016.8.17.0.3.51.237
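If you would rather not count path elements by hand, the lookup can be sketched in a few lines of Python. The function name is hypothetical. It accepts either the web service ID or the portal URL and returns the segment that follows resourceGroups, which is the rule described above for both forms.

```python
from urllib.parse import urlparse


def resource_group_from_id(resource_id_or_url: str) -> str:
    """Return the path segment that follows 'resourceGroups'.

    Hypothetical helper: works for a bare resource ID
    (/subscriptions/.../resourceGroups/<name>/...) and for the portal URL,
    because only the path component is inspected.
    """
    path = urlparse(resource_id_or_url).path
    segments = [segment for segment in path.split("/") if segment]
    for index, segment in enumerate(segments[:-1]):
        if segment.lower() == "resourcegroups":
            return segments[index + 1]
    raise ValueError("no resourceGroups element found")


if __name__ == "__main__":
    # The all-zero subscription ID is a placeholder.
    service_id = (
        "/subscriptions/00000000-0000-0000-0000-000000000000"
        "/resourceGroups/Default-MachineLearning-SouthCentralUS"
        "/providers/Microsoft.MachineLearning/webServices/RetrainSamplePre.2016.8.17.0.3.51.237"
    )
    print(resource_group_from_id(service_id))
```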

Export the Web Service Definition object as JSON

To modify the definition of the trained model to use the newly trained model, you must first use the Export-AzMlWebService cmdlet to export it to a JSON-format file.

Export-AzMlWebService -WebService $wsd -OutputFile "C:\temp\mlservice_export.json"

Update the reference to the ilearner blob

In the assets, locate the [trained model] asset, and update the uri value in its locationInfo node with the URI of the ilearner blob. Build the URI by combining the BaseLocation and the RelativeLocation from the output of the BES retraining call.

 "asset3": {
    "name": "Retrain Sample [trained model]",
    "type": "Resource",
    "locationInfo": {
      "uri": "https://mltestaccount.blob.core.chinacloudapi.cn/azuremlassetscontainer/baca7bca650f46218633552c0bcbba0e.ilearner"
    },
    "outputPorts": {
      "Results dataset": {
        "type": "Dataset"
      }
    }
  },
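You can make this edit by hand in a text editor. If you prefer to script it, the following Python sketch shows one possible approach. It is not taken from the product documentation: the function name is hypothetical, and because this article shows only a fragment of the exported file, the sketch searches the whole document recursively instead of assuming where the assets node sits.

```python
import json


def update_trained_model_uri(node, new_uri: str) -> int:
    """Set locationInfo.uri on every asset whose name ends with '[trained model]'.

    Hypothetical helper: walks nested dicts and lists, edits matches in
    place, and returns how many assets were updated.
    """
    updated = 0
    if isinstance(node, dict):
        name = node.get("name")
        location = node.get("locationInfo")
        if isinstance(name, str) and name.endswith("[trained model]") and isinstance(location, dict):
            location["uri"] = new_uri
            updated += 1
        for value in node.values():
            updated += update_trained_model_uri(value, new_uri)
    elif isinstance(node, list):
        for item in node:
            updated += update_trained_model_uri(item, new_uri)
    return updated


if __name__ == "__main__":
    # A cut-down stand-in for the exported file; the real file has more content.
    definition = json.loads(
        '{"asset3": {"name": "Retrain Sample [trained model]",'
        ' "type": "Resource", "locationInfo": {"uri": "old-uri"}}}'
    )
    # BaseLocation + RelativeLocation from the retraining output (placeholders here).
    new_uri = "https://mystorageacct.blob.core.windows.net/" + "mycontainer/output1results.ilearner"
    print(update_trained_model_uri(definition, new_uri), "asset(s) updated")
    print(json.dumps(definition, indent=2))
```

Checking the returned count is a cheap safeguard: a result of 0 means no matching asset was found and the file is unchanged.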

Import the JSON into a Web Service Definition object

Use the Import-AzMlWebService cmdlet to convert the modified JSON file back into a Web Service Definition object that you can use to update the predictive experiment.

$wsd = Import-AzMlWebService -InputFile "C:\temp\mlservice_export.json"

Update the web service

Finally, use the Update-AzMlWebService cmdlet to update the predictive experiment.

Update-AzMlWebService -Name 'RetrainSamplePre.2016.8.17.0.3.51.237' -ResourceGroupName 'Default-MachineLearning-SouthCentralUS' -ServiceUpdates $wsd

Next steps

To learn more about how to manage web services or keep track of multiple experiment runs, see the following articles: