Monitor and collect data from ML web service endpoints

In this article, you learn how to collect data from models deployed to web service endpoints in Azure Kubernetes Service (AKS) or Azure Container Instances (ACI). Use Azure Application Insights to collect the following data from an endpoint:

  • Output data
  • Responses
  • Request rates, response times, and failure rates
  • Dependency rates, response times, and failure rates
  • Exceptions

The enable-app-insights-in-production-service.ipynb notebook demonstrates the concepts in this article.

To learn how to run notebooks, see the article Use Jupyter notebooks to explore this service.

Prerequisites

Configure logging with the Python SDK

In this section, you learn how to enable Application Insights logging by using the Python SDK.

Update a deployed service

Use the following steps to update an existing web service:

  1. Identify the service in your workspace. In the following code, ws is the variable that holds your workspace object.

    from azureml.core.webservice import Webservice
    aks_service = Webservice(ws, "my-service-name")
    
  2. Update your service and enable Azure Application Insights.

    aks_service.update(enable_app_insights=True)
    

Log custom traces in your service

Important

Azure Application Insights only logs payloads of up to 64 KB. If this limit is reached, you may see errors such as out of memory, or no information may be logged. If the data you want to log is larger than 64 KB, store it in blob storage instead, using the information in Collect Data for models in production.
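
As a rough illustration of working within this limit, the following sketch checks the size of a record before it is printed and falls back to a short summary when the record is too large. The helper name safe_trace, the summary fields, and the reading of 64 KB as 64 * 1024 bytes of UTF-8 text are assumptions made for this example; they are not part of the Azure SDK.

```python
import json

# Assumption: the 64 KB limit is treated as 64 * 1024 bytes of UTF-8 text.
MAX_TRACE_BYTES = 64 * 1024


def safe_trace(info, max_bytes=MAX_TRACE_BYTES):
    """Return a JSON string for info that stays within the size budget.

    If the full record is too large, return a small summary record instead
    so that something is still logged. (Hypothetical helper, not an SDK call.)
    """
    message = json.dumps(info)
    size = len(message.encode("utf-8"))
    if size <= max_bytes:
        return message
    # Too large: log only the size and the top-level keys, not the data.
    summary = {"truncated": True, "original_bytes": size, "keys": sorted(info)}
    return json.dumps(summary)


print(safe_trace({"input": [1, 2, 3], "output": [0.5]}))
```

In a scoring file, print(safe_trace(info)) could stand in for print(json.dumps(info)); the full payload would still need to go to blob storage if it must be kept.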

For more complex situations, like model tracking within an AKS deployment, we recommend using a third-party library like OpenCensus.

To log custom traces, follow the standard deployment process for AKS or ACI in the How to deploy and where document. Then, use the following steps:

  1. Update the scoring file by adding print statements to send data to Application Insights during inference. For more complex information, such as the request data and the response, use a JSON structure.

    The following example score.py file logs the time the model was initialized, the input and output during inference, and the time any error occurs.

    import pickle
    import json
    import numpy 
    import joblib  # sklearn.externals.joblib was removed from recent scikit-learn releases
    from sklearn.linear_model import Ridge
    from azureml.core.model import Model
    import time
    
    def init():
        global model
        # Print statement for Application Insights custom traces:
        print("model initialized " + time.strftime("%H:%M:%S"))
    
        # note here "sklearn_regression_model.pkl" is the name of the model registered under the workspace
        # this call should return the path to the model.pkl file on the local disk.
        model_path = Model.get_model_path(model_name = 'sklearn_regression_model.pkl')
    
        # deserialize the model file back into a sklearn model
        model = joblib.load(model_path)
    
    
    # note you can pass in multiple rows for scoring
    def run(raw_data):
        try:
            data = json.loads(raw_data)['data']
            data = numpy.array(data)
            result = model.predict(data)
            # Log the input and output data to appinsights:
            info = {
                "input": raw_data,
                "output": result.tolist()
                }
            print(json.dumps(info))
            # you can return any datatype as long as it is JSON-serializable
            return result.tolist()
        except Exception as e:
            error = str(e)
            # Log the error message followed by the time it occurred:
            print(error + " " + time.strftime("%H:%M:%S"))
            return error
    
  2. Update the service configuration, and make sure to enable Application Insights.

    config = Webservice.deploy_configuration(enable_app_insights=True)
    
  3. Build an image and deploy it on AKS or ACI. For more information, see How to deploy and where.
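
The advice in step 1 to use a JSON structure for richer information can be sketched as a pair of small helpers. This is only an illustration: make_trace, run_with_tracing, and the field names event, time, input, output, and error are hypothetical and chosen for this example.

```python
import json
import time


def make_trace(event, **fields):
    """Build a single-line JSON trace record with a timestamp (hypothetical helper)."""
    record = {"event": event, "time": time.strftime("%H:%M:%S")}
    record.update(fields)
    return json.dumps(record)


def run_with_tracing(predict, raw_data):
    """Call predict on the parsed request and print the input, output, or error."""
    try:
        data = json.loads(raw_data)["data"]
        result = predict(data)
        # One JSON record per request keeps input and output together.
        print(make_trace("inference", input=data, output=result))
        return result
    except Exception as e:
        print(make_trace("error", error=str(e)))
        return str(e)


# Stand-in for model.predict: sums each row of the request.
run_with_tracing(lambda rows: [sum(r) for r in rows], '{"data": [[1, 2], [3, 4]]}')
```

Printing one JSON record per event, rather than several loose strings, keeps related values in a single trace entry.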

Disable tracking in Python

To disable Azure Application Insights, use the following code:

# Replace <service_name> with the name of the web service
<service_name>.update(enable_app_insights=False)

Configure logging with Azure Machine Learning studio

You can also enable Azure Application Insights from Azure Machine Learning studio. When you're ready to deploy your model as a web service, use the following steps to enable Application Insights:

  1. Sign in to the studio at https://studio.ml.azure.cn/.

  2. Go to Models and select the model you want to deploy.

  3. Select +Deploy.

  4. Populate the Deploy model form.

  5. Expand the Advanced menu.

    Deploy form

  6. Select Enable Application Insights diagnostics and data collection.

    Enable App Insights

View metrics and logs

Query logs for deployed models

You can use the get_logs() function to retrieve logs from a previously deployed web service. The logs may contain detailed information about any errors that occurred during deployment.

from azureml.core.webservice import Webservice

# load existing web service
service = Webservice(name="service-name", workspace=ws)
logs = service.get_logs()
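
Because get_logs() returns the log text as a single string, a small filter can help surface deployment errors. The following is a minimal sketch: the find_error_lines helper, its keyword list, and the sample log text are all made up for illustration.

```python
def find_error_lines(logs, keywords=("error", "exception", "traceback")):
    """Return (line_number, text) pairs for log lines that mention a keyword."""
    matches = []
    for number, line in enumerate(logs.splitlines(), start=1):
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            matches.append((number, line.strip()))
    return matches


# Stand-in for the string returned by service.get_logs(); this text is invented.
sample_logs = (
    "Starting service\n"
    "Model loaded\n"
    "ERROR: scoring failed\n"
    "Traceback (most recent call last):"
)
for number, line in find_error_lines(sample_logs):
    print(number, line)
```

With a real service, the logs variable from the preceding code could be passed in place of sample_logs.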

View logs in the studio

Azure Application Insights stores your service logs in the same resource group as the Azure Machine Learning workspace. Use the following steps to view your data using the studio:

  1. Go to your Azure Machine Learning workspace in the studio.

  2. Select Endpoints.

  3. Select the deployed service.

  4. Select the Application Insights url link.

    AppInsightsLoc

  5. In Application Insights, from the Overview tab or the Monitoring section, select Logs.

    Overview tab of monitoring

  6. To view information logged from the score.py file, look at the traces table. The following query searches for logs where the input value was logged:

    traces
    | where customDimensions contains "input"
    | limit 10
    

    Trace data

For more information on how to use Azure Application Insights, see What is Application Insights?

Web service metadata and response data

Important

Azure Application Insights only logs payloads of up to 64 KB. If this limit is reached, you may see errors such as out of memory, or no information may be logged.

To log web service request information, add print statements to your score.py file. Each print statement results in one entry in the Application Insights trace table under the message STDOUT. Application Insights stores the print statement outputs in customDimensions and in the Contents trace table. Printing JSON strings produces a hierarchical data structure in the trace output under Contents.

Export data for retention and processing

Important

Azure Application Insights only supports exports to blob storage. For more information on the limits of this implementation, see Export telemetry from App Insights.

Use Application Insights continuous export to export data to a blob storage account, where you can define retention settings. Application Insights exports the data in JSON format.

Continuous export
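
Once exported data has been downloaded from blob storage, it can be parsed with the standard json module. The sketch below assumes each exported file holds one JSON object per line; the read_exported_records helper and the fields in the sample text are invented for illustration and do not reflect the actual export schema.

```python
import json


def read_exported_records(text):
    """Parse text with one JSON object per line, skipping blank or invalid lines."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            # Skip lines that are not valid JSON rather than failing the run.
            continue
    return records


# Made-up sample; real exported records have a different, richer structure.
sample = '{"message": "STDOUT", "value": 1}\n\nnot json\n{"message": "STDOUT", "value": 2}\n'
records = read_exported_records(sample)
print(len(records))
```

Skipping invalid lines is a design choice for this sketch; a stricter pipeline might prefer to raise an error so that corrupted exports are noticed.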

Next steps

In this article, you learned how to enable logging and view logs for web service endpoints. Try these articles for next steps: