Troubleshoot Azure Stream Analytics by using resource logs

Occasionally, an Azure Stream Analytics job unexpectedly stops processing. It's important to be able to troubleshoot this kind of event. Failures can be caused by an unexpected query result, by connectivity to devices, or by an unexpected service outage. The resource logs in Stream Analytics can help you identify the cause of issues when they occur and reduce recovery time.

It is highly recommended to enable resource logs for all jobs, as this will greatly help with debugging and monitoring.

Log types

Stream Analytics offers two types of logs:

  • Activity logs (always on), which give insights into operations performed on jobs.

  • Resource logs (configurable), which provide richer insights into everything that happens with a job. Resource logs start when the job is created and end when the job is deleted. They cover events when the job is updated and while it's running.

Note

You can use services like Azure Storage, Azure Event Hubs, and Azure Monitor logs to analyze nonconforming data. You are charged based on the pricing model for those services.

Note

This article was recently updated to use the term Azure Monitor logs instead of Log Analytics. Log data is still stored in a Log Analytics workspace and is still collected and analyzed by the same Log Analytics service. We are updating the terminology to better reflect the role of logs in Azure Monitor. See Azure Monitor terminology changes for details.

Debugging using activity logs

Activity logs are on by default and give high-level insights into operations performed by your Stream Analytics job. Information present in activity logs may help find the root cause of the issues impacting your job. Do the following steps to use activity logs in Stream Analytics:

  1. Sign in to the Azure portal and select Activity log under Overview.

    Stream Analytics activity log

  2. You can see a list of operations that have been performed. Any operation that caused your job to fail has a red info bubble. A sketch of retrieving the same entries programmatically follows these steps.

  3. Click an operation to see its summary view. Information here is often limited. To learn more details about the operation, click JSON.

    Stream Analytics activity log operation summary

  4. Scroll down to the Properties section of the JSON, which provides details of the error that caused the failed operation. In this example, the failure was due to a runtime error from out-of-bounds latitude values. Discrepancy in the data that is processed by a Stream Analytics job causes a data error. You can learn about different input and output data errors and why they occur.

    JSON error details

  5. You can take corrective actions based on the error message in JSON. In this example, checks to ensure that the latitude value is between -90 degrees and 90 degrees need to be added to the query.

  6. If the error message in the activity logs isn't helpful in identifying the root cause, enable resource logs and use Azure Monitor logs.
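If you prefer to pull these activity log entries outside the portal, the sketch below shows one way to do it with the Azure SDK for Python. It is a minimal illustration, assuming the azure-identity and azure-mgmt-monitor packages are installed; the subscription ID, resource group, and job name are placeholders to replace with your own values.

```python
# A minimal sketch, not the documented portal flow: list the job's activity log
# entries with azure-mgmt-monitor. All IDs below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.monitor import MonitorManagementClient

subscription_id = "<subscription-id>"
job_resource_id = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.StreamAnalytics/streamingjobs/<job-name>"
)

client = MonitorManagementClient(DefaultAzureCredential(), subscription_id)

# The activity log API requires an OData filter; this one scopes the results
# to the job and a time window.
odata_filter = (
    "eventTimestamp ge '2024-01-01T00:00:00Z' and "
    f"resourceUri eq '{job_resource_id}'"
)

for event in client.activity_logs.list(filter=odata_filter):
    # Failed operations correspond to the red info bubbles in the portal; the
    # properties bag holds the same details as the portal's JSON view.
    if event.status and event.status.value == "Failed":
        print(event.operation_name.localized_value, event.properties)
```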

Send diagnostics to Azure Monitor logs

Turning on resource logs and sending them to Azure Monitor logs is highly recommended. They are off by default. To turn them on, complete these steps:

  1. Create a Log Analytics workspace if you don't already have one. It is recommended to have your Log Analytics workspace in the same region as your Stream Analytics job.

  2. Sign in to the Azure portal, and navigate to your Stream Analytics job. Under Monitoring, select Diagnostics settings. Then select Turn on diagnostics.

    Navigate to resource logs in the blade

  3. Provide a Name in Diagnostic settings name and check the boxes for Execution and Authoring under log, and AllMetrics under metric. Then select Send to Log Analytics and choose your workspace. Click Save.

    Resource log settings

  4. When your Stream Analytics job starts, resource logs are routed to your Log Analytics workspace. To view resource logs for your job, select Logs under the Monitoring section.

    Screenshot that shows the General menu with Logs selected

  5. Stream Analytics provides pre-defined queries that allow you to easily search for the logs that you are interested in. You can select any pre-defined query on the left pane and then select Run. You will see the results of the query in the bottom pane. A sketch of creating this diagnostic setting and querying the workspace from code follows these steps.

    Screenshot that shows Logs for a Stream Analytics job
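The following sketch shows the same configuration and a sample query done from the Azure SDK for Python rather than the portal. It is illustrative only: it assumes the azure-identity, azure-mgmt-monitor, and azure-monitor-query packages are installed, the resource IDs, workspace GUID, and diagnostic setting name are placeholders, and the query assumes logs are collected into the default AzureDiagnostics table.

```python
# A hedged sketch of steps 1-5 from code. All IDs and names are placeholders,
# not values produced by a real job.
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.mgmt.monitor import MonitorManagementClient
from azure.monitor.query import LogsQueryClient

credential = DefaultAzureCredential()
subscription_id = "<subscription-id>"
job_resource_id = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.StreamAnalytics/streamingjobs/<job-name>"
)
workspace_resource_id = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.OperationalInsights/workspaces/<workspace-name>"
)
workspace_id = "<workspace-guid>"  # the workspace ID used for queries

# Route Execution and Authoring logs plus AllMetrics to the workspace
# (the same configuration as the Diagnostic settings page in the portal).
monitor = MonitorManagementClient(credential, subscription_id)
monitor.diagnostic_settings.create_or_update(
    resource_uri=job_resource_id,
    name="stream-analytics-diagnostics",
    parameters={
        "workspace_id": workspace_resource_id,
        "logs": [
            {"category": "Execution", "enabled": True},
            {"category": "Authoring", "enabled": True},
        ],
        "metrics": [{"category": "AllMetrics", "enabled": True}],
    },
)

# Once the job runs and logs arrive, query the workspace. With the default
# collection mode, Stream Analytics resource logs land in AzureDiagnostics.
logs_client = LogsQueryClient(credential)
query = """
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS"
| where Category == "Execution"
| take 20
"""
response = logs_client.query_workspace(workspace_id, query, timespan=timedelta(days=1))
for table in response.tables:  # assumes the query succeeded fully
    for row in table.rows:
        print(row)
```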

Resource log categories

Azure Stream Analytics captures two categories of resource logs:

  • Authoring: Captures log events that are related to job authoring operations, such as job creation, adding and deleting inputs and outputs, adding and updating the query, and starting or stopping the job.

  • Execution: Captures events that occur during job execution.

    • Connectivity errors
    • Data processing errors, including:
      • Events that don't conform to the query definition (mismatched field types and values, missing fields, and so on)
      • Expression evaluation errors
    • Other events and errors

Resource logs schema

All logs are stored in JSON format. Each entry has the following common string fields:

Name           Description
time           Timestamp (in UTC) of the log.
resourceId     ID of the resource that the operation took place on, in uppercase. It includes the subscription ID, the resource group, and the job name. For example, /SUBSCRIPTIONS/6503D296-DAC1-4449-9B03-609A1F4A1C87/RESOURCEGROUPS/MY-RESOURCE-GROUP/PROVIDERS/MICROSOFT.STREAMANALYTICS/STREAMINGJOBS/MYSTREAMINGJOB.
category       Log category, either Execution or Authoring.
operationName  Name of the operation that is logged. For example, Send Events: SQL Output write failure to mysqloutput.
status         Status of the operation. For example, Failed or Succeeded.
level          Log level. For example, Error, Warning, or Informational.
properties     Log entry-specific detail, serialized as a JSON string. For more information, see the following sections of this article.
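To make the shape concrete, here is a hypothetical entry assembled from the examples in the table above; the resourceId and operationName reuse the table's examples, while the timestamp and the contents of properties are invented for illustration, not captured from a real job.

```python
import json

# Hypothetical log entry shaped like the common schema above. The resourceId and
# operationName reuse the table's examples; the timestamp and properties payload
# are invented placeholders.
log_entry = {
    "time": "2024-01-01T00:51:23.402Z",
    "resourceId": (
        "/SUBSCRIPTIONS/6503D296-DAC1-4449-9B03-609A1F4A1C87/RESOURCEGROUPS"
        "/MY-RESOURCE-GROUP/PROVIDERS/MICROSOFT.STREAMANALYTICS/STREAMINGJOBS"
        "/MYSTREAMINGJOB"
    ),
    "category": "Execution",
    "operationName": "Send Events: SQL Output write failure to mysqloutput",
    "status": "Failed",
    "level": "Error",
    # properties is itself a serialized JSON string whose shape depends on the
    # event type; see the sections below.
    "properties": json.dumps({"Source": "mysqloutput", "Message": "<error details>"}),
}

details = json.loads(log_entry["properties"])
print(details["Source"], details["Message"])
```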

Execution log properties schema

Execution logs have information about events that happened during Stream Analytics job execution. The schema of properties varies depending on whether the event is a data error or a generic event.

Data errors

Any error that occurs while the job is processing data is in this category of logs. These logs are most often created during data read, serialization, and write operations. These logs do not include connectivity errors; connectivity errors are treated as generic events. You can learn more about the cause of various input and output data errors.

Name     Description
Source   Name of the job input or output where the error occurred.
Message  Message associated with the error.
Type     Type of error. For example, DataConversionError, CsvParserError, or ServiceBusPropertyColumnMissingError.
Data     Contains data that is useful to accurately locate the source of the error. Subject to truncation, depending on size.

Depending on the operationName value, data errors have the following schema:

  • Serialize events occur during event read operations. They occur when the data at the input does not satisfy the query schema for one of these reasons:

    • Type mismatch during event (de)serialization: identifies the field that's causing the error.

    • Cannot read an event, invalid serialization: lists information about the location in the input data where the error occurred. Includes blob name for blob input, offset, and a sample of the data.

  • Send events occur during write operations. They identify the streaming event that caused the error.
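As an illustration of the fields above, the snippet below builds a hypothetical properties payload for a serialization failure on a blob input; the field names come from the table, while every value is invented.

```python
import json

# Hypothetical data-error properties for a "Cannot read an event, invalid
# serialization" case on a blob input. Field names match the table above;
# all values are invented for illustration.
properties_json = json.dumps({
    "Source": "myblobinput",
    "Type": "CsvParserError",
    "Message": "Could not deserialize the input event as CSV.",
    # Data helps locate the offending event: blob name, offset, and a sample.
    "Data": {
        "BlobName": "container/2024/01/01/00/events.csv",
        "Offset": 1048576,
        "Sample": "device-07,...",
    },
})

error = json.loads(properties_json)
print(error["Type"], error["Source"])
```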

Generic events

Generic events cover everything else.

Name            Description
Error           (optional) Error information. Usually, this is exception information if it's available.
Message         Log message.
Type            Type of message. Maps to internal categorization of errors. For example, JobValidationError or BlobOutputAdapterInitializationFailure.
Correlation ID  GUID that uniquely identifies the job execution. All execution log entries from the time the job starts until the job stops have the same Correlation ID value.

Next steps