Azure Stream Analytics data errors

Data errors are errors that occur while processing the data. These errors most often occur during data deserialization, serialization, and write operations. When data errors occur, Stream Analytics writes detailed information and example events to the resource logs. Enable diagnostic logs in your job to get these additional details. In some cases, a summary of this information is also provided through portal notifications.

This article outlines the different error types, causes, and resource log details for input and output data errors.

Resource Logs schema

The following JSON is an example value for the Properties field of a resource log for a data error.

{
    "Source": "InputTelemetryData",
    "Type": "DataError",
    "DataErrorType": "InputDeserializerError.InvalidData",
    "BriefMessage": "Json input stream should either be an array of objects or line separated objects. Found token type: Integer",
    "Message": "Input Message Id: https:\\/\\/exampleBlob.blob.core.chinacloudapi.cn\\/inputfolder\\/csv.txt Error: Json input stream should either be an array of objects or line separated objects. Found token type: Integer",
    "ExampleEvents": "[\"1,2\\\\u000d\\\\u000a3,4\\\\u000d\\\\u000a5,6\"]",
    "FromTimestamp": "2019-03-22T22:34:18.5664937Z",
    "ToTimestamp": "2019-03-22T22:34:18.5965248Z",
    "EventCount": 1
}
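
As an illustration of how these fields might be consumed, the following Python sketch parses a Properties value and splits DataErrorType into its category and detail. The function name summarize_data_error and the sample record are hypothetical, and the sketch assumes that ExampleEvents is itself a JSON-encoded string holding an array of events, as the example above suggests.

```python
import json

# Hypothetical sample shaped like the Properties value above (illustrative values).
properties_json = json.dumps({
    "Source": "InputTelemetryData",
    "Type": "DataError",
    "DataErrorType": "InputDeserializerError.InvalidData",
    "BriefMessage": "Json input stream should either be an array of objects "
                    "or line separated objects. Found token type: Integer",
    # Assumption: ExampleEvents is a JSON array serialized into a string.
    "ExampleEvents": json.dumps(["1,2\r\n3,4\r\n5,6"]),
    "FromTimestamp": "2019-03-22T22:34:18.5664937Z",
    "ToTimestamp": "2019-03-22T22:34:18.5965248Z",
    "EventCount": 1,
})


def summarize_data_error(raw):
    """Extract the fields most useful for triage from a Properties value."""
    props = json.loads(raw)
    # "InputDeserializerError.InvalidData" -> ("InputDeserializerError", "InvalidData").
    # Types without a dot, such as "LateInputEvent", yield an empty detail.
    category, _, detail = props["DataErrorType"].partition(".")
    return {
        "source": props["Source"],
        "category": category,
        "detail": detail,
        "examples": json.loads(props["ExampleEvents"]),
        "event_count": props["EventCount"],
    }


summary = summarize_data_error(properties_json)
print(summary["category"], summary["detail"], summary["event_count"])
```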

Input data errors

InputDeserializerError.InvalidCompressionType

  • Cause: The input compression type selected doesn't match the data.
  • Portal notification provided: Yes
  • Resource log level: Warning
  • Impact: Messages with any deserialization errors, including an invalid compression type, are dropped from the input.
  • Log details
    • Input message identifier. For Event Hub, the identifier is the PartitionId, Offset, and Sequence Number.

Error message

"BriefMessage": "Unable to decompress events from resource 'https:\\/\\/exampleBlob.blob.core.chinacloudapi.cn\\/inputfolder\\/csv.txt'. Please ensure compression setting fits the data being processed."

InputDeserializerError.InvalidHeader

  • Cause: The header of the input data is invalid. For example, a CSV has columns with duplicate names.
  • Portal notification provided: Yes
  • Resource log level: Warning
  • Impact: Messages with any deserialization errors, including an invalid header, are dropped from the input.
  • Log details
    • Input message identifier.
    • Actual payload up to a few kilobytes.

Error message

"BriefMessage": "Invalid CSV Header for resource 'https:\\/\\/exampleBlob.blob.core.chinacloudapi.cn\\/inputfolder\\/csv.txt'. Please make sure there are no duplicate field names."

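A duplicate-name check of the kind described for InvalidHeader can be run on sample data before it is sent to the job. The sketch below is a minimal illustration, not the service's own validation: the function name duplicate_header_names is hypothetical, and it compares names exactly as written, which may differ from how Stream Analytics compares them.

```python
import csv
import io
from collections import Counter


def duplicate_header_names(csv_text):
    """Return the column names that appear more than once in the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    counts = Counter(header)
    return sorted(name for name, count in counts.items() if count > 1)


print(duplicate_header_names("deviceId,temperature,deviceId\n1,20.5,1\n"))
```

An empty result means the header has no repeated names under this exact-match comparison.
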
InputDeserializerError.MissingColumns

  • Cause: The input columns defined with CREATE TABLE or through TIMESTAMP BY don't exist.
  • Portal notification provided: Yes
  • Resource log level: Warning
  • Impact: Events with missing columns are dropped from the input.
  • Log details
    • Input message identifier.
    • Names of the columns that are missing.
    • Actual payload up to a few kilobytes.

Error messages

"BriefMessage": "Could not deserialize the input event(s) from resource 'https:\\/\\/exampleBlob.blob.core.chinacloudapi.cn\\/inputfolder\\/csv.txt' as Csv. Some possible reasons: 1) Malformed events 2) Input source configured with incorrect serialization format" 
"Message": "Missing fields specified in query or in create table. Fields expected:ColumnA Fields found:ColumnB"

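The "Fields expected" and "Fields found" comparison in the message above can be approximated locally. This is a sketch under assumptions: missing_columns is a hypothetical helper, the expected column list is supplied by hand rather than derived from the query, and names are compared exactly.

```python
import csv
import io


def missing_columns(csv_text, expected):
    """Return the expected column names that are absent from the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    found = set(header)
    return [name for name in expected if name not in found]


# Mirrors the example message: ColumnA is expected, only ColumnB is present.
print(missing_columns("ColumnB\n42\n", ["ColumnA"]))
```
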
InputDeserializerError.TypeConversionError

  • Cause: Unable to convert the input to the type specified in the CREATE TABLE statement.
  • Portal notification provided: Yes
  • Resource log level: Warning
  • Impact: Events with type conversion errors are dropped from the input.
  • Log details
    • Input message identifier.
    • Name of the column and expected type.

Error messages

"BriefMessage": "Could not deserialize the input event(s) from resource 'https:\\/\\/exampleBlob.blob.core.chinacloudapi.cn\\/inputfolder\\/csv.txt' as Csv. Some possible reasons: 1) Malformed events 2) Input source configured with incorrect serialization format"
"Message": "Unable to convert column: dateColumn to expected type."

InputDeserializerError.InvalidData

  • Cause: Input data is not in the right format. For example, the input isn't valid JSON.
  • Portal notification provided: Yes
  • Resource log level: Warning
  • Impact: All events in the message after an invalid data error has been encountered are dropped from the input.
  • Log details
    • Input message identifier.
    • Actual payload up to a few kilobytes.

Error messages

"BriefMessage": "Json input stream should either be an array of objects or line separated objects. Found token type: String"
"Message": "Json input stream should either be an array of objects or line separated objects. Found token type: String"

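The message above names the two accepted shapes for JSON input: an array of objects, or line-separated objects. The following sketch classifies a payload accordingly. It is only an approximation written for illustration; classify_json_payload is a hypothetical name, and the service's own parser may accept or reject inputs that this check treats differently.

```python
import json


def classify_json_payload(text):
    """Return 'array', 'lines', or 'invalid' for a JSON payload."""
    stripped = text.strip()
    try:
        value = json.loads(stripped)
    except json.JSONDecodeError:
        pass  # Not a single JSON document; try line-separated objects below.
    else:
        if isinstance(value, list) and all(isinstance(v, dict) for v in value):
            return "array"
        # A single object is treated as one line-separated object.
        return "lines" if isinstance(value, dict) else "invalid"

    lines = [line for line in stripped.splitlines() if line.strip()]
    if not lines:
        return "invalid"
    try:
        parsed = [json.loads(line) for line in lines]
    except json.JSONDecodeError:
        return "invalid"
    return "lines" if all(isinstance(p, dict) for p in parsed) else "invalid"


print(classify_json_payload('[{"a": 1}, {"a": 2}]'))
print(classify_json_payload('"just a string"'))
```
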
InvalidInputTimeStamp

  • Cause: The value of the TIMESTAMP BY expression can't be converted to datetime.
  • Portal notification provided: Yes
  • Resource log level: Warning
  • Impact: Events with an invalid input timestamp are dropped from the input.
  • Log details
    • Input message identifier.
    • Error message.
    • Actual payload up to a few kilobytes.

Error message

"BriefMessage": "Unable to get timestamp for resource 'https:\\/\\/exampleBlob.blob.core.chinacloudapi.cn\\/inputfolder\\/csv.txt ' due to error 'Cannot convert string to datetime'"

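To catch values that cannot be converted before they reach the job, a producer could validate its timestamp field first. The sketch below uses Python's datetime.fromisoformat as a stand-in; this is an assumption, because the set of datetime formats that Stream Analytics accepts is not listed here and is not identical to what fromisoformat accepts. The helper name parse_event_time is hypothetical.

```python
from datetime import datetime


def parse_event_time(value):
    """Return a datetime for an ISO 8601 string, or None if it cannot be converted."""
    if not isinstance(value, str):
        return None
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return None


print(parse_event_time("2019-03-22T22:34:18+00:00"))
print(parse_event_time("not a date"))
```
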
InvalidInputTimeStampKey

  • Cause: The value of TIMESTAMP BY OVER timestampColumn is NULL.
  • Portal notification provided: Yes
  • Resource log level: Warning
  • Impact: Events with an invalid input timestamp key are dropped from the input.
  • Log details
    • Actual payload up to a few kilobytes.

Error message

"BriefMessage": "Unable to get value of TIMESTAMP BY OVER COLUMN"

LateInputEvent

  • Cause: The difference between application time and arrival time is greater than the late arrival tolerance window.
  • Portal notification provided: No
  • Resource log level: Information
  • Impact: Late input events are handled according to the "Handle other events" setting in the Event Ordering section of the job configuration. For more information, see Time Handling Policies.
  • Log details
    • Application time and arrival time.
    • Actual payload up to a few kilobytes.

Error message

"BriefMessage": "Input event with application timestamp '2019-01-01' and arrival time '2019-01-02' was sent later than configured tolerance."

EarlyInputEvent

  • Cause: The application time is later than the arrival time by more than 5 minutes.
  • Portal notification provided: No
  • Resource log level: Information
  • Impact: Early input events are handled according to the "Handle other events" setting in the Event Ordering section of the job configuration. For more information, see Time Handling Policies.
  • Log details
    • Application time and arrival time.
    • Actual payload up to a few kilobytes.

Error message

"BriefMessage": "Input event arrival time '2019-01-01' is earlier than input event application timestamp '2019-01-02' by more than 5 minutes."

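The LateInputEvent and EarlyInputEvent conditions both compare application time with arrival time, so they can be sketched together. In the example below, classify_arrival is a hypothetical helper, the late arrival tolerance is passed in as a parameter because it is a job setting, and the 5-minute early limit is taken from the EarlyInputEvent description. It illustrates the comparison only, not how the service adjusts or drops such events.

```python
from datetime import datetime, timedelta

EARLY_LIMIT = timedelta(minutes=5)  # from the EarlyInputEvent description


def classify_arrival(application_time, arrival_time, late_tolerance):
    """Classify an event as 'early', 'late', or 'on_time'."""
    if application_time - arrival_time > EARLY_LIMIT:
        return "early"  # application time is ahead of arrival by more than 5 minutes
    if arrival_time - application_time > late_tolerance:
        return "late"   # event arrived after the late arrival tolerance window
    return "on_time"


arrival = datetime(2019, 1, 2)
print(classify_arrival(datetime(2019, 1, 1), arrival, timedelta(seconds=5)))
print(classify_arrival(datetime(2019, 1, 2, 0, 10), arrival, timedelta(seconds=5)))
```
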
OutOfOrderEvent

  • Cause: The event is considered out of order according to the defined out-of-order tolerance window.
  • Portal notification provided: No
  • Resource log level: Information
  • Impact: Out-of-order events are handled according to the "Handle other events" setting in the Event Ordering section of the job configuration. For more information, see Time Handling Policies.
  • Log details
    • Actual payload up to a few kilobytes.

Error message

"Message": "Out of order event(s) received."

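One simplified way to picture an out-of-order tolerance window is to flag any event whose timestamp falls behind the latest timestamp seen so far by more than the tolerance. The sketch below implements that simplified model only; find_out_of_order is a hypothetical helper, and the service's actual ordering logic (for example, how it treats partitions) is not described in this article.

```python
from datetime import datetime, timedelta


def find_out_of_order(timestamps, tolerance):
    """Return indexes of events older than the latest timestamp seen minus tolerance."""
    flagged = []
    high_water = None  # latest timestamp seen so far
    for index, ts in enumerate(timestamps):
        if high_water is not None and ts < high_water - tolerance:
            flagged.append(index)
        if high_water is None or ts > high_water:
            high_water = ts
    return flagged


base = datetime(2019, 1, 1)
offsets = [0, 10, 8, 1, 12]  # seconds after base, in arrival order
events = [base + timedelta(seconds=s) for s in offsets]
print(find_out_of_order(events, timedelta(seconds=5)))
```

With a 5-second tolerance, the event at offset 8 is within the window behind offset 10 and is not flagged, while the event at offset 1 is.
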
Output data errors

OutputDataConversionError.RequiredColumnMissing

  • Cause: The column required for the output doesn't exist. For example, a column defined as Azure Table PartitionKey doesn't exist.
  • Portal notification provided: Yes
  • Resource log level: Warning
  • Impact: All output data conversion errors, including a missing required column, are handled according to the Output Data Policy setting.
  • Log details
    • Name of the column and either the record identifier or part of the record.

Error message

"Message": "The output record does not contain primary key property: [deviceId] Ensure the query output contains the column [deviceId] with a unique non-empty string less than '255' characters."

OutputDataConversionError.ColumnNameInvalid

  • Cause: The column value doesn't conform with the output. For example, the column name isn't a valid Azure table column.
  • Portal notification provided: Yes
  • Resource log level: Warning
  • Impact: All output data conversion errors, including an invalid column name, are handled according to the Output Data Policy setting.
  • Log details
    • Name of the column and either the record identifier or part of the record.

Error message

"Message": "Invalid property name #deviceIdValue. Please refer MSDN for Azure table property naming convention."

OutputDataConversionError.TypeConversionError

  • Cause: A column can't be converted to a valid type in the output. For example, the value of a column is incompatible with the constraints or type defined in the SQL table.
  • Portal notification provided: Yes
  • Resource log level: Warning
  • Impact: All output data conversion errors, including type conversion errors, are handled according to the Output Data Policy setting.
  • Log details
    • Name of the column.
    • Either the record identifier or part of the record.

Error message

"Message": "The column [id] value null or its type is invalid. Ensure to provide a unique non-empty string less than '255' characters."

OutputDataConversionError.RecordExceededSizeLimit

  • Cause: The value of the message is greater than the supported output size. For example, a record is larger than 1 MB for an Event Hub output.
  • Portal notification provided: Yes
  • Resource log level: Warning
  • Impact: All output data conversion errors, including a record that exceeds the size limit, are handled according to the Output Data Policy setting.
  • Log details
    • Either the record identifier or part of the record.

Error message

"BriefMessage": "Single output event exceeds the maximum message size limit allowed (262144 bytes) by Event Hub."

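A rough size check can help estimate whether a record is likely to trigger this error. The sketch below measures the UTF-8 length of a record serialized as JSON and compares it with the 262144-byte figure quoted in the message above. Both choices are assumptions for illustration: exceeds_size_limit is a hypothetical helper, the limit that applies depends on the output and its configuration, and the service's serialized form may differ from json.dumps output.

```python
import json

MAX_EVENT_BYTES = 262144  # figure quoted in the example message; may not apply to every output


def exceeds_size_limit(record, limit=MAX_EVENT_BYTES):
    """Return True if the JSON-serialized record is larger than limit bytes."""
    payload = json.dumps(record, ensure_ascii=False).encode("utf-8")
    return len(payload) > limit


print(exceeds_size_limit({"deviceId": "sensor-1", "reading": 20.5}))
print(exceeds_size_limit({"deviceId": "sensor-1", "blob": "x" * 300000}))
```
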
OutputDataConversionError.DuplicateKey

  • Cause: A record already contains a column with the same name as a System column. For example, a CosmosDB output record contains a column named ID while the ID column is mapped to a different column.
  • Portal notification provided: Yes
  • Resource log level: Warning
  • Impact: All output data conversion errors, including a duplicate key, are handled according to the Output Data Policy setting.
  • Log details
    • Name of the column.
    • Either the record identifier or part of the record.

Error message

"BriefMessage": "Column 'devicePartitionKey' is being mapped to multiple columns."

Next steps