Troubleshoot input connections

This article describes common issues with Azure Stream Analytics input connections, how to troubleshoot them, and how to correct them.

Input events not received by job

  1. Test your input and output connectivity. Verify connectivity to inputs and outputs by using the Test Connection button for each input and output.

  2. Examine your input data.

    1. Use the Sample Data button for each input. Download the input sample data.

    2. Inspect the sample data to understand the schema and data types.

    3. Check Event Hubs metrics to ensure events are being sent. Message metrics should be greater than zero if Event Hubs is receiving messages.

  3. Ensure that you have selected a time range in the input preview. Choose Select time range, and then enter a sample duration before testing your query.

Malformed input events cause deserialization errors

Deserialization issues occur when the input stream of your Stream Analytics job contains malformed messages. For example, a malformed message could be caused by a missing parenthesis or brace in a JSON object, or by an incorrect timestamp format in the time field.
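The two examples above (broken JSON and a badly formatted timestamp) can be checked on the sender side before events reach the job. The following is a minimal sketch, not the parser that Stream Analytics uses: the function name `check_event`, the field name `time`, and the choice of ISO 8601 as the expected timestamp format are all assumptions made for illustration.

```python
import json
from datetime import datetime
from typing import Optional


def check_event(raw: str, time_field: str = "time") -> Optional[str]:
    """Return a description of the problem, or None if the event looks well formed.

    `time_field` is a hypothetical field name; use whatever your events carry.
    """
    # A missing brace or bracket surfaces here as a JSONDecodeError.
    try:
        event = json.loads(raw)
    except json.JSONDecodeError as exc:
        return f"invalid JSON: {exc.msg} at position {exc.pos}"

    if not isinstance(event, dict):
        return "event is not a JSON object"

    value = event.get(time_field)
    if value is None:
        return None  # no time field present, nothing more to check

    if not isinstance(value, str):
        return f"timestamp in '{time_field}' is not a string"

    # Assumption: timestamps are ISO 8601. A trailing 'Z' is rewritten so that
    # older Python versions of datetime.fromisoformat accept it.
    try:
        datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return f"timestamp in '{time_field}' is not ISO 8601: {value!r}"

    return None
```

Running a check like this over a sample of outgoing events can narrow down which producer emits the malformed payloads.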

When a Stream Analytics job receives a malformed message from an input, it drops the message and notifies you with a warning. A warning symbol is shown on the Inputs tile of your Stream Analytics job. The following warning symbol remains for as long as the job is in the running state:

Azure Stream Analytics Inputs tile

Enable resource logs to view the details of the error and the message (payload) that caused it. Deserialization errors can occur for multiple reasons. For more information about specific deserialization errors, see Input data errors. If resource logs are not enabled, a brief notification is available in the Azure portal.

Input details warning notification

If the message payload is greater than 32 KB or is in binary format, run the CheckMalformedEvents.cs code available in the GitHub samples repository. This code reads the partition ID and offset, and prints the data located at that offset.

Job exceeds maximum Event Hub receivers

A best practice for using Event Hubs is to use multiple consumer groups for job scalability. The number of readers in the Stream Analytics job for a specific input affects the number of readers in a single consumer group. The precise number of receivers is based on internal implementation details of the scale-out topology logic and is not exposed externally. The number of readers can change when a job is started or during job upgrades.

The following error messages are shown when the number of receivers exceeds the maximum. The error message includes a list of existing connections made to the Event Hub under a consumer group. The tag AzureStreamAnalytics indicates that the connections are from Azure Streaming Service.

The streaming job failed: Stream Analytics job has validation errors: Job will exceed the maximum amount of Event Hub Receivers.

The following information may be helpful in identifying the connected receivers: Exceeded the maximum number of allowed receivers per partition in a consumer group which is 5. List of connected receivers - 
AzureStreamAnalytics_c4b65e4a-f572-4cfc-b4e2-cf237f43c6f0_1, 
AzureStreamAnalytics_c4b65e4a-f572-4cfc-b4e2-cf237f43c6f0_1, 
AzureStreamAnalytics_c4b65e4a-f572-4cfc-b4e2-cf237f43c6f0_1, 
AzureStreamAnalytics_c4b65e4a-f572-4cfc-b4e2-cf237f43c6f0_1, 
AzureStreamAnalytics_c4b65e4a-f572-4cfc-b4e2-cf237f43c6f0_1.
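When several jobs or tools share a consumer group, it can help to tally the receivers listed in such a message. The following is a rough sketch that assumes the receiver names follow the shape shown above (the AzureStreamAnalytics tag, a GUID, then a numeric suffix); the function name `count_receivers` is hypothetical, and the meaning of the GUID and suffix is not documented here.

```python
import re
from collections import Counter

# Assumption: receiver names look like AzureStreamAnalytics_<36-char GUID>_<number>,
# as in the sample error message. Other shapes are simply not counted.
RECEIVER_PATTERN = re.compile(r"(AzureStreamAnalytics_[0-9a-fA-F-]{36})_(\d+)")


def count_receivers(error_text: str) -> Counter:
    """Count listed receivers, grouped by the tag-plus-GUID prefix."""
    return Counter(prefix for prefix, _suffix in RECEIVER_PATTERN.findall(error_text))
```

Applied to the message above, this reports five receivers under a single prefix, which matches the stated limit of 5 receivers per partition in a consumer group.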

Note

When the number of readers changes during a job upgrade, transient warnings are written to audit logs. Stream Analytics jobs automatically recover from these transient issues.

Add a consumer group in Event Hubs

To add a new consumer group in your Event Hubs instance, follow these steps:

  1. Sign in to the Azure portal.

  2. Locate your Event Hub.

  3. Select Event Hubs under the Entities heading.

  4. Select the Event Hub by name.

  5. On the Event Hubs Instance page, under the Entities heading, select Consumer groups. A consumer group named $Default is listed.

  6. Select + Consumer Group to add a new consumer group.

    Add a consumer group in Event Hubs

  7. When you created the input in the Stream Analytics job to point to the Event Hub, you specified the consumer group there. $Default is used when none is specified. After you create the new consumer group, edit the Event Hub input in the Stream Analytics job and specify the name of the new consumer group.

Readers per partition exceeds Event Hubs limit

If your streaming query syntax references the same input Event Hub resource multiple times, the job engine can use multiple readers per query from that same consumer group. When there are too many references to the same consumer group, the job can exceed the limit of five and throw an error. In those circumstances, you can divide the load further by using multiple inputs across multiple consumer groups, as described in the following section.

Scenarios in which the number of readers per partition exceeds the Event Hubs limit of five include the following:

  • Multiple SELECT statements: If you use multiple SELECT statements that refer to the same event hub input, each SELECT statement causes a new receiver to be created.

  • UNION: When you use a UNION, it's possible to have multiple inputs that refer to the same event hub and consumer group.

  • SELF JOIN: When you use a SELF JOIN operation, it's possible to refer to the same event hub multiple times.
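To spot these scenarios in an existing query, you can count how often each name appears directly after FROM or JOIN. The following is a rough heuristic only, written in Python under stated assumptions: the function name `count_input_references` is hypothetical, the regular expression ignores comments and quoted strings, and the actual number of readers is determined internally by the service rather than by this count.

```python
import re
from collections import Counter

# Matches the identifier that directly follows FROM or JOIN, with optional
# square brackets. Subqueries such as "FROM (" are intentionally not matched.
SOURCE_PATTERN = re.compile(
    r"\b(?:FROM|JOIN)\s+\[?([A-Za-z_][\w-]*)\]?", re.IGNORECASE
)


def count_input_references(query: str) -> Counter:
    """Count how many times each name is read from in the query text."""
    return Counter(SOURCE_PATTERN.findall(query))


if __name__ == "__main__":
    query = """
    SELECT foo INTO output1 FROM inputEventHub
    SELECT bar INTO output2 FROM inputEventHub
    """
    for name, count in count_input_references(query).items():
        print(f"{name}: referenced {count} time(s)")
```

A name that is counted several times and maps to an Event Hub input is a candidate for the mitigations that follow. Names defined in a WITH clause are counted too, so compare the result against your list of actual inputs.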

The following best practices can help mitigate scenarios in which the number of readers per partition exceeds the Event Hubs limit of five.

Split your query into multiple steps by using a WITH clause

The WITH clause specifies a temporary named result set that can be referenced by a FROM clause in the query. You define the WITH clause in the execution scope of a single SELECT statement.

For example, instead of this query:

SELECT foo 
INTO output1
FROM inputEventHub

SELECT bar
INTO output2
FROM inputEventHub 
…

Use this query:

WITH data AS (
   SELECT * FROM inputEventHub
)

SELECT foo
INTO output1
FROM data

SELECT bar
INTO output2
FROM data
…

Ensure that inputs bind to different consumer groups

For queries in which three or more inputs are connected to the same Event Hubs consumer group, create separate consumer groups. This requires the creation of additional Stream Analytics inputs.

Create separate inputs with different consumer groups

You can create separate inputs with different consumer groups for the same Event Hub. The following UNION query is an example in which InputOne and InputTwo refer to the same Event Hub source. Any query can have separate inputs with different consumer groups; the UNION query is only one example.

WITH 
DataOne AS 
(
SELECT * FROM InputOne 
),

DataTwo AS 
(
SELECT * FROM InputTwo 
)

SELECT foo FROM DataOne
UNION 
SELECT foo FROM DataTwo

Readers per partition exceeds IoT Hub limit

Stream Analytics jobs use IoT Hub's built-in Event Hub-compatible endpoint to connect to and read events from IoT Hub. If your readers per partition exceed the limits of IoT Hub, you can use the solutions for Event Hub to resolve the issue. You can create a consumer group for the built-in endpoint through the IoT Hub portal endpoint session or through the IoT Hub SDK.

Get help

For further assistance, try our Microsoft Q&A question page for Azure Stream Analytics.

Next steps