Event Hubs output from Azure Stream Analytics

The Azure Event Hubs service is a highly scalable publish-subscribe event ingestor that can collect millions of events per second. One common use of an event hub as output is chaining jobs: the output of one Stream Analytics job becomes the input of another streaming job. For information about the maximum message size and batch size optimization, see the Output batch size section.

Output configuration

The following table lists the parameters needed to configure an event hub as an output.

| Property name | Description |
| --- | --- |
| Output alias | A friendly name used in queries to direct the query output to this event hub (see the example query after this table). |
| Event hub namespace | A container for a set of messaging entities. When you created a new event hub, you also created an event hub namespace. |
| Event hub name | The name of your event hub output. |
| Event hub policy name | The shared access policy, which you can create on the event hub's Configure tab. Each shared access policy has a name, permissions that you set, and access keys. |
| Event hub policy key | The shared access key that's used to authenticate access to the event hub namespace. |
| Partition key column | Optional. A column that contains the partition key for event hub output. |
| Event serialization format | The serialization format for output data. JSON, CSV, and Avro are supported. |
| Encoding | For CSV and JSON, UTF-8 is the only supported encoding format at this time. |
| Delimiter | Applicable only for CSV serialization. Stream Analytics supports a number of common delimiters for serializing data in CSV format. Supported values are comma, semicolon, space, tab, and vertical bar. |
| Format | Applicable only for JSON serialization. Line separated specifies that the output is formatted by having each JSON object separated by a new line; the JSON is read one object at a time, and the whole content by itself is not valid JSON (see the serialization sketch after this table). Array specifies that the output is formatted as an array of JSON objects. |
| Property columns | Optional. Comma-separated columns that need to be attached as user properties of the outgoing message instead of the payload. For more information, see the section Custom metadata properties for output. |
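
The output alias is what a query's INTO clause refers to. The following minimal sketch assumes an output alias named eventHubOutput and an input alias named streamInput; the aliases and the selected columns are illustrative:

    -- Route query results to the event hub output alias (hypothetical names)
    SELECT
        DeviceId,
        Temperature
    INTO
        eventHubOutput
    FROM
        streamInput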
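
To make the Format option concrete, here is how two hypothetical events would be serialized each way. Line separated emits one JSON object per line, so the output as a whole is not a valid JSON document:

    {"DeviceId":"device-001","Temperature":70.2}
    {"DeviceId":"device-002","Temperature":69.8}

Array wraps the same objects in a single valid JSON array:

    [{"DeviceId":"device-001","Temperature":70.2},{"DeviceId":"device-002","Temperature":69.8}]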

Partitioning

Partitioning varies depending on partition alignment. When the partition key for event hub output is aligned with the upstream (previous) query step, the number of writers is the same as the number of partitions in the event hub output, and each writer uses the EventHubSender class to send events to its specific partition. When the partition key for event hub output is not aligned with the upstream (previous) query step, the number of writers is the same as the number of partitions in that prior step, and each writer uses the SendBatchAsync method of EventHubClient to send events to all the output partitions.
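
As a minimal sketch of an aligned topology, assuming the output's Partition key column is set to a hypothetical PartitionId column and reusing the illustrative aliases from above, partitioning the query step by the same column keeps one writer per output partition:

    -- Partition the query step on the same column as the output's partition key
    SELECT
        *
    INTO
        eventHubOutput
    FROM
        streamInput
    PARTITION BY PartitionId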

Output batch size

The maximum message size is 256 KB or 1 MB per message, depending on the Event Hubs tier. For more information, see Event Hubs limits. When input/output partitioning isn't aligned, each event is packed individually into an EventData instance and sent in a batch of up to the maximum message size. This also happens if custom metadata properties are used. When input/output partitioning is aligned, multiple events are packed into a single EventData instance, up to the maximum message size, and then sent.
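
As a rough illustration only: with aligned partitioning and events that serialize to about 1 KB each, a single EventData instance could carry on the order of 250 events before reaching a 256-KB limit, whereas with unaligned partitioning each of those events would occupy its own EventData instance within the batch.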

Custom metadata properties for output

You can attach query columns as user properties to your outgoing messages. These columns don't go into the payload. The properties are present as a dictionary on the output message: each key is a column name and each value is that column's value. All Stream Analytics data types are supported except Record and Array.

In the following example, the fields DeviceId and DeviceStatus are added to the metadata.

  1. Use the following query:

    SELECT *, DeviceId, DeviceStatus FROM iotHubInput
    
  2. Configure DeviceId,DeviceStatus as property columns in the output.

    [Screenshot: Property columns configured in the output settings]

The following image shows the expected output message properties, inspected in the event hub by using Service Bus Explorer.

[Screenshot: Output message user properties shown in Service Bus Explorer]
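
As an illustration only (hypothetical values, not actual Service Bus Explorer output), the user properties attached to each outgoing message would take roughly this form:

    DeviceId: "device-42"
    DeviceStatus: "online"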

Next steps