Recommended configurations for Apache Kafka clients

Here are the recommended configurations for using Azure Event Hubs from Apache Kafka client applications.

Java client configuration properties

Producer and consumer configurations

| Property | Recommended values | Permitted range | Notes |
| --- | --- | --- | --- |
| metadata.max.age.ms | 180000 (approximate) | < 240000 | Can be lowered to pick up metadata changes sooner. |
| connections.max.idle.ms | 180000 | < 240000 | Azure closes inbound TCP connections that are idle for more than 240,000 ms, which can result in sending on dead connections (shown as expired batches because of send timeout). |
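
As a minimal sketch, the two shared settings above could be expressed with plain java.util.Properties, which is the type the Java Kafka clients accept for configuration. The class name EventHubsCommonConfig and the helper isBelowIdleLimit are hypothetical names introduced for illustration; only the property keys and values come from the table.

```java
import java.util.Properties;

// Sketch only: uses plain java.util.Properties so it runs without the Kafka client library.
final class EventHubsCommonConfig {
    // Azure closes inbound TCP connections idle for more than 240,000 ms (from the table above).
    static final long AZURE_IDLE_LIMIT_MS = 240_000L;

    // Builds the settings shared by producers and consumers.
    static Properties commonProperties() {
        Properties props = new Properties();
        props.setProperty("metadata.max.age.ms", "180000");
        props.setProperty("connections.max.idle.ms", "180000");
        return props;
    }

    // Hypothetical helper: true when the setting is present and strictly below the idle limit.
    static boolean isBelowIdleLimit(Properties props, String key) {
        String value = props.getProperty(key);
        return value != null && Long.parseLong(value) < AZURE_IDLE_LIMIT_MS;
    }

    public static void main(String[] args) {
        Properties props = commonProperties();
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + "=" + props.getProperty(key)
                    + " belowIdleLimit=" + isBelowIdleLimit(props, key));
        }
    }
}
```

The resulting Properties object would typically be extended with connection and security settings before being passed to a producer or consumer constructor.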

Producer configurations only

The full list of producer configs is in the Apache Kafka documentation.

| Property | Recommended values | Permitted range | Notes |
| --- | --- | --- | --- |
| max.request.size | 1000000 | < 1046528 | The service closes connections if requests larger than 1,046,528 bytes are sent. The Java client default (1,048,576 bytes) exceeds this limit, so this value must be changed; leaving the default causes issues in high-throughput produce scenarios. |
| retries | > 0 | | May require increasing the delivery.timeout.ms value; see the documentation. |
| request.timeout.ms | 30000 .. 60000 | > 20000 | Event Hubs internally defaults to a minimum of 20,000 ms. Requests with lower timeout values are accepted, but client behavior isn't guaranteed. |
| metadata.max.idle.ms | 180000 | > 5000 | Controls how long the producer caches metadata for an idle topic. If the time since a topic was last produced to exceeds the metadata idle duration, the topic's metadata is forgotten and the next access forces a metadata fetch request. |
| linger.ms | > 0 | | For high-throughput scenarios, set the linger value to the highest tolerable value to take advantage of batching. |
| delivery.timeout.ms | | | Set according to the formula (request.timeout.ms + linger.ms) * retries. |
| enable.idempotence | false | | Idempotency is currently not supported. |
| compression.type | none | | Compression is currently not supported. |
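
The producer settings above, including the delivery.timeout.ms formula, can be sketched as follows. This is an illustration under stated assumptions rather than a definitive setup: the class EventHubsProducerConfig and its methods are hypothetical, and the sample inputs in main (a 60,000 ms request timeout, 100 ms linger, and 3 retries) are arbitrary values chosen within the ranges in the table.

```java
import java.util.Properties;

// Sketch only: plain java.util.Properties, no Kafka client dependency.
final class EventHubsProducerConfig {
    // Formula from the table: (request.timeout.ms + linger.ms) * retries.
    static long deliveryTimeoutMs(long requestTimeoutMs, long lingerMs, int retries) {
        return (requestTimeoutMs + lingerMs) * retries;
    }

    // Hypothetical builder that applies the recommended values and permitted ranges.
    static Properties producerProperties(long requestTimeoutMs, long lingerMs, int retries) {
        if (requestTimeoutMs <= 20_000) {
            throw new IllegalArgumentException("request.timeout.ms must be > 20000");
        }
        if (retries <= 0) {
            throw new IllegalArgumentException("retries must be > 0");
        }
        Properties props = new Properties();
        props.setProperty("max.request.size", "1000000");
        props.setProperty("retries", String.valueOf(retries));
        props.setProperty("request.timeout.ms", String.valueOf(requestTimeoutMs));
        props.setProperty("metadata.max.idle.ms", "180000");
        props.setProperty("linger.ms", String.valueOf(lingerMs));
        props.setProperty("delivery.timeout.ms",
                String.valueOf(deliveryTimeoutMs(requestTimeoutMs, lingerMs, retries)));
        props.setProperty("enable.idempotence", "false");
        props.setProperty("compression.type", "none");
        return props;
    }

    public static void main(String[] args) {
        // Assumed sample inputs: 60 s request timeout, 100 ms linger, 3 retries.
        Properties props = producerProperties(60_000, 100, 3);
        System.out.println("delivery.timeout.ms=" + props.getProperty("delivery.timeout.ms"));
    }
}
```

With the assumed inputs, the formula gives (60000 + 100) * 3 = 180300 ms.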

Consumer configurations only

The full list of consumer configs is in the Apache Kafka documentation.

| Property | Recommended values | Permitted range | Notes |
| --- | --- | --- | --- |
| heartbeat.interval.ms | 3000 | | 3000 is the default value and shouldn't be changed. |
| session.timeout.ms | 30000 | 6000 .. 300000 | Start with 30000, and increase it if you see frequent rebalancing because of missed heartbeats. |
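
The consumer settings above can be sketched in the same way. The class EventHubsConsumerConfig is a hypothetical name, and treating the permitted range 6000 .. 300000 as inclusive at both ends is an assumption, since the table does not say.

```java
import java.util.Properties;

// Sketch only: plain java.util.Properties, no Kafka client dependency.
final class EventHubsConsumerConfig {
    // Permitted range from the table; inclusive bounds are an assumption.
    static final int MIN_SESSION_TIMEOUT_MS = 6_000;
    static final int MAX_SESSION_TIMEOUT_MS = 300_000;

    // Hypothetical builder: keeps the heartbeat at its default and validates the session timeout.
    static Properties consumerProperties(int sessionTimeoutMs) {
        if (sessionTimeoutMs < MIN_SESSION_TIMEOUT_MS || sessionTimeoutMs > MAX_SESSION_TIMEOUT_MS) {
            throw new IllegalArgumentException(
                    "session.timeout.ms must be within 6000 .. 300000, got " + sessionTimeoutMs);
        }
        Properties props = new Properties();
        props.setProperty("heartbeat.interval.ms", "3000");
        props.setProperty("session.timeout.ms", String.valueOf(sessionTimeoutMs));
        return props;
    }

    public static void main(String[] args) {
        // Start with the recommended 30000 and raise it only if rebalances are frequent.
        Properties props = consumerProperties(30_000);
        System.out.println("session.timeout.ms=" + props.getProperty("session.timeout.ms"));
    }
}
```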

librdkafka configuration properties

The main librdkafka configuration file contains extended descriptions for the properties below.

Producer and consumer configurations

| Property | Recommended values | Permitted range | Notes |
| --- | --- | --- | --- |
| socket.keepalive.enable | true | | Necessary if the connection is expected to idle. Azure closes inbound TCP connections that are idle for more than 240,000 ms. |
| metadata.max.age.ms | ~ 180000 | < 240000 | Can be lowered to pick up metadata changes sooner. |

Producer configurations only

| Property | Recommended values | Permitted range | Notes |
| --- | --- | --- | --- |
| retries | > 0 | | Default is 2. We recommend that you keep this value. |
| request.timeout.ms | 30000 .. 60000 | > 20000 | Event Hubs internally defaults to a minimum of 20,000 ms. The librdkafka default value is 5000, which can be problematic. Requests with lower timeout values are accepted, but client behavior isn't guaranteed. |
| partitioner | consistent_random | See librdkafka documentation | consistent_random is the default and the best choice. Empty and null keys are handled ideally for most cases. |
| enable.idempotence | false | | Idempotency is currently not supported. |
| compression.codec | none | | Compression is currently not supported. |

Consumer configurations only

| Property | Recommended values | Permitted range | Notes |
| --- | --- | --- | --- |
| heartbeat.interval.ms | 3000 | | 3000 is the default value and shouldn't be changed. |
| session.timeout.ms | 30000 | 6000 .. 300000 | Start with 30000, and increase it if you see frequent rebalancing because of missed heartbeats. |

Further notes

Check the following table of common configuration-related error scenarios.

| Symptoms | Problem | Solution |
| --- | --- | --- |
| Offset commit failures because of rebalancing | Your consumer waits too long between calls to poll(), and the service kicks the consumer out of the group. | You have several options: (1) increase the session timeout; (2) decrease the message batch size to speed up processing; (3) improve processing parallelization to avoid blocking consumer.poll(). Applying some combination of the three is likely wisest. |
| Network exceptions at high produce throughput | Are you using the Java client with the default max.request.size? Your requests may be too large. | See the Java configs above. |

Next steps

For quotas and limits of all Azure services, see Azure subscription and service limits, quotas, and constraints.