Performance optimization for Apache Kafka HDInsight clusters

This article gives suggestions for optimizing the performance of your Apache Kafka workloads in HDInsight. The focus is on adjusting producer and broker configuration. There are different ways of measuring performance, and the optimizations that you apply will depend on your business needs.

Architecture overview

Kafka topics are used to organize records. Records are produced by producers and consumed by consumers. Producers send records to Kafka brokers, which then store the data. Each worker node in your HDInsight cluster is a Kafka broker.

Topics partition records across brokers. When consuming records, you can use up to one consumer per partition to achieve parallel processing of the data.

Replication is used to duplicate partitions across nodes, which protects against node (broker) outages. A single partition among the group of replicas is designated as the partition leader. Producer traffic is routed to the leader of each partition, using the state managed by ZooKeeper.

Identify your scenario

Apache Kafka performance has two main aspects: throughput and latency. Throughput is the maximum rate at which data can be processed; higher throughput is usually better. Latency is the time it takes for data to be stored or retrieved; lower latency is usually better. Finding the right balance between throughput, latency, and the cost of the application's infrastructure can be challenging. Your performance requirements will likely match one of the following three common situations, based on whether you require high throughput, low latency, or both:

  • High throughput, low latency. This scenario requires both high throughput and low latency (~100 milliseconds). An example of this type of application is service availability monitoring.
  • High throughput, high latency. This scenario requires high throughput (~1.5 GBps) but can tolerate higher latency (< 250 ms). An example of this type of application is telemetry data ingestion for near real-time processes like security and intrusion detection applications.
  • Low throughput, low latency. This scenario requires low latency (< 10 ms) for real-time processing, but can tolerate lower throughput. An example of this type of application is online spelling and grammar checks.

Producer configurations

The following sections highlight some of the most important configuration properties for optimizing the performance of your Kafka producers. For a detailed explanation of all configuration properties, see the Apache Kafka documentation on producer configurations.

Batch size

Apache Kafka producers assemble groups of messages (called batches) that are sent as a unit to be stored in a single storage partition. Batch size is the number of bytes that must be present before that group is transmitted. Increasing the batch.size parameter can increase throughput, because it reduces the processing overhead from network and IO requests. Under light load, an increased batch size may increase Kafka send latency, as the producer waits for a batch to be ready. Under heavy load, it's recommended to increase the batch size to improve both throughput and latency.
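As a sketch of the trade-off above, the two config dicts below contrast throughput-tuned and latency-tuned batching. The keys mirror standard Kafka producer properties (batch.size, linger.ms); the dict names, the helper function, and the specific byte values are illustrative assumptions, not values from this article.

```python
# Producer batching settings, expressed as plain config dicts.
# How these are passed depends on your client library (for example,
# the third-party kafka-python client uses batch_size / linger_ms
# keyword arguments instead of the raw property names).
throughput_tuned = {
    "batch.size": 128 * 1024,  # 128 KiB batches: fewer, larger requests
    "linger.ms": 10,           # wait up to 10 ms for a batch to fill
}

latency_tuned = {
    "batch.size": 16 * 1024,   # Kafka's default of 16 KiB
    "linger.ms": 0,            # send as soon as a record is available
}

def max_records_per_batch(batch_size_bytes: int, avg_record_bytes: int) -> int:
    """Rough upper bound on how many records fit into one batch."""
    return batch_size_bytes // avg_record_bytes
```

With 512-byte records, the throughput-tuned batch holds roughly 256 records per request, which is where the per-request overhead savings come from.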

Producer required acknowledgements

The producer required acks configuration determines the number of acknowledgments required by the partition leader before a write request is considered completed. This setting affects data reliability and takes the values 0, 1, or -1. A value of -1 means that an acknowledgement must be received from all replicas before the write is considered complete. Setting acks = -1 provides stronger guarantees against data loss, but it also results in higher latency and lower throughput. If your application requirements demand higher throughput, try setting acks = 0 or acks = 1. Keep in mind that not acknowledging all replicas can reduce data reliability.
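The acks values above can be summarized as a small decision table. The "0", "1", and "-1" values are the standard Kafka settings; the guarantee names and the helper function are our own illustrative labels for this sketch.

```python
# Illustrative mapping from delivery guarantee to the Kafka acks value.
ACKS_BY_GUARANTEE = {
    "fire_and_forget": "0",  # no broker acknowledgement; fastest, may lose data
    "leader_only": "1",      # leader has written the record locally
    "all_replicas": "-1",    # all in-sync replicas have the record; safest
}

def pick_acks(require_no_data_loss: bool, latency_sensitive: bool) -> str:
    """Toy decision helper reflecting the trade-off described above."""
    if require_no_data_loss:
        return ACKS_BY_GUARANTEE["all_replicas"]
    if latency_sensitive:
        return ACKS_BY_GUARANTEE["fire_and_forget"]
    return ACKS_BY_GUARANTEE["leader_only"]
```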


Compression

A Kafka producer can be configured to compress messages before sending them to brokers. The compression.type setting specifies the compression codec to be used. Supported compression codecs are "gzip", "snappy", and "lz4". Compression is beneficial and should be considered if there's a limitation on disk capacity.

Among the two commonly used compression codecs, gzip and snappy, gzip has a higher compression ratio, which results in lower disk usage at the cost of higher CPU load. The snappy codec provides less compression with less CPU overhead. You can decide which codec to use based on broker disk or producer CPU limitations. gzip can compress data at a rate five times higher than snappy.

Using data compression increases the number of records that can be stored on a disk. It may also increase CPU overhead in cases where there's a mismatch between the compression formats used by the producer and the broker, as the data must be compressed before sending and then decompressed before processing.
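To get a feel for the disk savings, the sketch below compresses a batch of repetitive telemetry-style records with gzip from the Python standard library (snappy isn't in the stdlib, so only gzip is shown; the record contents are made up for illustration). The same size comparison applies when evaluating any compression.type.

```python
import gzip

# A batch of repetitive JSON-like records, as telemetry often is.
records = b'{"sensor": "temp-01", "value": 21.5}\n' * 1000

compressed = gzip.compress(records)

# Repetitive data compresses extremely well; real-world ratios
# depend heavily on how redundant your payloads are.
ratio = len(records) / len(compressed)
```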

Broker settings

The following sections highlight some of the most important settings for optimizing the performance of your Kafka brokers. For a detailed explanation of all broker settings, see the Apache Kafka documentation on broker configurations.

Number of disks

Storage disks have limited IOPS (input/output operations per second) and read/write bytes per second. When creating new partitions, Kafka stores each new partition on the disk with the fewest existing partitions, to balance partitions across the available disks. Despite this strategy, when processing hundreds of partition replicas on each disk, Kafka can easily saturate the available disk throughput. The tradeoff here is between throughput and cost. If your application requires greater throughput, create a cluster with more managed disks per broker. HDInsight doesn't currently support adding managed disks to a running cluster. For more information on how to configure the number of managed disks, see Configure storage and scalability for Apache Kafka on HDInsight. Understand the cost implications of increasing storage space for the nodes in your cluster.
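A back-of-the-envelope calculation can help size the disk count before creating the cluster. The per-disk throughput figure and the function below are illustrative assumptions for this sketch, not HDInsight guarantees; replication multiplies the write volume because every replica is written to disk.

```python
import math

def disks_needed(target_write_mb_s: float,
                 per_disk_mb_s: float,
                 replication_factor: int) -> int:
    """Cluster-wide managed disks needed to absorb a target write rate.

    Every record is written replication_factor times across the cluster.
    Divide the result by the broker count to estimate disks per broker.
    """
    total_mb_s = target_write_mb_s * replication_factor
    return math.ceil(total_mb_s / per_disk_mb_s)
```

For example, 300 MB/s of producer traffic with 3x replication and disks assumed to sustain 100 MB/s each works out to nine disks across the cluster.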

Number of topics and partitions

Kafka producers write to topics, and Kafka consumers read from topics. A topic is associated with a log, which is a data structure on disk. Kafka appends records from producers to the end of a topic log. A topic log consists of many partitions that are spread over multiple files. These files are, in turn, spread across multiple Kafka cluster nodes. Consumers read from Kafka topics at their own cadence and can pick their position (offset) in the topic log.

Each Kafka partition is a log file on the system, and producer threads can write to multiple logs simultaneously. Similarly, because each consumer thread reads messages from one partition, consuming from multiple partitions is handled in parallel as well.

Increasing the partition density (the number of partitions per broker) adds overhead related to metadata operations and per-partition request/response traffic between the partition leader and its followers. Even in the absence of data flowing through, partition replicas still fetch data from leaders, which results in extra processing for send and receive requests over the network.

For Apache Kafka clusters 1.1 and above in HDInsight, we recommend a maximum of 1,000 partitions per broker, including replicas. Increasing the number of partitions per broker decreases throughput and may also cause topic unavailability. For more information on Kafka partition support, see the official Apache Kafka blog post on the increase in the number of supported partitions in version 1.1.0. For details on modifying topics, see Apache Kafka: modifying topics.
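A planned topic layout can be sanity-checked against this per-broker limit before creating topics. The helper below assumes replicas are spread evenly across brokers, which is a simplification; function names and the even-spread assumption are ours.

```python
def replicas_per_broker(num_partitions: int,
                        replication_factor: int,
                        num_brokers: int) -> float:
    """Average partition replicas hosted per broker, assuming even spread."""
    return num_partitions * replication_factor / num_brokers

def within_partition_limit(num_partitions: int,
                           replication_factor: int,
                           num_brokers: int,
                           limit: int = 1000) -> bool:
    """True if the layout stays under the recommended per-broker limit."""
    return replicas_per_broker(num_partitions, replication_factor,
                               num_brokers) <= limit
```

For example, 900 partitions with 3x replication on 3 brokers puts 900 replicas on each broker, just inside the limit; 1,200 partitions on the same cluster would exceed it.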

Number of replicas

A higher replication factor results in additional requests between the partition leader and followers. Consequently, a higher replication factor consumes more disk and CPU to handle the additional requests, increasing write latency and decreasing throughput.

We recommend that you use at least 3x replication for Kafka in Azure HDInsight. Most Azure regions have three fault domains; in regions with only two fault domains, use 4x replication instead.
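The guidance above boils down to one conditional, sketched here for completeness (the function name is ours; the 3x/4x values come from the recommendation above):

```python
def recommended_replication(fault_domains: int) -> int:
    """Replication factor per the guidance above: 4x with two fault
    domains, 3x otherwise."""
    return 4 if fault_domains == 2 else 3
```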

For more information on replication, see Apache Kafka: replication and Apache Kafka: increasing replication factor.

Next steps