Use Kafka MirrorMaker with Event Hubs for Apache Kafka
This tutorial shows how to mirror a Kafka broker in an event hub using Kafka MirrorMaker.
Note
This article contains references to the term whitelist, a term that Microsoft no longer uses. When the term is removed from the software, we'll remove it from this article.
In this tutorial, you learn how to:
- Create an Event Hubs namespace
- Clone the example project
- Set up a Kafka cluster
- Configure Kafka MirrorMaker
- Run Kafka MirrorMaker
Introduction
One major consideration for modern cloud-scale apps is the ability to update, improve, and change infrastructure without interrupting service. This tutorial shows how an event hub and Kafka MirrorMaker can integrate an existing Kafka pipeline into Azure by "mirroring" the Kafka input stream in the Event Hubs service.
An Azure Event Hubs Kafka endpoint enables you to connect to Azure Event Hubs using the Kafka protocol (that is, Kafka clients). By making minimal changes to a Kafka application, you can connect to Azure Event Hubs and enjoy the benefits of the Azure ecosystem. Event Hubs currently supports Kafka versions 1.0 and later.
Prerequisites
Before you start this tutorial, complete the following prerequisites:
- Read through the Event Hubs for Apache Kafka article.
- Get an Azure subscription. If you do not have one, create a Trial Subscription before you begin.
- Install the Java Development Kit (JDK) 1.7+.
  - On Ubuntu, run apt-get install default-jdk to install the JDK.
  - Be sure to set the JAVA_HOME environment variable to point to the folder where the JDK is installed.
- Download and install a Maven binary archive.
  - On Ubuntu, you can run apt-get install maven to install Maven.
- Install Git.
  - On Ubuntu, you can run sudo apt-get install git to install Git.
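As a quick sanity check before continuing, the tool prerequisites above can be verified from a shell. The following is a minimal sketch and not part of the sample repository: the helper name missing_tools is hypothetical, and it assumes the tools are installed under the command names java, mvn, and git.

```shell
# Hypothetical helper: print the name of every command in the argument list
# that cannot be found on PATH (prints nothing if all are present).
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
  return 0
}

# Report any missing prerequisites for this tutorial.
missing_tools java mvn git

# JAVA_HOME should point to the folder where the JDK is installed.
if [ -z "${JAVA_HOME:-}" ]; then
  echo "JAVA_HOME is not set"
fi
```

If the script prints nothing, all three tools were found and JAVA_HOME is set.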
Create an Event Hubs namespace
An Event Hubs namespace is required to send to and receive from any Event Hubs service. See Creating an event hub for instructions to create a namespace and an event hub. Make sure to copy the Event Hubs connection string for later use.
Clone the example project
Now that you have an Event Hubs connection string, clone the Azure Event Hubs for Kafka repository and navigate to the mirror-maker subfolder:
git clone https://github.com/Azure/azure-event-hubs-for-kafka.git
cd azure-event-hubs-for-kafka/tutorials/mirror-maker
Set up a Kafka cluster
Use the Kafka quickstart guide to set up a cluster with the desired settings (or use an existing Kafka cluster).
Configure Kafka MirrorMaker
Kafka MirrorMaker enables the "mirroring" of a stream. Given source and destination Kafka clusters, MirrorMaker ensures any messages sent to the source cluster are received by both the source and destination clusters. This example shows how to mirror a source Kafka cluster with a destination event hub. This scenario can be used to send data from an existing Kafka pipeline to Event Hubs without interrupting the flow of data.
For more detailed information on Kafka MirrorMaker, see the Kafka Mirroring/MirrorMaker guide.
To configure Kafka MirrorMaker, give it a Kafka cluster as its consumer/source and an event hub as its producer/destination.
Consumer configuration
Update the consumer configuration file source-kafka.config, which tells MirrorMaker the properties of the source Kafka cluster.
source-kafka.config
bootstrap.servers={SOURCE.KAFKA.IP.ADDRESS1}:{SOURCE.KAFKA.PORT1},{SOURCE.KAFKA.IP.ADDRESS2}:{SOURCE.KAFKA.PORT2},etc
group.id=example-mirrormaker-group
exclude.internal.topics=true
client.id=mirror_maker_consumer
Producer configuration
Now update the producer configuration file mirror-eventhub.config, which tells MirrorMaker to send the duplicated (or "mirrored") data to the Event Hubs service. Specifically, change bootstrap.servers and sasl.jaas.config to point to your Event Hubs Kafka endpoint. The Event Hubs service requires secure (SASL) communication, which is achieved by setting the last three properties in the following configuration:
mirror-eventhub.config
bootstrap.servers={YOUR.EVENTHUBS.FQDN}:9093
client.id=mirror_maker_producer
#Required for Event Hubs
sasl.mechanism=PLAIN
security.protocol=SASL_SSL
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="{YOUR.EVENTHUBS.CONNECTION.STRING}";
Important
Replace {YOUR.EVENTHUBS.CONNECTION.STRING} with the connection string for your Event Hubs namespace. For instructions on getting the connection string, see Get an Event Hubs connection string. Here's an example configuration:
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://mynamespace.servicebus.chinacloudapi.cn/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=XXXXXXXXXXXXXXXX";
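Because the host name in the connection string is the same namespace host used in bootstrap.servers, the producer configuration can be generated from the connection string alone. The following is a minimal sketch under stated assumptions: eventhubs_bootstrap and render_producer_config are hypothetical helper names, the connection string is assumed to start with Endpoint=sb://<host>/ as in the example above, and the key shown is a placeholder.

```shell
# Hypothetical helper: derive the Kafka endpoint (host:9093) from a namespace
# connection string of the form "Endpoint=sb://<host>/;...".
eventhubs_bootstrap() {
  printf '%s\n' "$1" | sed -n 's|^Endpoint=sb://\([^/;]*\).*|\1:9093|p'
}

# Hypothetical helper: print a producer configuration for the given
# connection string. The user name is the literal text $ConnectionString,
# so it is passed in single quotes to keep the shell from expanding it.
render_producer_config() {
  printf 'bootstrap.servers=%s\n' "$(eventhubs_bootstrap "$1")"
  printf 'client.id=mirror_maker_producer\n'
  printf 'sasl.mechanism=PLAIN\n'
  printf 'security.protocol=SASL_SSL\n'
  printf 'sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="%s" password="%s";\n' '$ConnectionString' "$1"
}

# Example usage with a placeholder key (not a real credential).
render_producer_config 'Endpoint=sb://mynamespace.servicebus.chinacloudapi.cn/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=XXXXXXXXXXXXXXXX'
```

Redirecting the output to mirror-eventhub.config would produce a file with the same properties as the configuration shown above. Treat the resulting file as a secret, since it contains the shared access key.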
Run Kafka MirrorMaker
Run the Kafka MirrorMaker script from the root Kafka directory using the newly updated configuration files. Make sure to either copy the config files to the root Kafka directory, or update their paths in the following command.
bin/kafka-mirror-maker.sh --consumer.config source-kafka.config --num.streams 1 --producer.config mirror-eventhub.config --whitelist=".*"
To verify that events are reaching the event hub, see the ingress statistics in the Azure portal, or run a consumer against the event hub.
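One way to run a consumer against the event hub is the console consumer that ships with Kafka. The sketch below only prints the command so that it can be reviewed before it is run. It rests on several assumptions: verify_command is a hypothetical helper, my-topic is a placeholder for a topic that exists on the source cluster, and mirror-eventhub.config is reused as the client configuration because it already carries the SASL settings required by Event Hubs.

```shell
# Hypothetical helper: print the console-consumer command for one mirrored topic.
# Arguments: <event-hubs-fqdn> <topic> [client-config-file]
verify_command() {
  printf 'bin/kafka-console-consumer.sh --bootstrap-server %s:9093 --topic %s --consumer.config %s --from-beginning\n' \
    "$1" "$2" "${3:-mirror-eventhub.config}"
}

# Example usage: my-topic is a placeholder topic name.
verify_command mynamespace.servicebus.chinacloudapi.cn my-topic
```

Run the printed command from the root Kafka directory while MirrorMaker is running. Messages produced to the source topic should then appear in the consumer output.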
With MirrorMaker running, any events sent to the source Kafka cluster are received by both the Kafka cluster and the mirrored event hub. By using MirrorMaker and an Event Hubs Kafka endpoint, you can migrate an existing Kafka pipeline to the managed Azure Event Hubs service without changing the existing cluster or interrupting any ongoing data flow.
Samples
See the following samples on GitHub:
- Sample code for this tutorial on GitHub
- Azure Event Hubs Kafka MirrorMaker running on an Azure Container Instance
Next steps
To learn more about Event Hubs for Kafka, see the following articles:
- Connect Apache Spark to an event hub
- Connect Apache Flink to an event hub
- Integrate Kafka Connect with an event hub
- Explore samples on our GitHub
- Connect Akka Streams to an event hub
- Apache Kafka developer guide for Azure Event Hubs