Azure HDInsight business continuity architectures

This article gives a few examples of business continuity architectures you might consider for Azure HDInsight. Tolerance for reduced functionality during a disaster is a business decision that varies from one application to the next. It might be acceptable for some applications to be unavailable, or to be partially available with reduced functionality or delayed processing, for a period. For other applications, any reduced functionality could be unacceptable.


The architectures presented in this article are in no way exhaustive. You should design your own unique architectures once you've made objective determinations around expected business continuity, operational complexity, and cost of ownership.

Apache Hive and Interactive Query

Hive Replication V2 is recommended for business continuity in HDInsight Hive and Interactive Query clusters. The persistent sections of a standalone Hive cluster that need to be replicated are the storage layer and the Hive metastore. Hive clusters in a multi-user scenario with the Enterprise Security Package need Azure Active Directory Domain Services and the Ranger metastore.

Hive and Interactive Query architecture

Hive event-based replication is configured between the primary and secondary clusters. This consists of two distinct phases, bootstrapping and incremental runs:

  • Bootstrapping replicates the entire Hive warehouse, including the Hive metastore information, from the primary to the secondary.

  • Incremental runs are automated on the primary cluster, and the events generated during the incremental runs are played back on the secondary cluster. The secondary cluster catches up with the events generated from the primary cluster, ensuring that the secondary cluster is consistent with the primary cluster's events after the replication run.

The secondary cluster is needed only at the time of replication, to run distributed copy (DistCp), but the storage and metastores need to be persistent. You could choose to spin up a scripted secondary cluster on demand before replication, run the replication script on it, and then tear it down after successful replication.
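The two-phase model can be illustrated with a small simulation. This is a hedged sketch only: the function names, event shapes, and warehouse representation below are made up for illustration and are not the actual Hive REPL DUMP/LOAD protocol.

```python
# Simplified simulation of Hive event-based replication: a full bootstrap
# copy followed by an ordered replay of change events on the secondary.

def bootstrap(primary_warehouse):
    """Phase 1: copy the entire warehouse (tables plus metastore rows)."""
    return {table: rows.copy() for table, rows in primary_warehouse.items()}

def replay_incremental(secondary_warehouse, events):
    """Phase 2: replay change events captured on the primary, in order."""
    for op, table, row in events:
        if op == "insert":
            secondary_warehouse.setdefault(table, []).append(row)
        elif op == "drop":
            secondary_warehouse.pop(table, None)
    return secondary_warehouse

primary = {"sales": [("2024-01-01", 100)]}
secondary = bootstrap(primary)                       # bootstrap run
events = [("insert", "sales", ("2024-01-02", 250))]  # captured on primary
replay_incremental(secondary, events)                # incremental run
```

After the incremental run, the secondary holds the bootstrap copy plus every replayed event, which is the consistency guarantee described above.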

The secondary cluster is usually read-only. You can make the secondary cluster read-write, but this adds complexity that involves replicating the changes from the secondary cluster back to the primary cluster.

Hive event-based replication RPO & RTO

  • RPO: Data loss is limited to the last successful incremental replication event from the primary to the secondary.

  • RTO: The time between the failure and the resumption of upstream and downstream transactions with the secondary.
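These two definitions can be expressed as simple time arithmetic. The timestamps below are hypothetical; in practice they would come from replication run logs and incident records.

```python
# Illustrative RPO/RTO bookkeeping for an event-based replication schedule.
from datetime import datetime, timedelta

last_successful_replication = datetime(2024, 1, 1, 10, 0)
failure_time = datetime(2024, 1, 1, 11, 30)
secondary_resumed_at = datetime(2024, 1, 1, 12, 0)

# RPO: data written after the last successful replication run is lost.
rpo = failure_time - last_successful_replication

# RTO: time from the failure until transactions resume on the secondary.
rto = secondary_resumed_at - failure_time

print(rpo, rto)  # 1:30:00 0:30:00
```

Shortening the replication interval improves RPO at the cost of more frequent on-demand cluster deployments; RTO is dominated by how quickly the secondary can be brought into service.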

Apache Hive and Interactive Query architectures

Hive active primary with on-demand secondary

In an active primary with on-demand secondary architecture, applications write to the active primary region, while no cluster is provisioned in the secondary region during normal operations. The SQL metastore and storage in the secondary region are persistent, while the HDInsight cluster is scripted and deployed on demand only before the scheduled Hive replication runs.


Hive active primary with standby secondary

In an active primary with standby secondary architecture, applications write to the active primary region, while a standby, scaled-down secondary cluster runs in read-only mode during normal operations. During normal operations, you could choose to offload region-specific read operations to the secondary.


Apache Spark

Spark workloads may or may not involve a Hive component. To enable Spark SQL workloads to read and write data from Hive, HDInsight Spark clusters share Hive custom metastores from Hive/Interactive Query clusters in the same region. In such scenarios, cross-region replication of Spark workloads must also accompany the replication of Hive metastores and storage. The failover scenarios in this section apply to both cases.

For scenarios where Spark works in standalone mode, curated data and stored Spark JARs (for Livy jobs) need to be replicated from the primary region to the secondary region on a regular basis using Azure Data Factory's DistCp.

We recommend that you use version control systems to store Spark notebooks and libraries, where they can easily be deployed on the primary or secondary cluster. Ensure that notebook-based and non-notebook-based solutions are prepared to load the correct data mounts in the primary or secondary workspace.
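One lightweight way to keep notebooks region-agnostic is to resolve storage paths from a region key rather than hardcoding them. The sketch below assumes hypothetical storage account names and paths; only the pattern, not the specific values, is the point.

```python
# Hedged sketch: resolve the curated-data mount for whichever region's
# cluster a notebook is running in. Account names and paths are made up.

DATA_MOUNTS = {
    "primary":   "abfs://data@primarystore.dfs.core.windows.net/curated",
    "secondary": "abfs://data@secondarystore.dfs.core.windows.net/curated",
}

def resolve_mount(region: str) -> str:
    """Return the curated-data path for the given region."""
    try:
        return DATA_MOUNTS[region]
    except KeyError:
        raise ValueError(f"unknown region: {region!r}")

path = resolve_mount("secondary")
```

A notebook that reads `region` from cluster configuration and calls `resolve_mount` works unchanged after a failover, which is the readiness this paragraph calls for.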

If there are customer-specific libraries that go beyond what HDInsight provides natively, they must be tracked and periodically loaded into the standby secondary cluster.

Apache Spark replication RPO & RTO

  • RPO: Data loss is limited to the last successful incremental replication (Spark and Hive) from the primary to the secondary.

  • RTO: The time between the failure and the resumption of upstream and downstream transactions with the secondary.

Apache Spark architectures

Spark active primary with on-demand secondary

Applications read and write to the Spark and Hive clusters in the primary region, while no clusters are provisioned in the secondary region during normal operations. The SQL metastore, Hive storage, and Spark storage are persistent in the secondary region. The Spark and Hive clusters are scripted and deployed on demand. Hive replication is used to replicate the Hive storage and Hive metastores, while Azure Data Factory's DistCp can be used to copy the standalone Spark storage. The Hive clusters need to be deployed before every Hive replication run because the replication depends on the cluster's DistCp compute.

Apache Spark active primary with on-demand secondary architecture

Spark active primary with standby secondary

Applications read and write to the Spark and Hive clusters in the primary region, while standby, scaled-down Hive and Spark clusters run in read-only mode in the secondary region during normal operations. During normal operations, you could choose to offload region-specific Hive and Spark read operations to the secondary.

Apache Spark active primary with standby secondary

Apache HBase

HBase Export and HBase Replication are common ways of enabling business continuity between HDInsight HBase clusters.

HBase Export is a batch replication process that uses the HBase Export utility to export tables from the primary HBase cluster to its underlying Azure Data Lake Storage Gen2 storage. The exported data can then be accessed from the secondary HBase cluster and imported into tables, which must already exist in the secondary. While HBase Export does offer table-level granularity, in incremental update situations, the export automation engine controls the range of incremental rows to include in each run. For more information, see HDInsight HBase Backup and Replication.
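An automation engine like the one described typically tracks the timestamp of the last successful export and uses it to bound the next run. The sketch below builds an HBase Export command line from that state; the table name and storage path are illustrative, and the exact argument handling should be checked against your HBase version's Export utility.

```python
# Hedged sketch: construct the next incremental HBase Export invocation
# from the last successful export time. The HBase Export utility takes
# optional [versions [starttime [endtime]]] arguments (epoch millis).

def build_export_cmd(table, out_dir, last_export_ms, now_ms, versions=1):
    """Build an export command covering rows edited since the last run."""
    return (
        "hbase org.apache.hadoop.hbase.mapreduce.Export "
        f"{table} {out_dir} {versions} {last_export_ms} {now_ms}"
    )

cmd = build_export_cmd(
    "Contacts",                                          # hypothetical table
    "abfs://backup@store.dfs.core.windows.net/contacts", # hypothetical path
    last_export_ms=1704067200000,
    now_ms=1704153600000,
)
```

After a successful run, the engine persists `now_ms` as the new `last_export_ms`, so each run covers exactly the rows edited since the previous one.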

HBase Replication uses near real-time replication between HBase clusters in a fully automated manner. Replication is done at the table level. Either all tables or specific tables can be targeted for replication. HBase replication is eventually consistent, meaning that recent edits to a table in the primary region may not be available to all the secondaries immediately. Secondaries are guaranteed to eventually become consistent with the primary. HBase replication can be set up between two or more HDInsight HBase clusters if:

  • The primary and secondary are in the same virtual network.
  • The primary and secondary are in different peered VNets in the same region.
  • The primary and secondary are in different peered VNets in different regions.

For more information, see Set up Apache HBase cluster replication in Azure virtual networks.

There are a few other ways of performing backups of HBase clusters, such as copying the hbase folder, using CopyTable, and using snapshots.


HBase Export

  • RPO: Data loss is limited to the last successful batch incremental import by the secondary from the primary.
  • RTO: The time between the failure of the primary and the resumption of I/O operations on the secondary.

HBase Replication

  • RPO: Data loss is limited to the last WALEdit shipment received at the secondary.
  • RTO: The time between the failure of the primary and the resumption of I/O operations on the secondary.

HBase architectures

HBase replication can be set up in three modes: Leader-Follower, Leader-Leader, and Cyclic.

HBase Replication: Leader-Follower model

In this cross-region setup, replication is unidirectional from the primary region to the secondary region. Either all tables or specific tables on the primary can be identified for unidirectional replication. During normal operations, the secondary cluster can be used to serve read requests in its own region.

The secondary cluster operates as a normal HBase cluster that can host its own tables and can serve reads and writes from regional applications. However, writes on the replicated tables or on tables native to the secondary aren't replicated back to the primary.


HBase Replication: Leader-Leader model

This cross-region setup is very similar to the unidirectional setup, except that replication happens bidirectionally between the primary region and the secondary region. Applications can use both clusters in read-write mode, and updates are exchanged asynchronously between them.


HBase Replication: Multi-Region or Cyclic

The Multi-Region/Cyclic replication model is an extension of HBase Replication, and it can be used to create a globally redundant HBase architecture with multiple applications that read and write to region-specific HBase clusters. The clusters can be set up in various combinations of Leader/Leader or Leader/Follower, depending on business requirements.

HBase cyclic model

Apache Kafka

To enable cross-region availability, HDInsight 4.0 supports Kafka MirrorMaker, which can be used to maintain a secondary replica of the primary Kafka cluster in a different region. MirrorMaker acts as a high-level consumer-producer pair: it consumes from a specific topic in the primary cluster and produces to a topic with the same name in the secondary. Cross-cluster replication for high availability disaster recovery using MirrorMaker comes with the assumption that producers and consumers need to fail over to the replica cluster. For more information, see Use MirrorMaker to replicate Apache Kafka topics with Kafka on HDInsight.

Depending on the topic lifetime when replication started, MirrorMaker topic replication can lead to different offsets between the source and replica topics. HDInsight Kafka clusters also support topic partition replication, which is a high availability feature at the individual cluster level.
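The offset divergence matters at failover time: a consumer's committed offsets on the source cannot be reused on the replica. A common workaround is to translate the resume point by timestamp, similar to what Kafka's `offsetsForTimes` API enables. The toy logs below are illustrative, not real Kafka data structures.

```python
# Sketch of offset divergence: mirroring started after the source topic
# already held (or had expired) some messages, so the replica renumbers
# the same messages from offset 0. Resume by timestamp, not by offset.

source_log  = {5: ("m5", 1000), 6: ("m6", 2000), 7: ("m7", 3000)}  # offset -> (msg, ts)
replica_log = {0: ("m5", 1000), 1: ("m6", 2000), 2: ("m7", 3000)}  # renumbered copy

def offset_for_timestamp(log, ts):
    """Smallest offset whose timestamp is >= ts (offsetsForTimes-style)."""
    return min((o for o, (_, t) in log.items() if t >= ts), default=None)

# The consumer last processed source offset 7; find the replica equivalent.
committed_ts = source_log[7][1]
resume_offset = offset_for_timestamp(replica_log, committed_ts)
# resume_offset is 2, not 7 -- offsets aren't comparable across clusters.
```

Failover runbooks for MirrorMaker-based replication should therefore record timestamps (or use an offset-translation mechanism) rather than raw committed offsets.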

Apache Kafka replication

Apache Kafka architectures

Kafka Replication: Active-Passive

An Active-Passive setup enables asynchronous unidirectional mirroring from the active cluster to the passive cluster. Producers and consumers need to be aware of the existence of the active and passive clusters, and must be ready to fail over to the passive cluster in case the active cluster fails. Below are some advantages and disadvantages of the Active-Passive setup.


Advantages:

  • Network latency between clusters doesn't affect the active cluster's performance.
  • Simplicity of unidirectional replication.


Disadvantages:

  • The passive cluster may remain underutilized.
  • Design complexity in incorporating failover awareness in application producers and consumers.
  • Possible data loss during failure of the active cluster.
  • Eventual consistency between topics in the active and passive clusters.
  • Failback to the primary may lead to message inconsistency in topics.

Apache Kafka active-passive model

Kafka Replication: Active-Active

An Active-Active setup involves two regionally separated, VNet-peered HDInsight Kafka clusters with bidirectional asynchronous replication through MirrorMaker. In this design, messages consumed by the consumers in the primary region are also made available to consumers in the secondary region, and vice versa. Below are some advantages and disadvantages of the Active-Active setup.


Advantages:

  • Because of their duplicated state, failovers and failbacks are easier to execute.


Disadvantages:

  • Setup, management, and monitoring are more complex than in an Active-Passive setup.
  • The problem of circular replication needs to be addressed.
  • Bidirectional replication leads to higher regional data egress costs.
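The circular-replication problem above can be broken by tagging each record with its origin cluster so a mirror never echoes a record back to where it was produced. The sketch below uses a made-up `origin` header purely to illustrate the idea; production tooling has its own mechanisms for this.

```python
# Sketch of loop prevention in bidirectional mirroring: stamp each record
# with its origin cluster, and skip records that originated at the
# destination. The header name "origin" is hypothetical.

def mirror(records, source, destination):
    """Mirror records from source to destination, skipping round-trips."""
    forwarded = []
    for rec in records:
        rec.setdefault("headers", {}).setdefault("origin", source)
        if rec["headers"]["origin"] == destination:
            continue  # the record was born on the destination; don't echo it
        forwarded.append(rec)
    return forwarded

a_records = [{"value": "from-A"}]
to_b = mirror(a_records, source="A", destination="B")      # A -> B: forwarded
back_to_a = mirror(to_b, source="B", destination="A")      # B -> A: skipped
```

Without the origin check, the same record would bounce between the two clusters indefinitely, inflating both topics and egress costs.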

Apache Kafka active-active model

HDInsight Enterprise Security Package

This setup is used to enable multi-user functionality in both the primary and the secondary, as well as Azure AD DS replica sets to ensure that users can authenticate to both clusters. During normal operations, Ranger policies need to be set up in the secondary to ensure that users are restricted to read operations. The architecture below shows how an ESP-enabled Hive active primary with standby secondary setup might look.

Ranger metastore replication:

The Ranger metastore is used to persistently store and serve Ranger policies for controlling data authorization. We recommend that you maintain independent Ranger policies in the primary and secondary, and that you maintain the secondary as a read replica.

If the requirement is to keep Ranger policies in sync between the primary and secondary, use Ranger import/export to periodically back up and import Ranger policies from the primary to the secondary.
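When syncing policies this way, one approach is to transform the exported policies before importing them, stripping write-type accesses so the secondary stays read-only. The policy dictionary below is a simplified stand-in for Ranger's exported JSON, not the exact schema.

```python
# Hedged sketch: rewrite exported Ranger-style policies so that only
# read-type accesses survive on the secondary. The policy shape here is
# a simplified illustration of Ranger's policy JSON.

READ_ONLY_ACCESS_TYPES = {"read", "select"}

def to_read_only(policy):
    """Drop every access type that could allow writes on the secondary."""
    for item in policy.get("policyItems", []):
        item["accesses"] = [a for a in item["accesses"]
                            if a["type"] in READ_ONLY_ACCESS_TYPES]
    return policy

exported = {"name": "sales_db", "policyItems": [
    {"users": ["analyst"], "accesses": [{"type": "select"}, {"type": "update"}]},
]}
secondary_policy = to_read_only(exported)
```

Running such a transform as part of the periodic backup/import job keeps the secondary's policies in sync with the primary without granting write paths there.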

Replicating Ranger policies between the primary and secondary without such restrictions can cause the secondary to become write-enabled, which can lead to inadvertent writes on the secondary and, in turn, to data inconsistencies.

HDInsight Enterprise Security Package architecture

Next steps

To learn more about the items discussed in this article, see: