Disaster recovery strategies for applications using Azure SQL Database elastic pools

Applies to: Azure SQL Database

Azure SQL Database offers several capabilities that support the business continuity of your application when catastrophic incidents occur. Elastic pools and single databases support the same disaster recovery (DR) capabilities. This article describes several DR strategies for elastic pools that use these Azure SQL Database business continuity features.

This article uses the following canonical SaaS ISV application pattern:

A modern cloud-based web application provisions one database for each end user. The ISV has many customers and therefore uses many databases, known as tenant databases. Because the tenant databases typically have unpredictable activity patterns, the ISV uses an elastic pool to make the database cost predictable over extended periods of time. The elastic pool also simplifies performance management when user activity spikes. In addition to the tenant databases, the application uses several databases to manage user profiles and security, collect usage patterns, and so on. The availability of an individual tenant does not affect the availability of the application as a whole. However, the availability and performance of the management databases are critical to the application's function: if the management databases are offline, the entire application is offline.

This article discusses DR strategies covering a range of scenarios, from cost-sensitive startup applications to applications with stringent availability requirements.

Scenario 1. Cost-sensitive startup

I am a startup business and am extremely cost sensitive. I want to simplify deployment and management of the application, and I can accept a limited SLA for individual customers. But I want to ensure that the application as a whole is never offline.

To satisfy the simplicity requirement, deploy all tenant databases into one elastic pool in the Azure region of your choice and deploy the management databases as geo-replicated single databases. For the disaster recovery of tenants, use geo-restore, which comes at no additional cost. To ensure the availability of the management databases, geo-replicate them to another region using an auto-failover group (step 1). The ongoing cost of the disaster recovery configuration in this scenario is equal to the total cost of the secondary databases. This configuration is illustrated in the next diagram.

Figure 1

If an outage occurs in the primary region, the recovery steps to bring your application online are illustrated in the next diagram.

  • The failover group initiates automatic failover of the management databases to the DR region. The application is automatically reconnected to the new primary, and all new accounts and tenant databases are created in the DR region. Existing customers see their data temporarily unavailable.
  • Create an elastic pool with the same configuration as the original pool (2).
  • Use geo-restore to create copies of the tenant databases (3). Consider triggering the individual restores when end users connect, or use some other application-specific priority scheme.

At this point your application is back online in the DR region, but some customers experience a delay when accessing their data.
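The priority scheme mentioned in the restore step could be sketched as follows. This is a minimal illustration, not part of any Azure API: the `RestoreQueue` class, its method names, and the numeric priorities are all hypothetical. It only decides the order in which the application would request geo-restores, and lets a tenant whose user has just connected jump ahead of the rest.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class RestoreRequest:
    priority: int                       # lower value = restored sooner
    tenant: str = field(compare=False)  # not used for ordering


class RestoreQueue:
    """Orders pending geo-restores; a tenant that connects jumps the queue."""

    ON_DEMAND = 0  # hypothetical priority for tenants with a waiting end user

    def __init__(self, tenants_with_priority):
        self._best = dict(tenants_with_priority)  # tenant -> best priority so far
        self._heap = [RestoreRequest(p, t) for t, p in self._best.items()]
        heapq.heapify(self._heap)
        self._done = set()

    def on_user_connect(self, tenant):
        """Bump a known, not yet restored tenant to the front of the queue."""
        if tenant not in self._best or tenant in self._done:
            return
        if self._best[tenant] == self.ON_DEMAND:
            return
        self._best[tenant] = self.ON_DEMAND
        heapq.heappush(self._heap, RestoreRequest(self.ON_DEMAND, tenant))

    def next_tenant(self):
        """Return the next tenant to restore, or None when nothing is pending."""
        while self._heap:
            request = heapq.heappop(self._heap)
            if request.tenant in self._done:
                continue  # stale entry left behind by an earlier bump
            self._done.add(request.tenant)
            return request.tenant
        return None
```

A worker would call `next_tenant()` in a loop and issue the actual geo-restore for each returned tenant, while the connection handler calls `on_user_connect()` whenever a user of a not yet restored tenant signs in.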

Figure 2

If the outage was temporary, Azure might recover the primary region before all the database restores are complete in the DR region. In this case, orchestrate moving the application back to the primary region. The process takes the steps illustrated in the next diagram.

  • Cancel all outstanding geo-restore requests.
  • Fail over the management databases to the primary region (5). After the region's recovery, the old primaries automatically became secondaries. Now they switch roles again.
  • Change the application's connection string to point back to the primary region. Now all new accounts and tenant databases are created in the primary region. Some existing customers see their data temporarily unavailable.
  • Set all databases in the DR pool to read-only to ensure they cannot be modified in the DR region (6).
  • For each database in the DR pool that has changed since the recovery, rename or delete the corresponding database in the primary pool (7).
  • Copy the updated databases from the DR pool to the primary pool (8).
  • Delete the DR pool (9).

At this point your application is online in the primary region with all tenant databases available in the primary pool.
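Steps 7 and 8 depend on knowing which DR databases changed after the recovery. The sketch below shows one way to build that list, assuming the application records a last-modified timestamp per tenant database. The `DrDatabase` type, the `last_modified` field, and `plan_failback` are illustrative names, not Azure features.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DrDatabase:
    name: str
    last_modified: datetime  # assumed to be tracked by the application


def plan_failback(dr_databases, recovery_time):
    """Return (copy_back, discard): names to copy to the primary pool vs. drop."""
    copy_back, discard = [], []
    for db in dr_databases:
        if db.last_modified > recovery_time:
            copy_back.append(db.name)  # steps 7-8: replace the primary copy
        else:
            discard.append(db.name)    # unchanged: the primary copy is still current
    return copy_back, discard
```

Databases in `discard` need no copy because the original in the primary pool is at least as current as the restored one. They disappear when the DR pool is deleted in step 9.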

Figure 3

Benefit

The key benefit of this strategy is the low ongoing cost of data tier redundancy. Azure SQL Database automatically backs up databases at no additional cost and with no application rewrite. Cost is incurred only when the elastic databases are restored.

Trade-off

The trade-off is that the complete recovery of all tenant databases takes significant time. The length of time depends on the total number of restores you initiate in the DR region and the overall size of the tenant databases. Even if you prioritize some tenants' restores over others, you are competing with all the other restores initiated in the same region, because the service arbitrates and throttles restores to minimize the overall impact on existing customers' databases. In addition, the recovery of the tenant databases cannot start until the new elastic pool in the DR region is created.

Scenario 2. Mature application with tiered service

I am a mature SaaS application with tiered service offers and different SLAs for trial customers and for paying customers. For the trial customers, I have to reduce the cost as much as possible. Trial customers can tolerate downtime, but I want to reduce its likelihood. For the paying customers, any downtime is a flight risk, so I want to make sure that paying customers are always able to access their data.

To support this scenario, separate the trial tenants from the paid tenants by putting them into separate elastic pools. The trial customers have lower eDTUs or vCores per tenant and a lower SLA with a longer recovery time. The paying customers are in a pool with higher eDTUs or vCores per tenant and a higher SLA. To guarantee the lowest recovery time, the paying customers' tenant databases are geo-replicated. This configuration is illustrated in the next diagram.

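Diagram showing a primary region and a DR region that use geo-replication between the management databases and between the paid customers' primary and secondary pools, with no replication for the trial customers pool.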

As in the first scenario, the management databases are quite active, so you use geo-replicated single databases for them (1). This ensures predictable performance for new customer subscriptions, profile updates, and other management operations. The region in which the primaries of the management databases reside is the primary region, and the region in which the secondaries of the management databases reside is the DR region.

The paying customers' tenant databases have active databases in the "paid" pool provisioned in the primary region. Provision a secondary pool with the same name in the DR region. Each tenant is geo-replicated to the secondary pool (2). This enables quick recovery of all tenant databases using failover.

If an outage occurs in the primary region, the recovery steps to bring your application online are illustrated in the next diagram:

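Diagram showing an outage in the primary region, with failover of the management databases and the paid customers' pool to the DR region, and the create and restore operations for the trial customers.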

  • Immediately fail over the management databases to the DR region (3).
  • Change the application's connection string to point to the DR region. Now all new accounts and tenant databases are created in the DR region. Existing trial customers see their data temporarily unavailable.
  • Fail over the paid tenants' databases to the pool in the DR region to immediately restore their availability (4). Because the failover is a quick metadata-level change, consider an optimization where the individual failovers are triggered on demand by end-user connections.
  • If the eDTU size or vCore value of your secondary pool was lower than that of the primary pool because the secondary databases only required the capacity to process the change logs while they were secondaries, increase the pool capacity immediately to accommodate the full workload of all tenants (5).
  • Create a new elastic pool with the same name and the same configuration in the DR region for the trial customers' databases (6).
  • Once the trial customers' pool is created, use geo-restore to restore the individual trial tenant databases into the new pool (7). Consider triggering the individual restores when end users connect, or use some other application-specific priority scheme.

At this point your application is back online in the DR region. All paying customers have access to their data, while the trial customers experience a delay when accessing their data.

When Azure recovers the primary region after you have restored the application in the DR region, you can continue running the application in the DR region or you can fail back to the primary region. If the primary region is recovered before the failover process is completed, consider failing back right away. The failback takes the steps illustrated in the next diagram:

Diagram showing the failback steps to take after the primary region is restored.

  • Cancel all outstanding geo-restore requests.
  • Fail over the management databases (8). After the region's recovery, the old primaries automatically became secondaries. Now they become the primaries again.
  • Fail over the paid tenant databases (9). Similarly, after the region's recovery, the old primaries automatically became secondaries. Now they become the primaries again.
  • Set the restored trial databases that have changed in the DR region to read-only (10).
  • For each database in the trial customers' DR pool that has changed since the recovery, rename or delete the corresponding database in the trial customers' primary pool (11).
  • Copy the updated databases from the DR pool to the primary pool (12).
  • Delete the DR pool (13).

Note

The failover operation is asynchronous. To minimize the recovery time, it is important that you execute the tenant databases' failover command in batches of at least 20 databases.
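One way to read the batching advice is sketched below. The `start_failover` argument is a placeholder for whatever call your tooling uses to initiate a failover for one database, and it is an assumption of this example rather than a real API. The batch size of 20 comes from the note above, and a short trailing batch is folded into the previous one so that no batch falls below the minimum when enough databases exist.

```python
from concurrent.futures import ThreadPoolExecutor


def make_batches(databases, min_size=20):
    """Split databases into batches of min_size; fold a short tail into the last batch."""
    batches = [databases[i:i + min_size] for i in range(0, len(databases), min_size)]
    if len(batches) > 1 and len(batches[-1]) < min_size:
        tail = batches.pop()
        batches[-1] = batches[-1] + tail
    return batches


def fail_over_in_batches(databases, start_failover, min_size=20):
    """Call start_failover for every database, issuing each batch concurrently."""
    results = {}
    for batch in make_batches(databases, min_size):
        # One worker per database so the whole batch is submitted at once.
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            for name, outcome in zip(batch, pool.map(start_failover, batch)):
                results[name] = outcome
    return results
```

Because the failover operation itself is asynchronous, `start_failover` would typically return as soon as the request is accepted, so each batch completes quickly and the next one can be submitted.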

Benefit

The key benefit of this strategy is that it provides the highest SLA for the paying customers. It also guarantees that new trials are unblocked as soon as the trial DR pool is created.

Trade-off

The trade-off is that this setup increases the total cost of the tenant databases by the cost of the secondary DR pool for paid customers. In addition, if the secondary pool has a different size, the paying customers experience lower performance after failover until the pool upgrade in the DR region is completed.

Scenario 3. Geographically distributed application with tiered service

I have a mature SaaS application with tiered service offers. I want to offer a very aggressive SLA to my paid customers and minimize the risk of impact when outages occur, because even a brief interruption can cause customer dissatisfaction. It is critical that the paying customers can always access their data. The trials are free, and an SLA is not offered during the trial period.

To support this scenario, use three separate elastic pools. Provision two equal-size pools with high eDTUs or vCores per database in two different regions to contain the paid customers' tenant databases. The third pool, which contains the trial tenants, can have lower eDTUs or vCores per database and be provisioned in one of the two regions.

To guarantee the lowest recovery time during outages, the paying customers' tenant databases are geo-replicated with 50% of the primary databases in each of the two regions. Similarly, each region has 50% of the secondary databases. This way, if a region is offline, only 50% of the paid customers' databases are impacted and have to fail over. The other databases remain intact. This configuration is illustrated in the following diagram:

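Diagram showing a primary region named region A and a secondary region named region B that use geo-replication between the management databases and between the paid customers' primary and secondary pools, with no replication for the trial customers pool.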

As in the previous scenarios, the management databases are quite active, so configure them as geo-replicated single databases (1). This ensures predictable performance for new customer subscriptions, profile updates, and other management operations. Region A is the primary region for the management databases, and region B is used for recovery of the management databases.

The paying customers' tenant databases are also geo-replicated, but with primaries and secondaries split between region A and region B (2). This way, the tenant primary databases impacted by the outage can fail over to the other region and become available. The other half of the tenant databases are not impacted at all.
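The 50/50 split can be illustrated with a small placement sketch. It alternates primaries between the two regions and places each secondary in the opposite region. The function names and the dictionary layout are hypothetical, and a real deployment might split tenants by other criteria, such as customer geography.

```python
def assign_primary_regions(tenants, regions=("A", "B")):
    """Alternate primaries between two regions; the secondary goes to the other one."""
    first, second = regions
    placement = {}
    for index, tenant in enumerate(sorted(tenants)):
        primary = first if index % 2 == 0 else second
        secondary = second if primary == first else first
        placement[tenant] = {"primary": primary, "secondary": secondary}
    return placement


def tenants_to_fail_over(placement, failed_region):
    """Tenants whose primary is in the failed region and therefore must fail over."""
    return sorted(t for t, p in placement.items() if p["primary"] == failed_region)
```

With four tenants, an outage in region A affects only the two tenants whose primaries live there. The other two keep running on their primaries in region B and lose only their secondaries.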

The next diagram illustrates the recovery steps to take if an outage occurs in region A.

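Diagram showing an outage in the primary region, with failover of the management databases and the paid customers' pool to region B, and the create and restore operations for the trial customers in region B.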

  • Immediately fail over the management databases to region B (3).
  • Change the application's connection string to point to the management databases in region B. Modify the management databases to make sure that new accounts and tenant databases are created in region B and that the existing tenant databases are found there as well. Existing trial customers see their data temporarily unavailable.
  • Fail over the paid tenants' databases to pool 2 in region B to immediately restore their availability (4). Because the failover is a quick metadata-level change, consider an optimization where the individual failovers are triggered on demand by end-user connections.
  • Because pool 2 now contains only primary databases, the total workload in the pool increases, so immediately increase its eDTU size or number of vCores (5).
  • Create a new elastic pool with the same name and the same configuration in region B for the trial customers' databases (6).
  • Once the pool is created, use geo-restore to restore the individual trial tenant databases into the pool (7). Consider triggering the individual restores when end users connect, or use some other application-specific priority scheme.

Note

The failover operation is asynchronous. To minimize the recovery time, it is important that you execute the tenant databases' failover command in batches of at least 20 databases.

At this point your application is back online in region B. All paying customers have access to their data, while the trial customers experience a delay when accessing their data.

When region A is recovered, you need to decide whether to keep using region B for trial customers or to fail back to the trial customers pool in region A. One criterion could be the percentage of trial tenant databases modified since the recovery. Regardless of that decision, you need to rebalance the paid tenants between the two pools. The next diagram illustrates the process when the trial tenant databases fail back to region A.

Diagram showing the failback steps to take after region A is restored.

  • Cancel all outstanding geo-restore requests to the trial DR pool.
  • Fail over the management databases (8). After the region's recovery, the old primaries automatically became secondaries. Now they become the primaries again.
  • Select which paid tenant databases fail back to pool 1 and initiate failover to their secondaries (9). After the region's recovery, all databases in pool 1 automatically became secondaries. Now 50% of them become primaries again.
  • Reduce the size of pool 2 to the original eDTU size or number of vCores (10).
  • Set all restored trial databases in region B to read-only (11).
  • For each database in the trial DR pool that has changed since the recovery, rename or delete the corresponding database in the trial primary pool (12).
  • Copy the updated databases from the DR pool to the primary pool (13).
  • Delete the DR pool (14).
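The decision that precedes these steps, whether to fail the trial tenants back at all, can use the percentage of trial databases modified since the recovery. The sketch below computes that percentage. The 25% threshold and the direction of the rule (fail back when few databases changed, so that little data has to be copied) are assumptions made for illustration and are not guidance from the service.

```python
def percent_modified(trial_databases, modified_since_recovery):
    """Percentage of trial tenant databases changed in region B since the recovery."""
    if not trial_databases:
        return 0.0
    changed = sum(1 for name in trial_databases if name in modified_since_recovery)
    return 100.0 * changed / len(trial_databases)


def should_fail_back_trials(trial_databases, modified_since_recovery, threshold=25.0):
    """Assumed rule: fail back when the modified share is at or below the threshold."""
    return percent_modified(trial_databases, modified_since_recovery) <= threshold
```

For example, with four trial databases of which one changed, the modified share is 25% and this rule would choose failback. With three changed, it would keep the trial tenants in region B.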

Benefit

The key benefits of this strategy are:

  • It supports the most aggressive SLA for the paying customers because it ensures that an outage cannot impact more than 50% of the tenant databases.
  • It guarantees that new trials are unblocked as soon as the trial DR pool is created during the recovery.
  • It allows more efficient use of the pool capacity, because the secondary databases, which make up 50% of pool 1 and of pool 2, are guaranteed to be less active than the primary databases.

Trade-offs

The main trade-offs are:

  • CRUD operations against the management databases have lower latency for end users connected to region A than for end users connected to region B, because the operations are executed against the primary of the management databases.
  • It requires a more complex design of the management databases. For example, each tenant record has a location tag that needs to be changed during failover and failback.
  • The paying customers may experience lower performance than usual until the pool upgrade in region B is completed.
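The location tag mentioned in the second trade-off could look like the following sketch. The `TenantRecord` layout is hypothetical. In particular, the `home_region` field is an addition made here so that failback knows where each tenant belongs, and the source does not prescribe it.

```python
from dataclasses import dataclass


@dataclass
class TenantRecord:
    tenant: str
    home_region: str  # assumed field: region that hosts the primary in normal operation
    location: str     # location tag: region that currently hosts the primary


def fail_over_region(records, failed_region, surviving_region):
    """Point every tenant located in the failed region at the surviving region."""
    moved = []
    for record in records:
        if record.location == failed_region:
            record.location = surviving_region
            moved.append(record.tenant)
    return moved


def fail_back(records):
    """Return every displaced tenant to its home region and report which tags changed."""
    moved = []
    for record in records:
        if record.location != record.home_region:
            record.location = record.home_region
            moved.append(record.tenant)
    return moved
```

The application would look up `location` before opening a tenant connection, so updating the tag is what redirects users after each failover or failback.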

Summary

This article focuses on the disaster recovery strategies for the database tier used by a SaaS ISV multi-tenant application. The strategy you choose depends on the needs of the application, such as the business model, the SLA you want to offer to your customers, and budget constraints. Each strategy described here outlines its benefits and trade-offs so that you can make an informed decision. Your specific application likely includes other Azure components as well, so review their business continuity guidance and orchestrate the recovery of the database tier with them. To learn more about managing recovery of database applications in Azure, refer to Designing cloud solutions for disaster recovery.

Next steps