Integrate ExpressRoute with disaster recovery for Azure VMs

This article describes how to integrate Azure ExpressRoute with Azure Site Recovery when you set up disaster recovery for Azure VMs to a secondary Azure region.

Site Recovery enables disaster recovery of Azure VMs by replicating Azure VM data to Azure.

  • If Azure VMs use Azure managed disks, VM data is replicated to a replica managed disk in the secondary region.
  • If Azure VMs don't use managed disks, VM data is replicated to an Azure storage account.
  • Replication endpoints are public, but replication traffic for Azure VMs doesn't cross the internet.

ExpressRoute enables you to extend on-premises networks into the Azure cloud over a private connection, facilitated by a connectivity provider. If you have ExpressRoute configured, it integrates with Site Recovery as follows:

  • During replication between Azure regions: Replication traffic for Azure VM disaster recovery stays within Azure, and ExpressRoute isn't needed or used for replication. However, if you're connecting from an on-premises site to the Azure VMs in the primary Azure site, there are a number of issues to be aware of when you set up disaster recovery for those Azure VMs.
  • Failover between Azure regions: When outages occur, you fail over Azure VMs from the primary to the secondary Azure region. After failing over to a secondary region, there are a number of steps to take in order to access the Azure VMs in the secondary region using ExpressRoute.

Before you begin

Before you begin, make sure you understand the following concepts:

General recommendations

As a best practice, and to ensure efficient Recovery Time Objectives (RTOs) for disaster recovery, we recommend you do the following when you set up Site Recovery to integrate with ExpressRoute:

  • Provision networking components before failover to a secondary region:
    • When you enable replication for Azure VMs, Site Recovery can automatically deploy networking resources such as networks, subnets, and gateway subnets in the target Azure region, based on source network settings.
    • Site Recovery can't automatically set up networking resources such as VPN gateways.
    • We recommend you provision these additional networking resources before failover. A small downtime is associated with this deployment, and it can impact the overall recovery time if you didn't account for it during deployment planning.
  • Run regular disaster recovery drills:
    • A drill validates your replication strategy without data loss or downtime, and doesn't affect your production environment. It helps avoid last-minute configuration issues that can adversely impact RTO.
    • When you run a test failover for the drill, we recommend that you use a separate Azure VM network, instead of the default network that's set up when you enable replication.
  • Use different IP address spaces if you have a single ExpressRoute circuit.
    • We recommend that you use a different IP address space for the target virtual network. This avoids issues when establishing connections during regional outages.
    • If you can't use a separate address space, be sure to run the disaster recovery drill test failover on a separate test network with different IP addresses. You can't connect two VNets with overlapping IP address space to the same ExpressRoute circuit.
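
The overlap rule in the last recommendation can be checked before any connection is attempted. The following is a minimal sketch using Python's standard `ipaddress` module. The VNet names and the `10.3.0.0/24` target range are illustrative assumptions, not values from this article.

```python
import ipaddress
from itertools import combinations


def find_overlaps(vnets):
    """Return pairs of VNet names whose address spaces overlap.

    `vnets` maps a VNet name to its address space in CIDR notation.
    """
    parsed = {name: ipaddress.ip_network(cidr) for name, cidr in vnets.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(parsed), 2)
        if parsed[a].overlaps(parsed[b])
    ]


# Hypothetical plan: the target VNet reuses the source range.
same_space = {"source-vnet": "10.1.0.0/24", "target-vnet": "10.1.0.0/24"}
# Hypothetical plan: the target VNet uses a different range.
different_space = {"source-vnet": "10.1.0.0/24", "target-vnet": "10.3.0.0/24"}

print(find_overlaps(same_space))       # one overlapping pair
print(find_overlaps(different_space))  # no overlapping pairs
```

Any pair reported by `find_overlaps` is a pair of VNets that, per the rule above, can't be attached to the same ExpressRoute circuit at the same time.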

Replicate Azure VMs when using ExpressRoute

If you want to set up replication for Azure VMs in a primary site, and you're connecting to these VMs from your on-premises site over ExpressRoute, here's what you need to do:

  1. Enable replication for each Azure VM.
  2. Optionally let Site Recovery set up networking:
    • When you configure and enable replication, Site Recovery sets up networks, subnets, and gateway subnets in the target Azure region, to match those in the source region. Site Recovery also maps between the source and target virtual networks.
    • If you don't want Site Recovery to do this automatically, create the target-side network resources before you enable replication.
  3. Create other networking elements:
    • Site Recovery doesn't create route tables, VPN gateways, VPN gateway connections, VNet peering, or other networking resources and connections in the secondary region.
    • You need to create these additional networking elements in the secondary region, any time before running a failover from the primary region.
    • You can use recovery plans and automation scripts to set up and connect these networking resources.
  4. If you have a network virtual appliance (NVA) deployed to control the flow of network traffic, note that:
    • Azure's default system route for Azure VM replication is 0.0.0.0/0.
    • Typically, NVA deployments also define a default route (0.0.0.0/0) that forces outbound internet traffic to flow through the NVA. The default route is used when no other specific route configuration can be found.
    • If this is the case, the NVA might be overloaded if all replication traffic passes through the NVA.
    • The same limitation also applies when using default routes for routing all Azure VM traffic to on-premises deployments.
    • In this scenario, we recommend that you create a network service endpoint in your virtual network for the Microsoft.Storage service, so that the replication traffic doesn't leave the Azure boundary.
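
The reason a service endpoint keeps replication traffic off the NVA is that a more specific route wins over the 0.0.0.0/0 default. The sketch below illustrates longest-prefix route selection in plain Python. It is a simplified model, not Azure's routing implementation: `203.0.113.0/24` is a documentation-only placeholder standing in for a storage address range, and the next-hop labels are just strings chosen for illustration.

```python
import ipaddress


def select_route(routes, destination):
    """Return the next hop of the route with the longest matching prefix."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in routes
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Placeholder range standing in for storage endpoints (not a real Azure range).
STORAGE_PREFIX = "203.0.113.0/24"

# Only the forced default route through the NVA.
routes_without_endpoint = [("0.0.0.0/0", "NVA")]
# Default route plus a more specific route added for the service endpoint.
routes_with_endpoint = [
    ("0.0.0.0/0", "NVA"),
    (STORAGE_PREFIX, "VirtualNetworkServiceEndpoint"),
]

print(select_route(routes_without_endpoint, "203.0.113.25"))
print(select_route(routes_with_endpoint, "203.0.113.25"))
print(select_route(routes_with_endpoint, "198.51.100.7"))
```

In this model, traffic to the placeholder storage range goes through the NVA only when no more specific route exists. All other outbound traffic still follows the default route.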

Replication example

Typically, enterprise deployments have workloads split across multiple Azure VNets, with a central connectivity hub for external connectivity to the internet and to on-premises sites. A hub and spoke topology is typically used together with ExpressRoute.

Figure: On-premises-to-Azure connectivity with ExpressRoute before failover

  • Region. Apps are deployed in the Azure China East region.
  • Spoke vNets. Apps are deployed in two spoke vNets:
    • Source vNet1: 10.1.0.0/24.
    • Source vNet2: 10.2.0.0/24.
    • Each spoke virtual network is connected to Hub vNet.
  • Hub vNet. There's one hub vNet, Source Hub vNet: 10.10.10.0/24.
    • This hub vNet acts as the gatekeeper.
    • All communications across subnets go through this hub.
  • Hub vNet subnets. The hub vNet has two subnets:
    • NVA subnet: 10.10.10.0/25. This subnet contains an NVA (10.10.10.10).
    • Gateway subnet: 10.10.10.128/25. This subnet contains an ExpressRoute gateway connected to an ExpressRoute connection that routes to the on-premises site via a private peering routing domain.
  • The on-premises datacenter has an ExpressRoute circuit connection through a partner edge in Guangzhou.
  • All routing is controlled through Azure route tables (UDR).
  • All outbound traffic between vNets, or to the on-premises datacenter, is routed through the NVA.
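
As a quick sanity check on the address plan above, the following sketch verifies with Python's standard `ipaddress` module that the two hub subnets fit inside the hub vNet without overlapping, that the NVA address falls in the NVA subnet, and that the spoke ranges don't collide with the hub. The variable names are illustrative. The addresses are the ones listed in this example.

```python
import ipaddress

hub = ipaddress.ip_network("10.10.10.0/24")
nva_subnet = ipaddress.ip_network("10.10.10.0/25")
gateway_subnet = ipaddress.ip_network("10.10.10.128/25")
nva_ip = ipaddress.ip_address("10.10.10.10")
spokes = [ipaddress.ip_network("10.1.0.0/24"), ipaddress.ip_network("10.2.0.0/24")]

checks = {
    # Both subnets must sit inside the hub address space.
    "subnets_fit_in_hub": nva_subnet.subnet_of(hub) and gateway_subnet.subnet_of(hub),
    # The two subnets must not share any addresses.
    "subnets_disjoint": not nva_subnet.overlaps(gateway_subnet),
    # The NVA's address must belong to the NVA subnet.
    "nva_in_nva_subnet": nva_ip in nva_subnet,
    # Spoke ranges must not collide with the hub or with each other.
    "spokes_clear_of_hub": all(not spoke.overlaps(hub) for spoke in spokes),
    "spokes_disjoint": not spokes[0].overlaps(spokes[1]),
}

for name, passed in checks.items():
    print(f"{name}: {passed}")
```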

Hub and spoke peering settings

Spoke to hub

Direction | Setting | State
Spoke to hub | Allow virtual network access | Enabled
Spoke to hub | Allow forwarded traffic | Enabled
Spoke to hub | Allow gateway transit | Disabled
Spoke to hub | Use remote gateways | Enabled

Figure: Spoke-to-hub peering configuration

Hub to spoke

Direction | Setting | State
Hub to spoke | Allow virtual network access | Enabled
Hub to spoke | Allow forwarded traffic | Enabled
Hub to spoke | Allow gateway transit | Enabled
Hub to spoke | Use remote gateways | Disabled

Figure: Hub-to-spoke peering configuration
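
The two directions of a peering have to agree with each other: a spoke can only use remote gateways if the hub allows gateway transit. The sketch below models the settings from the two tables above and checks that pairing. The `Peering` class and its field names are assumptions made for illustration and aren't an Azure SDK type. The second and third checks encode expectations specific to this example topology, where the gateway and NVA live in the hub.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peering:
    allow_virtual_network_access: bool
    allow_forwarded_traffic: bool
    allow_gateway_transit: bool
    use_remote_gateways: bool


# Values taken from the spoke-to-hub and hub-to-spoke tables.
SPOKE_TO_HUB = Peering(True, True, False, True)
HUB_TO_SPOKE = Peering(True, True, True, False)


def peering_problems(spoke_to_hub, hub_to_spoke):
    """List inconsistencies between the two directions of a hub-spoke peering."""
    problems = []
    if spoke_to_hub.use_remote_gateways and not hub_to_spoke.allow_gateway_transit:
        problems.append("spoke uses remote gateways, but hub doesn't allow gateway transit")
    # Assumption for this topology: the gateway lives in the hub.
    if hub_to_spoke.use_remote_gateways:
        problems.append("hub should use its own gateway, not a remote one")
    # Assumption for this topology: the NVA in the hub relays traffic.
    if not (spoke_to_hub.allow_forwarded_traffic and hub_to_spoke.allow_forwarded_traffic):
        problems.append("forwarded traffic must be allowed in both directions")
    return problems


print(peering_problems(SPOKE_TO_HUB, HUB_TO_SPOKE))
```

Running the same check against the peerings created in the target region is one way to confirm that they mirror the source-side settings.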

Example steps

In our example, the following happens when you enable replication for Azure VMs in the source network:

  1. You enable replication for a VM.
  2. Site Recovery creates replica vNets, subnets, and gateway subnets in the target region.
  3. Site Recovery creates mappings between the source networks and the replica target networks it creates.
  4. You manually create virtual network gateways, virtual network gateway connections, virtual network peering, or any other networking resources or connections.

Fail over Azure VMs when using ExpressRoute

After you fail Azure VMs over to the target Azure region using Site Recovery, you can access them using ExpressRoute private peering.

  • You need to connect ExpressRoute to the target vNet with a new connection. The existing ExpressRoute connection isn't automatically transferred.
  • The way in which you set up your ExpressRoute connection to the target vNet depends on your ExpressRoute topology.

Access with two circuits

Two circuits with two peering locations

This configuration helps protect ExpressRoute circuits against regional disaster. If your primary peering location goes down, connections can continue from the other location.

  • The circuit connected to the production environment is usually the primary. The secondary circuit typically has lower bandwidth, which can be increased if a disaster occurs.
  • After failover, you can establish connections from the secondary ExpressRoute circuit to the target vNet. Alternatively, you can have connections set up and ready in case of disaster, to reduce overall recovery time.
  • With simultaneous connections to both primary and target vNets, make sure that your on-premises routing only uses the secondary circuit and connection after failover.
  • The source and target vNets can receive new IP addresses, or keep the same ones, after failover. In both cases, the secondary connections can be established prior to failover.

Two circuits with single peering location

This configuration helps protect against failure of the primary ExpressRoute circuit, but not against failure of the single ExpressRoute peering location, which impacts both circuits.

  • You can have simultaneous connections from the on-premises datacenter to the source vNet with the primary circuit, and to the target vNet with the secondary circuit.
  • With simultaneous connections to primary and target, make sure that on-premises routing only uses the secondary circuit and connection after failover.
  • You can't connect both circuits to the same vNet when circuits are created at the same peering location.

Access with a single circuit

In this configuration there's only one ExpressRoute circuit. Although the circuit has a redundant connection in case one goes down, a single circuit doesn't provide resilience if your peering region goes down. Note that:

  • You can replicate Azure VMs to any Azure region in the same geographic location. If the target Azure region isn't in the same location as the source, you need to enable ExpressRoute Premium if you're using a single ExpressRoute circuit. Learn about ExpressRoute locations and ExpressRoute pricing.
  • You can't connect source and target vNets simultaneously to the circuit if the same IP address space is used on the target region. In this scenario:
    • Disconnect the source side connection, and then establish the target side connection. This connection change can be scripted as part of a Site Recovery recovery plan. Note that:
      • In a regional failure, if the primary region is inaccessible, the disconnect operation could fail. This could impact connection creation to the target region.
      • If you created the connection in the target region, and the primary region recovers later, you might experience packet drops if two simultaneous connections attempt to connect to the same address space.
      • To prevent this, terminate the primary connection immediately.
      • After VM failback to the primary region, the primary connection can again be established, after you disconnect the secondary connection.
  • If a different address space is used on the target vNet, you can simultaneously connect to the source and target vNets from the same ExpressRoute circuit.
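
The single-circuit rules above reduce to one decision: whether the source and target address spaces overlap. The sketch below returns the ordered connection changes for a failover under that rule. The function name, step strings, and the `10.3.0.0/24` range are illustrative assumptions. The sketch only plans the order and doesn't call any Azure API.

```python
import ipaddress


def single_circuit_plan(source_space, target_space):
    """Return the ordered connection changes for failover over one circuit."""
    source = ipaddress.ip_network(source_space)
    target = ipaddress.ip_network(target_space)
    if source.overlaps(target):
        # Same or overlapping space: the two vNets can't share the circuit,
        # so the source connection has to go before the target one is added.
        return ["disconnect source vNet", "connect target vNet"]
    # Distinct spaces: the target connection can coexist with the source one.
    return ["connect target vNet"]


print(single_circuit_plan("10.1.0.0/24", "10.1.0.0/24"))
print(single_circuit_plan("10.1.0.0/24", "10.3.0.0/24"))
```

When the address spaces differ, the single remaining step can be done ahead of time, which is why a distinct target address space shortens recovery.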

Failover example

In our example, we're using the following topology:

  • Two different ExpressRoute circuits in two different peering locations.

  • Retain private IP addresses for the Azure VMs after failover.

  • The target recovery region is Azure China North.

  • A secondary ExpressRoute circuit connection is established through a partner edge in Tianjin.

For a simple topology that uses a single ExpressRoute circuit, with the same IP addresses after failover, review this article.

Example steps

To automate recovery in this example, here's what you need to do:

  1. Follow the steps to set up replication.

  2. Fail over the Azure VMs, with these additional steps during or after the failover.

    a. Create the Azure ExpressRoute gateway in the target region hub VNet. This is needed to connect the target hub vNet to the ExpressRoute circuit.

    b. Create the connection from the target hub vNet to the target ExpressRoute circuit.

    c. Set up the VNet peerings between the target region's hub and spoke virtual networks. The peering properties on the target region will be the same as those on the source region.

    d. Set up the UDRs in the hub VNet and the two spoke VNets.

      • The properties of the target side UDRs are the same as those on the source side when using the same IP addresses.
      • With different target IP addresses, the UDRs should be modified accordingly.

The above steps can be scripted as part of a recovery plan. Depending on the application connectivity and recovery time requirements, the above steps can also be completed prior to starting the failover.
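
When the target region uses different IP addresses, the UDR changes in step d are mostly mechanical: each prefix and next-hop address moves to the same offset in the corresponding target range. The sketch below shows one way to compute that. The target ranges in `ADDRESS_MAP` and the sample routes are assumptions for illustration. Only the source ranges come from this example, and the sketch assumes each source range maps to a target range of the same size.

```python
import ipaddress

# Hypothetical source-to-target address space mapping (target ranges are assumptions).
ADDRESS_MAP = {
    "10.1.0.0/24": "10.3.0.0/24",
    "10.2.0.0/24": "10.4.0.0/24",
    "10.10.10.0/24": "10.20.10.0/24",
}


def remap(cidr, address_map=ADDRESS_MAP):
    """Move `cidr` to the same offset in its mapped target space, if it has one."""
    net = ipaddress.ip_network(cidr)
    for source, target in address_map.items():
        src = ipaddress.ip_network(source)
        dst = ipaddress.ip_network(target)
        if net.subnet_of(src):
            offset = int(net.network_address) - int(src.network_address)
            moved = ipaddress.ip_address(int(dst.network_address) + offset)
            return f"{moved}/{net.prefixlen}"
    return str(net)  # Not inside any mapped space, so leave it unchanged.


def remap_routes(routes, address_map=ADDRESS_MAP):
    """Rewrite (prefix, next_hop_ip) pairs for the target region."""
    return [
        (remap(prefix, address_map), remap(f"{hop}/32", address_map).split("/")[0])
        for prefix, hop in routes
    ]


# Hypothetical spoke UDR: default traffic and vNet2 traffic both go through the NVA.
source_udr = [("0.0.0.0/0", "10.10.10.10"), ("10.2.0.0/24", "10.10.10.10")]
print(remap_routes(source_udr))
```

With an empty mapping, which corresponds to keeping the same IP addresses, `remap_routes` returns the routes unchanged, matching the first bullet under step d.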

After recovery

After recovering the VMs and completing connectivity, the recovery environment is as follows.

Figure: On-premises-to-Azure connectivity with ExpressRoute after failover

Next steps

Learn more about using recovery plans to automate app failover.