Prepare for reprotection and failback of VMware VMs

After failover of on-premises VMware VMs or physical servers to Azure, you reprotect the Azure VMs created after failover, so that they replicate back to the on-premises site. With replication from Azure to on-premises in place, you can then fail back by running a failover from Azure to on-premises when you're ready.

Reprotection/failback components

You need a number of components and settings in place before you can reprotect and fail back from Azure.

Component: Details
On-premises configuration server: The on-premises configuration server must be running and connected to Azure.

The VM you're failing back to must exist in the configuration server database. If disaster affects the configuration server, restore it with the same IP address to ensure that failback works.

If IP addresses of replicated machines were retained on failover, site-to-site connectivity (or ExpressRoute connectivity) should be established between the Azure VMs and the failback NIC of the configuration server. For retained IP addresses, the configuration server needs two NICs - one for source machine connectivity, and one for Azure failback connectivity. This avoids overlap of subnet address ranges for the source and failed-over VMs.
Process server in Azure: You need a process server in Azure before you can fail back to your on-premises site.

The process server receives data from the protected Azure VM, and sends it to the on-premises site.

You need a low-latency network between the process server and the protected VM, so we recommend that you deploy the process server in Azure for higher replication performance.

For proof-of-concept, you can use the on-premises process server, and ExpressRoute with private peering.

The process server should be in the Azure network in which the failed-over VM is located. The process server must also be able to communicate with the on-premises configuration server and master target server.
Separate master target server: The master target server receives failback data. By default, a Windows master target server runs on the on-premises configuration server.

A master target server can have up to 60 disks attached to it. If the VMs being failed back have more than 60 disks in total, or if you're failing back large volumes of traffic, create a separate master target server for failback.

If machines are gathered into a replication group for multi-VM consistency, the VMs must all be Windows, or must all be Linux. Why? Because all VMs in a replication group must use the same master target server, and the master target server must run the same operating system (at the same or a higher version) as the replicated machines.

The master target server shouldn't have any snapshots on its disks, otherwise reprotection and failback won't work.

The master target can't have a Paravirtual SCSI controller. The controller can only be an LSI Logic controller. Without an LSI Logic controller, reprotection fails.
Failback replication policy: To replicate back to the on-premises site, you need a failback policy. This policy is automatically created when you create a replication policy to Azure.

The policy is automatically associated with the configuration server. It has an RPO threshold of 15 minutes, a recovery point retention of 24 hours, and an app-consistent snapshot frequency of 60 minutes. The policy can't be edited.
Site-to-site VPN/ExpressRoute private peering: Reprotection and failback need a site-to-site VPN connection or ExpressRoute private peering to replicate data.
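
The master target server rules in the table above (at most 60 attached disks, and one operating system family per replication group) can be checked before you plan a failback. The following is a minimal sketch, not part of Site Recovery: the `FailbackVm` record and both function names are hypothetical, and the disk counts are assumed to come from your own inventory. It doesn't cover the "large volumes of traffic" case, which needs a judgment call.

```python
from dataclasses import dataclass

# Limit stated in the components table: up to 60 disks per master target server.
MAX_DISKS_PER_MASTER_TARGET = 60


@dataclass
class FailbackVm:
    """Hypothetical inventory record for one VM that will be failed back."""
    name: str
    os_family: str   # "windows" or "linux"
    disk_count: int


def needs_separate_master_target(vms):
    """Return True when the VMs collectively have more than 60 disks."""
    return sum(vm.disk_count for vm in vms) > MAX_DISKS_PER_MASTER_TARGET


def replication_group_is_valid(vms):
    """Return True when every VM in the group has the same OS family.

    All VMs in a replication group share one master target server,
    so the group must be all Windows or all Linux.
    """
    return len({vm.os_family.lower() for vm in vms}) <= 1


if __name__ == "__main__":
    group = [
        FailbackVm("web01", "windows", 25),
        FailbackVm("sql01", "windows", 30),
        FailbackVm("app01", "windows", 10),
    ]
    print("Separate master target needed:", needs_separate_master_target(group))
    print("Replication group valid:", replication_group_is_valid(group))
```

The disk total is computed across all VMs because the limit applies to the master target server as a whole, not to each VM.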

Ports for reprotection/failback

A number of ports must be open for reprotection/failback. The following graphic illustrates the ports and the reprotection/failback flow.

Figure: Ports for failover and failback

Deploy a process server in Azure

  1. Set up a process server in Azure for failback.
  2. Ensure that Azure VMs can reach the process server.
  3. Make sure that the site-to-site VPN connection or ExpressRoute private peering network has enough bandwidth to send data from the process server to the on-premises site.
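
One rough way to judge whether the link in step 3 has "enough bandwidth" is a back-of-the-envelope estimate. The sketch below is illustrative arithmetic only, not a Site Recovery sizing tool: the function names, the decimal unit convention (1 GB = 8,000 megabits), and the `usable_fraction` default of 0.7 are all assumptions. Measure your own change rate and link throughput before relying on the numbers.

```python
def transfer_hours(data_gb, link_mbps, usable_fraction=0.7):
    """Estimate the hours needed to send data_gb over a link of link_mbps.

    usable_fraction is an assumed share of the link that replication
    traffic can actually use; it is not a documented value.
    """
    if link_mbps <= 0 or not 0 < usable_fraction <= 1:
        raise ValueError("link_mbps must be > 0 and usable_fraction in (0, 1]")
    megabits = data_gb * 8 * 1000  # decimal units: 1 GB = 8,000 megabits
    seconds = megabits / (link_mbps * usable_fraction)
    return seconds / 3600


def link_keeps_up(change_rate_mb_per_s, link_mbps, usable_fraction=0.7):
    """Return True when the sustained change rate fits in the usable bandwidth."""
    return change_rate_mb_per_s * 8 <= link_mbps * usable_fraction


if __name__ == "__main__":
    hours = transfer_hours(data_gb=500, link_mbps=100)
    print(f"Initial copy of 500 GB over 100 Mbps: about {hours:.1f} hours")
    print("Link keeps up with 5 MB/s of changes:", link_keeps_up(5, 100))
```

If `link_keeps_up` returns False for your measured change rate, replication from Azure is likely to fall behind, so consider a larger link before you reprotect.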

Deploy a separate master target server

  1. Note the master target server requirements and limitations.

  2. Create a Windows or Linux master target server, to match the operating system of the VMs you want to reprotect and fail back.

  3. Make sure you don't use Storage vMotion for the master target server. If you do, failback can fail: the VM can't start because its disks aren't available to it.

    • To prevent this, exclude the master target server from your vMotion list.
    • If a master target undergoes a Storage vMotion task after reprotection, the protected VM disks attached to the master target server migrate to the target of the vMotion task. If you try to fail back after this, disk detachment fails because the disks aren't found. It's then hard to find the disks in your storage accounts. If this occurs, find them manually and attach them to the VM. After that, the on-premises VM can be booted.
  4. Add a retention drive to the existing Windows master target server. Add a new disk and format the drive. The retention drive is used to store the points in time when the VM replicates back to the on-premises site. Note these criteria. If they aren't met, the drive isn't listed for the master target server:

    • The volume isn't used for any other purpose, such as a replication target, and it isn't in lock mode.
    • The volume isn't a cache volume. The custom installation volume for the process server and master target isn't eligible to be a retention volume. When the process server and master target are installed on a volume, the volume is a cache volume of the master target.
    • The file system type of the volume isn't FAT or FAT32.
    • The volume capacity is nonzero.
    • The default retention volume for Windows is the R volume.
    • The default retention volume for Linux is /mnt/retention.
  5. Add a drive if you're using an existing process server. The new drive must meet the requirements in the previous step. If the retention drive isn't present, it doesn't appear in the selection drop-down list on the portal. After you add a drive to the on-premises master target, it takes up to 15 minutes for the drive to appear in the selection on the portal. You can refresh the configuration server if the drive doesn't appear after 15 minutes.

  6. Install VMware tools or open-vm-tools on the master target server. Without the tools, the datastores on the master target's ESXi host can't be detected.

  7. Set disk.EnableUUID=true in the configuration parameters of the master target VM in VMware. If this row doesn't exist, add it. This setting is required to provide a consistent UUID to the VMDK so that it mounts correctly.

  8. Check vCenter Server access requirements:

    • If the VM to which you're failing back is on an ESXi host managed by VMware vCenter Server, the master target server needs access to the on-premises VM's Virtual Machine Disk (VMDK) file, in order to write the replicated data to the virtual machine's disks. Make sure that the on-premises VM datastore is mounted on the master target host with read/write access.
    • If the VM isn't on an ESXi host managed by a VMware vCenter Server, Site Recovery creates a new VM during reprotection. This VM is created on the ESXi host on which you create the master target server VM. Choose the ESXi host carefully, to create the VM on the host that you want. The hard disk of the VM must be in a datastore that's accessible by the host on which the master target server is running.
    • Another option, if the on-premises VM already exists for failback, is to delete it before you do a failback. Failback then creates a new VM on the same ESXi host as the master target. When you fail back to an alternate location, the data is recovered to the same datastore and the same ESXi host as that used by the on-premises master target server.
  9. For physical machines failing back to VMware VMs, complete discovery of the host on which the master target server is running before you reprotect the machine.

  10. Check that the ESXi host on which the master target VM runs has at least one virtual machine file system (VMFS) datastore attached to it. If no VMFS datastores are attached, the datastore input in the reprotection settings is empty and you can't proceed.
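
The retention volume criteria in step 4 can be expressed as a simple checklist. The following is a minimal sketch under stated assumptions: the `Volume` record and its fields are hypothetical and would have to be filled in from your own inspection of the master target server. Site Recovery applies its own checks, which this code doesn't call or reproduce exactly.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical description of a volume on the master target server."""
    name: str
    file_system: str
    capacity_bytes: int
    used_for_other_purpose: bool = False  # for example, a replication target
    locked: bool = False
    is_cache_volume: bool = False         # includes the custom installation volume


def retention_volume_problems(volume):
    """Return the reasons a volume would not qualify as a retention volume.

    An empty list means the volume meets every criterion listed in step 4.
    """
    problems = []
    if volume.used_for_other_purpose:
        problems.append("volume is used for another purpose")
    if volume.locked:
        problems.append("volume is in lock mode")
    if volume.is_cache_volume:
        problems.append("volume is a cache volume")
    if volume.file_system.upper() in {"FAT", "FAT32"}:
        problems.append("file system is FAT or FAT32")
    if volume.capacity_bytes <= 0:
        problems.append("volume capacity is zero")
    return problems


if __name__ == "__main__":
    candidate = Volume(name="R", file_system="NTFS", capacity_bytes=200 * 1024**3)
    issues = retention_volume_problems(candidate)
    print("Eligible" if not issues else "Not eligible: " + "; ".join(issues))
```

Returning a list of reasons rather than a single Boolean makes it easier to see why a drive is missing from the selection list on the portal.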

Next steps

Reprotect a VM.