Troubleshoot Azure-to-Azure VM replication errors

This article describes how to troubleshoot common errors in Azure Site Recovery during replication and recovery of Azure virtual machines (VMs) from one region to another. For more information about supported configurations, see the support matrix for replicating Azure VMs.

Azure resource quota issues (error code 150097)

Make sure your subscription is enabled to create Azure VMs in the target region that you plan to use as your disaster recovery (DR) region. Your subscription needs sufficient quota to create VMs of the necessary sizes. By default, Site Recovery chooses a target VM size that's the same as the source VM size. If the matching size isn't available, Site Recovery automatically chooses the closest available size.

If there's no size that supports the source VM configuration, the following message is displayed:

Replication couldn't be enabled for the virtual machine <VmName>.

Possible causes

  • Your subscription ID isn't enabled to create any VMs in the target region location.
  • Your subscription ID isn't enabled, or doesn't have sufficient quota, to create specific VM sizes in the target region location.
  • No suitable target VM size is found to match the source VM's network interface card (NIC) count (2), for the subscription ID in the target region location.

Fix the problem

Contact Azure billing support to enable your subscription to create VMs of the required sizes in the target location. Then, retry the failed operation.

If the target location has a capacity constraint, disable replication to that location. Then, enable replication to a different location where your subscription has sufficient quota to create VMs of the required sizes.

Trusted root certificates (error code 151066)

If the latest trusted root certificates aren't all present on the VM, your job to enable replication for Site Recovery might fail. Without these certificates, authentication and authorization of Site Recovery service calls from the VM fail.

If the enable replication job fails, the following message is displayed:

Site Recovery configuration failed.

Possible cause

The trusted root certificates required for authorization and authentication aren't present on the virtual machine.

Fix the problem

Windows

For a VM running the Windows operating system, install the latest Windows updates so that all the trusted root certificates are present on the VM. Follow the typical Windows update management or certificate update management process in your organization to get the latest root certificates and the updated certificate revocation list on the VMs.

  • If you're in a disconnected environment, follow the standard Windows update process in your organization to get the certificates.
  • If the required certificates aren't present on the VM, the calls to the Site Recovery service fail for security reasons.

To verify that the issue is resolved, go to login.chinacloudapi.cn from a browser in your VM.

For more information, see Configure trusted roots and disallowed certificates.

Linux

Follow the guidance provided by the distributor of your Linux operating system version to get the latest trusted root certificates and the latest certificate revocation list on the VM.

Because SUSE Linux uses symbolic links, or symlinks, to maintain a certificate list, follow these steps:

  1. Sign in as a root user. The hash symbol (#) is the default command prompt.

  2. To change the directory, run this command:

    cd /etc/ssl/certs

  3. Check whether the Symantec root CA certificate is present:

    ls VeriSign_Class_3_Public_Primary_Certification_Authority_G5.pem

    • If the Symantec root CA certificate isn't found, run the following command to download the file. Check for any errors and follow the recommended actions for network failures.

    wget https://docs.broadcom.com/docs-and-downloads/content/dam/symantec/docs/other-resources/verisign-class-3-public-primary-certification-authority-g5-en.pem -O VeriSign_Class_3_Public_Primary_Certification_Authority_G5.pem

  4. Check whether the Baltimore root CA certificate is present:

    ls Baltimore_CyberTrust_Root.pem

    • If the Baltimore root CA certificate isn't found, run this command to download the certificate:

      wget https://www.digicert.com/CACerts/BaltimoreCyberTrustRoot.crt.pem -O Baltimore_CyberTrust_Root.pem

  5. Check whether the DigiCert_Global_Root_CA certificate is present:

    ls DigiCert_Global_Root_CA.pem

    • If the DigiCert_Global_Root_CA certificate isn't found, run the following commands to download it and convert it from DER to PEM format:

      wget http://www.digicert.com/CACerts/DigiCertGlobalRootCA.crt
      
      openssl x509 -in DigiCertGlobalRootCA.crt -inform der -outform pem -out DigiCert_Global_Root_CA.pem
      
  6. To update the certificate subject hashes for the newly downloaded certificates, run the rehash script:

    c_rehash

  7. To check whether the subject hashes were created as symlinks for the certificates, run these commands:

    ls -l | grep Baltimore
    lrwxrwxrwx 1 root root   29 Jan  8 09:48 3ad48a91.0 -> Baltimore_CyberTrust_Root.pem
    -rw-r--r-- 1 root root 1303 Jun  5  2014 Baltimore_CyberTrust_Root.pem

    ls -l | grep VeriSign_Class_3_Public_Primary_Certification_Authority_G5
    -rw-r--r-- 1 root root 1774 Jun  5  2014 VeriSign_Class_3_Public_Primary_Certification_Authority_G5.pem
    lrwxrwxrwx 1 root root   62 Jan  8 09:48 facacbc6.0 -> VeriSign_Class_3_Public_Primary_Certification_Authority_G5.pem

    ls -l | grep DigiCert_Global_Root
    lrwxrwxrwx 1 root root   27 Jan  8 09:48 399e7759.0 -> DigiCert_Global_Root_CA.pem
    -rw-r--r-- 1 root root 1380 Jun  5  2014 DigiCert_Global_Root_CA.pem
    
  8. Create a copy of the file VeriSign_Class_3_Public_Primary_Certification_Authority_G5.pem with the filename b204d74a.0:

    cp VeriSign_Class_3_Public_Primary_Certification_Authority_G5.pem b204d74a.0

  9. Create a copy of the file Baltimore_CyberTrust_Root.pem with the filename 653b494a.0:

    cp Baltimore_CyberTrust_Root.pem 653b494a.0

  10. Create a copy of the file DigiCert_Global_Root_CA.pem with the filename 3513523f.0:

    cp DigiCert_Global_Root_CA.pem 3513523f.0

  11. Check that the files are present:

    ls -l 653b494a.0 b204d74a.0 3513523f.0
    -rw-r--r-- 1 root root 1774 Jan  8 09:52 3513523f.0
    -rw-r--r-- 1 root root 1303 Jan  8 09:52 653b494a.0
    -rw-r--r-- 1 root root 1774 Jan  8 09:52 b204d74a.0
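The presence checks in steps 3 through 5 can be combined into one helper. The following is a minimal sketch, not part of the official procedure: the function name missing_root_certs is hypothetical, and the file names are the ones used in the steps above.

```shell
# List which of the three root CA certificate files from the steps above
# are missing from a certificate directory (default: /etc/ssl/certs).
# missing_root_certs is a hypothetical helper name used for illustration.
missing_root_certs() {
    cert_dir="${1:-/etc/ssl/certs}"
    for cert in \
        VeriSign_Class_3_Public_Primary_Certification_Authority_G5.pem \
        Baltimore_CyberTrust_Root.pem \
        DigiCert_Global_Root_CA.pem
    do
        # Print the file name only when the certificate file is absent.
        [ -e "$cert_dir/$cert" ] || printf '%s\n' "$cert"
    done
}
```

Running missing_root_certs with no argument checks /etc/ssl/certs. Empty output means that all three files are present and only the rehash and copy steps remain.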
    

Outbound URLs or IP ranges (error code 151037 or 151072)

For Site Recovery replication to work, the VM needs outbound connectivity to specific URLs. If your VM is behind a firewall or uses network security group (NSG) rules to control outbound connectivity, you might face one of these issues. Outbound access via URLs is still supported, but using an allow list of IP ranges is no longer supported.

Possible causes

  • A connection can't be established to Site Recovery endpoints because of a Domain Name System (DNS) resolution failure.
  • This problem is more common during reprotection, when you have failed over the virtual machine but the DNS server isn't reachable from the disaster recovery (DR) region.

Fix the problem

If you're using custom DNS, make sure that the DNS server is accessible from the disaster recovery region.

To check whether the VM uses a custom DNS setting:

  1. Open Virtual machines and select the VM.

  2. Navigate to the VM's Settings and select Networking.

  3. In Virtual network/subnet, select the link to open the virtual network's resource page.

  4. Go to Settings and select DNS servers.

    Try to access the DNS server from the virtual machine. If the DNS server isn't accessible, make it accessible by either failing over the DNS server or creating a line of sight between the DR network and the DNS server.
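On a Linux VM, one way to see which DNS servers the machine is actually using is to read them from its resolver configuration. The following is a minimal sketch under that assumption: the function name configured_dns_servers is hypothetical, and on systems that use a local stub resolver the file may list only a loopback address rather than the custom DNS server.

```shell
# Print the DNS server addresses configured in a resolv.conf-style file
# (default: /etc/resolv.conf), one address per line.
# configured_dns_servers is a hypothetical helper name used for illustration.
configured_dns_servers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Example (not run here): ask each configured server to resolve a name
# that the VM must be able to reach.
#   for server in $(configured_dns_servers); do
#       nslookup login.chinacloudapi.cn "$server"
#   done
```

A server that times out in the loop is a candidate for the unreachable DNS server described above.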


Issue 2: Site Recovery configuration failed (151196)

Possible cause

A connection can't be established to Office 365 authentication and identity IPv4 endpoints.

Fix the problem

Azure Site Recovery requires access to Office 365 IP ranges for authentication. If you're using Azure network security group (NSG) rules or a firewall proxy to control outbound network connectivity on the VM, make sure that you use an NSG rule based on the Azure Active Directory (AAD) service tag to allow access to AAD. IP address-based NSG rules are no longer supported.

Issue 3: Site Recovery configuration failed (151197)

Possible cause

A connection can't be established to Azure Site Recovery service endpoints.

Fix the problem

If you're using Azure network security group (NSG) rules or a firewall proxy to control outbound network connectivity on the VM, make sure that you use service tags. Using an allow list of IP addresses via NSGs is no longer supported for Azure Site Recovery.

Issue 4: Replication fails when network traffic uses an on-premises proxy server (151072)

Possible cause

The custom proxy settings are invalid, and the Mobility service agent didn't autodetect the proxy settings from Internet Explorer (IE).

Fix the problem

  1. The Mobility service agent detects the proxy settings from IE on Windows and from /etc/environment on Linux.

  2. If you prefer to set the proxy only for the Mobility service, provide the proxy details in ProxyInfo.conf, located at:

    • Linux: /usr/local/InMage/config/
    • Windows: C:\ProgramData\Microsoft Azure Site Recovery\Config

  3. ProxyInfo.conf should have the proxy settings in the following INI format:

    [proxy]
    Address=http://1.2.3.4
    Port=567
    

Note

The Mobility service agent supports only unauthenticated proxies.
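On Linux, the file can be generated rather than typed by hand. The following is a minimal sketch that writes the INI format shown above; the function name write_proxy_info is hypothetical, and the address and port in the example are the placeholder values from this article, not real proxy settings.

```shell
# Write a ProxyInfo.conf file in the INI format shown above.
# write_proxy_info is a hypothetical helper name used for illustration.
# Arguments: <config directory> <proxy address> <proxy port>
write_proxy_info() {
    config_dir="$1"
    proxy_address="$2"
    proxy_port="$3"
    # Create the directory if needed, then write the three-line file.
    mkdir -p "$config_dir" || return 1
    printf '[proxy]\nAddress=%s\nPort=%s\n' "$proxy_address" "$proxy_port" \
        > "$config_dir/ProxyInfo.conf"
}

# Example (not run here), using the Linux path and placeholder values above:
#   write_proxy_info /usr/local/InMage/config http://1.2.3.4 567
```

Writing to the Mobility service configuration directory typically requires root privileges.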

More information

To specify the required URLs or the required IP ranges, follow the guidance in About networking in Azure to Azure replication.

Disk not found in VM (error code 150039)

A new disk attached to the VM must be initialized. If the disk isn't found, the following message is displayed:

Azure data disk <DiskName> <DiskURI> with logical unit number <LUN> <LUNValue> was not mapped to a corresponding disk being reported from within the VM that has the same LUN value.

Possible causes

  • A new data disk was attached to the VM but wasn't initialized.
  • The data disk inside the VM isn't correctly reporting the logical unit number (LUN) value at which the disk was attached to the VM.

Fix the problem

Make sure that the data disks are initialized, and then retry the operation.

If the problem persists, contact support.

Multiple disks available for protection (error code 153039)

Possible causes

  • One or more disks were recently added to the virtual machine after protection.
  • One or more disks were initialized after protection of the virtual machine.

Fix the problem

To make the replication status of the VM healthy again, you can choose either to protect the disks or to dismiss the warning.

To protect the disks

  1. Go to Replicated Items > VM name > Disks.

  2. Select the unprotected disk, and then select Enable replication:

    [Image: Enable replication on VM disks.]

To dismiss the warning

  1. Go to Replicated items > VM name.

  2. Select the warning in the Overview section, and then select OK.

    [Image: Dismiss the new-disk warning.]

VM removed from vault completed with information (error code 150225)

When Site Recovery protects the virtual machine, it creates links on the source virtual machine. When you remove the protection or disable replication, Site Recovery removes these links as part of the cleanup job. If the virtual machine has a resource lock, the cleanup job completes with an informational message. The message says that the virtual machine has been removed from the Recovery Services vault, but that some of the stale links couldn't be cleaned up on the source machine.

You can ignore this warning if you never intend to protect this virtual machine again. But if you have to protect this virtual machine later, follow the steps in this section to clean up the links.

Warning

If you don't do the cleanup:

  • When you enable replication by means of the Recovery Services vault, the virtual machine won't be listed.
  • If you try to protect the VM by using Virtual machine > Settings > Disaster Recovery, the operation fails with the message "Replication cannot be enabled because of the existing stale resource links on the VM."

Fix the problem

Note

Site Recovery doesn't delete the source virtual machine or affect it in any way while you perform these steps.

  1. Remove the lock from the VM or VM resource group. For example, in the following image, the resource lock on the VM named MoveDemo must be deleted:

    [Image: Remove the lock from the VM.]

  2. Download the script to remove a stale Site Recovery configuration.

  3. Run the script, Cleanup-stale-asr-config-Azure-VM.ps1. Provide the Subscription ID, VM Resource Group, and VM name as parameters.

  4. If you're prompted for Azure credentials, provide them. Then verify that the script runs without any failures.

Replication not enabled on VM with stale resources (error code 150226)

Possible causes

The virtual machine has a stale configuration from previous Site Recovery protection.

A stale configuration can occur on an Azure VM if you enabled replication for the Azure VM by using Site Recovery, and then:

  • You disabled replication, but the source VM had a resource lock.
  • You deleted the Site Recovery vault without explicitly disabling replication on the VM.
  • You deleted the resource group containing the Site Recovery vault without explicitly disabling replication on the VM.

Fix the problem

Note

Site Recovery doesn't delete the source virtual machine or affect it in any way while you perform these steps.

  1. Remove the lock from the VM or VM resource group. For example, in the following image, the resource lock on the VM named MoveDemo must be deleted:

    [Image: Remove the lock from the VM.]

  2. Download the script to remove a stale Site Recovery configuration.

  3. Run the script, Cleanup-stale-asr-config-Azure-VM.ps1. Provide the Subscription ID, VM Resource Group, and VM name as parameters.

  4. If you're prompted for Azure credentials, provide them. Then verify that the script runs without any failures.

Can't select VM or resource group in enable replication job

Issue 1: The resource group and source VM are in different locations

Site Recovery currently requires the source region resource group and virtual machines to be in the same location. If they aren't, you won't be able to find the virtual machine or resource group when you try to apply protection.

As a workaround, you can enable replication from the VM instead of the Recovery Services vault. Go to Source VM > Properties > Disaster Recovery and enable the replication.

Issue 2: The resource group isn't part of the selected subscription

You might not be able to find the resource group at the time of protection if the resource group isn't part of the selected subscription. Make sure that the resource group belongs to the subscription that you're using.

Issue 3: Stale configuration

You might not see the VM that you want to enable for replication if a stale Site Recovery configuration exists on the Azure VM. This condition could occur if you enabled replication for the Azure VM by using Site Recovery, and then:

  • You deleted the Site Recovery vault without explicitly disabling replication on the VM.
  • You deleted the resource group containing the Site Recovery vault without explicitly disabling replication on the VM.
  • You disabled replication, but the source VM had a resource lock.

Fix the problem

Note

Make sure to update the AzureRM.Resources module before using the script mentioned in this section. Site Recovery doesn't delete the source virtual machine or affect it in any way while you perform these steps.

  1. Remove the lock, if any, from the VM or VM resource group. For example, in the following image, the resource lock on the VM named MoveDemo must be deleted:

    [Image: Remove the lock from the VM.]

  2. Download the script to remove a stale Site Recovery configuration.

  3. Run the script, Cleanup-stale-asr-config-Azure-VM.ps1. Provide the Subscription ID, VM Resource Group, and VM name as parameters.

  4. If you're prompted for Azure credentials, provide them. Then verify that the script runs without any failures.

Unable to select a VM for protection

Possible cause

The virtual machine has an extension installed in a failed or unresponsive state.

Fix the problem

Go to Virtual machines > Settings > Extensions and check for any extensions in a failed state. Uninstall any failed extension, and then try again to protect the virtual machine.

VM provisioning state isn't valid (error code 150019)

To enable replication on the VM, its provisioning state must be Succeeded. Follow these steps to check the provisioning state:

  1. In the Azure portal, select Resource Explorer from All Services.
  2. Expand the Subscriptions list and select your subscription.
  3. Expand the ResourceGroups list and select the resource group of the VM.
  4. Expand the Resources list and select your VM.
  5. Check the provisioningState field in the instance view on the right side.

Fix the problem

  • If the provisioningState is Failed, contact support with details to troubleshoot.
  • If the provisioningState is Updating, another extension might be in the middle of deployment. Check whether there are any ongoing operations on the VM, wait for them to finish, and then retry the failed Site Recovery job to enable replication.
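The provisioning state can also be read from the command line instead of the portal. The following is a minimal sketch that assumes the Azure CLI (az) is installed and signed in to the right subscription; the function names vm_provisioning_state and vm_ready_for_replication are hypothetical helpers, not part of any Azure tooling.

```shell
# Print the provisioning state of a VM.
# Assumes the Azure CLI is installed and signed in (az login).
# Arguments: <resource group> <VM name>
vm_provisioning_state() {
    resource_group="$1"
    vm_name="$2"
    az vm show --resource-group "$resource_group" --name "$vm_name" \
        --query provisioningState --output tsv
}

# Succeed only when the VM reports the Succeeded state that
# Site Recovery needs before replication can be enabled.
vm_ready_for_replication() {
    [ "$(vm_provisioning_state "$1" "$2")" = "Succeeded" ]
}
```

For example, vm_ready_for_replication MyResourceGroup MyVM returns a zero exit status only when the state is Succeeded, so it can gate a retry of the failed job. Both names in that example are placeholders.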

Unable to select target VM

Issue 1: VM is attached to a network that's already mapped to a target network

During disaster recovery configuration, if the source VM is part of a virtual network, and another VM from the same virtual network is already mapped with a network in the target resource group, the network selection drop-down list box is unavailable (appears dimmed) by default.

[Image: The network selection list is unavailable.]

Issue 2: You previously protected the VM and then you disabled the replication

Disabling replication of a VM doesn't delete the network mapping. The mapping must be deleted from the Recovery Services vault where the VM was protected. Select the Recovery Services vault and go to Manage > Site Recovery Infrastructure > For Azure virtual machines > Network Mapping.

[Image: Delete the network mapping.]

The target network that was configured during the disaster recovery setup can be changed after the initial setup, and after the VM is protected. To modify the network mapping, select the network name:

[Image: Modify the network mapping.]

COM+ or VSS (error code 151025)

When the COM+ or Volume Shadow Copy Service (VSS) error occurs, the following message is displayed:

Site Recovery extension failed to install.

Possible causes

  • The COM+ System Application service is disabled.
  • The Volume Shadow Copy Service is disabled.

Fix the problem

Set the COM+ System Application and Volume Shadow Copy Service to automatic or manual startup mode.

  1. Open the Services console in Windows.

  2. Make sure that the Startup Type of the COM+ System Application and the Volume Shadow Copy Service isn't set to Disabled.

    [Image: Check the startup type of the COM+ System Application and the Volume Shadow Copy Service.]

Unsupported managed-disk size (error code 150172)

When this error occurs, the following message is displayed:

Protection couldn't be enabled for the virtual machine as it has <DiskName> with size <DiskSize> that is lesser than the minimum supported size 1024 MB.

Possible cause

The disk is smaller than the minimum supported size of 1024 MB.

Fix the problem

Make sure that the disk size is within the supported size range, and then retry the operation.

Protection not enabled when GRUB uses device name (error code 151126)

Possible causes

The Linux Grand Unified Bootloader (GRUB) configuration files (/boot/grub/menu.lst, /boot/grub/grub.cfg, /boot/grub2/grub.cfg, or /etc/default/grub) might specify the actual device names instead of universally unique identifier (UUID) values for the root and resume parameters. Site Recovery requires UUIDs because device names can change. After a restart, a device might not come up with the same name on the failover VM, which causes problems.

The following examples are lines from GRUB files where device names appear instead of the required UUIDs:

  • File: /boot/grub2/grub.cfg

    linux /boot/vmlinuz-3.12.49-11-default root=/dev/sda2 ${extra_cmdline} resume=/dev/sda1 splash=silent quiet showopts

  • File: /boot/grub/menu.lst

    kernel /boot/vmlinuz-3.0.101-63-default root=/dev/sda2 resume=/dev/sda1 splash=silent crashkernel=256M-:128M showopts vga=0x314

Fix the problem

Replace each device name with the corresponding UUID:

  1. Find the UUID of the device by running the command blkid <device name>. For example:

    blkid /dev/sda1
    /dev/sda1: UUID="6f614b44-433b-431b-9ca1-4dd2f6f74f6b" TYPE="swap"
    blkid /dev/sda2
    /dev/sda2: UUID="62927e85-f7ba-40bc-9993-cc1feeb191e4" TYPE="ext3"
    
  2. Replace the device name with its UUID, in the formats root=UUID=<UUID> and resume=UUID=<UUID>. For example, after replacement, the line from /boot/grub/menu.lst would look like the following line:

    kernel /boot/vmlinuz-3.0.101-63-default root=UUID=62927e85-f7ba-40bc-9993-cc1feeb191e4 resume=UUID=6f614b44-433b-431b-9ca1-4dd2f6f74f6b splash=silent crashkernel=256M-:128M showopts vga=0x314

  3. Retry the protection.
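The replacement in step 2 can be previewed with sed before you edit the file. The following is a minimal sketch, not an official tool: the function name grub_with_uuid is hypothetical, it prints the rewritten content to standard output and leaves the original file unchanged, and it assumes that the device name contains no regular-expression metacharacters.

```shell
# Print a GRUB configuration file with root=<device> and resume=<device>
# rewritten to root=UUID=<uuid> and resume=UUID=<uuid>.
# grub_with_uuid is a hypothetical helper name used for illustration.
# Arguments: <GRUB file> <device name> <UUID>
grub_with_uuid() {
    grub_file="$1"
    device="$2"
    uuid="$3"
    # Require whitespace or end of line after the device name so that
    # /dev/sda1 doesn't also match /dev/sda10.
    sed -E "s#(root|resume)=${device}([[:space:]]|\$)#\1=UUID=${uuid}\2#g" "$grub_file"
}

# Example (not run here): look up the UUID with blkid, then preview the result.
#   uuid=$(blkid -s UUID -o value /dev/sda2)
#   grub_with_uuid /boot/grub/menu.lst /dev/sda2 "$uuid"
```

Printing to standard output instead of editing in place keeps the original file as a fallback. Review the output, back up the GRUB file, and only then apply the change.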

Protection failed because GRUB device doesn't exist (error code 151124)

Possible cause

The GRUB configuration files (/boot/grub/menu.lst, /boot/grub/grub.cfg, /boot/grub2/grub.cfg, or /etc/default/grub) might contain the parameters rd.lvm.lv or rd_LVM_LV. These parameters identify the Logical Volume Manager (LVM) devices that are to be discovered at boot time. If these LVM devices don't exist, the protected system itself won't boot and will be stuck in the boot process. The same problem will also be seen with the failover VM. Here are a few examples:

  • File: /boot/grub2/grub.cfg on RHEL7:

    linux16 /vmlinuz-3.10.0-957.el7.x86_64 root=/dev/mapper/rhel_mup--rhel7u6-root ro crashkernel=128M\@64M rd.lvm.lv=rootvg/root rd.lvm.lv=rootvg/swap rhgb quiet LANG=en_US.UTF-8

  • File: /etc/default/grub on RHEL7:

    GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=rootvg/root rd.lvm.lv=rootvg/swap rhgb quiet"

  • File: /boot/grub/menu.lst on RHEL6:

    kernel /vmlinuz-2.6.32-754.el6.x86_64 ro root=UUID=36dd8b45-e90d-40d6-81ac-ad0d0725d69e rd_NO_LUKS LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=rootvg/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_LVM_LV=rootvg/lv_swap rd_NO_DM rhgb quiet

In each example, GRUB has to detect two LVM devices in the volume group rootvg: root and swap in the RHEL7 examples, and lv_root and lv_swap in the RHEL6 example.

Fix the problem

If the LVM device doesn't exist, either create it or remove the corresponding parameters from the GRUB configuration files. Then, try again to enable protection.
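To see which LVM devices a GRUB file expects, you can extract the values of these parameters and compare them with the logical volumes that exist on the machine. The following is a minimal sketch; the function name list_grub_lvm_devices is hypothetical.

```shell
# Print the unique <volume group>/<logical volume> values referenced by
# rd.lvm.lv= or rd_LVM_LV= parameters in a GRUB configuration file.
# list_grub_lvm_devices is a hypothetical helper name used for illustration.
list_grub_lvm_devices() {
    # Match either parameter spelling, keep the value after "=",
    # and drop duplicates.
    grep -oE 'rd(\.lvm\.lv|_LVM_LV)=[^[:space:]"]+' "$1" | cut -d= -f2 | sort -u
}

# Example (not run here):
#   list_grub_lvm_devices /etc/default/grub
```

Each name that this prints should correspond to a logical volume reported by the LVM tools on the VM, for example in the output of lvs. A name with no matching volume is a candidate for creation or for removal from the GRUB file.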

Mobility service update finished with warnings (error code 151083)

The Site Recovery Mobility service has many components, one of which is called the filter driver. The filter driver is loaded into system memory only during system restart. Whenever a Mobility service update includes filter driver changes, the machine is updated, but you still see a warning that some fixes require a restart. The warning appears because the filter driver fixes can take effect only when the new filter driver is loaded, which happens only during a restart.

Note

This is only a warning. The existing replication continues to work even after the new agent update. You can choose to restart whenever you want the benefits of the new filter driver, but the old filter driver keeps working if you don't restart.

Apart from the filter driver, the benefits of any other enhancements and fixes in the Mobility service update take effect without requiring a restart.

Protection not enabled if replica managed disk exists

This error occurs when the replica managed disk already exists, without the expected tags, in the target resource group.

Possible cause

This problem can occur if the virtual machine was previously protected, and the replica disk wasn't removed when replication was disabled.

Fix the problem

Delete the replica disk identified in the error message and retry the failed protection job.

Next steps

Replicate Azure VMs to another Azure region