Troubleshoot configuration server issues

This article helps you troubleshoot issues when you deploy and manage the Azure Site Recovery configuration server. The configuration server acts as a management server. Use the configuration server to set up disaster recovery of on-premises VMware VMs and physical servers to Azure by using Site Recovery. The following sections discuss the most common failures you might experience when you add a new configuration server and when you manage a configuration server.

Registration failures

The source machine registers with the configuration server when you install the mobility agent. You can debug any failures during this step by following these guidelines:

  1. Open the C:\ProgramData\ASR\home\svsystems\var\configurator_register_host_static_info.log file. (The ProgramData folder might be a hidden folder. If you don't see the ProgramData folder, in File Explorer, on the View tab, in the Show/hide section, select the Hidden items check box.) Failures might be caused by multiple issues.

  2. Search for the string No Valid IP Address found. If the string is found:

    1. Verify that the requested host ID is the same as the host ID of the source machine.
    2. Verify that the source machine has at least one IP address assigned to the physical NIC. For agent registration with the configuration server to succeed, the source machine must have at least one valid IPv4 address assigned to the physical NIC.
    3. Run one of the following commands on the source machine to get all the IP addresses of the source machine:
      • For Windows: > ipconfig /all
      • For Linux: # ifconfig -a
  3. If the string No Valid IP Address found isn't found, search for the string Reason=>NULL. This error occurs if the source machine uses an empty host to register with the configuration server. If the string is found:

  4. If the string Reason=>NULL isn't found, on the source machine, open the C:\ProgramData\ASRSetupLogs\UploadedLogs\ASRUnifiedAgentInstaller.log file. (The ProgramData folder might be a hidden folder. If you don't see the ProgramData folder, in File Explorer, on the View tab, in the Show/hide section, select the Hidden items check box.) Failures might be caused by multiple issues.

  5. Search for the string post request: (7) - Couldn't connect to server. If the string is found:

    1. Resolve the network issues between the source machine and the configuration server. Verify that the configuration server is reachable from the source machine by using network tools like ping, traceroute, or a web browser. Ensure that the source machine can reach the configuration server through port 443.
    2. Check whether any firewall rules on the source machine block the connection between the source machine and the configuration server. Work with your network admins to unblock any connection issues.
    3. Ensure that the folders listed in Site Recovery folder exclusions from antivirus programs are excluded from the antivirus software.
    4. When network issues are resolved, retry the registration by following the guidelines in Register the source machine with the configuration server.
  6. If the string post request: (7) - Couldn't connect to server isn't found, in the same log file, look for the string request: (60) - Peer certificate cannot be authenticated with given CA certificates. This error might occur because the configuration server certificate has expired or the source machine doesn't support TLS 1.0 or later protocols. It also might occur if a firewall blocks TLS communication between the source machine and the configuration server. If the string is found:

    1. To resolve the issue, connect to the configuration server IP address by using a web browser on the source machine. Use the URI https://<configuration server IP address>:443/. Ensure that the source machine can reach the configuration server through port 443.
    2. Check whether any firewall rules on the source machine need to be added or removed for the source machine to talk to the configuration server. Because of the variety of firewall software that might be in use, we can't list all required firewall configurations. Work with your network admins to unblock any connection issues.
    3. Ensure that the folders listed in Site Recovery folder exclusions from antivirus programs are excluded from the antivirus software.
    4. After you resolve the issues, retry the registration by following the guidelines in Register the source machine with the configuration server.
  7. On Linux, if the value of the platform in <INSTALLATION_DIR>/etc/drscout.conf is corrupted, registration fails. To identify this issue, open the /var/log/ua_install.log file. Search for the string Aborting configuration as VM_PLATFORM value is either null or it is not VmWare/Azure. The platform should be set to either VmWare or Azure. If the drscout.conf file is corrupted, we recommend that you uninstall the mobility agent and then reinstall it. If uninstallation fails, complete the following steps:

    a. Open the Installation_Directory/uninstall.sh file and comment out the call to the StopServices function.
    b. Open the Installation_Directory/Vx/bin/uninstall.sh file and comment out the call to the stop_services function.
    c. Open the Installation_Directory/Fx/uninstall.sh file and comment out the entire section that tries to stop the Fx service.
    d. Uninstall the mobility agent. After successful uninstallation, reboot the system, and then try to reinstall the mobility agent.

  8. Ensure that multi-factor authentication is not enabled for the user account. Azure Site Recovery currently doesn't support multi-factor authentication for the user account. Register the configuration server by using a user account that doesn't have multi-factor authentication enabled.
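The log searches in the steps above can be sketched as a small script. This is only an illustration: the function names and the hint wording are made up for this example, and the search strings are shortened so that they match regardless of spacing around "post request:". It assumes the log file is readable as text.

```python
# Hypothetical triage helper: looks for the log strings named in the steps
# above and returns a short hint for each one that is present.
KNOWN_SIGNATURES = [
    ("No Valid IP Address found",
     "Check the host ID and that a physical NIC has a valid IPv4 address."),
    ("Reason=>NULL",
     "The source machine registered with an empty host."),
    ("(7) - Couldn't connect to server",
     "Check network, firewall rules, and port 443 to the configuration server."),
    ("(60) - Peer certificate cannot be authenticated",
     "Check the configuration server certificate, TLS support, and firewall."),
    ("Aborting configuration as VM_PLATFORM value",
     "The platform value in drscout.conf may be corrupted."),
]


def triage(log_text):
    """Return (signature, hint) pairs for every known string found in log_text."""
    return [(sig, hint) for sig, hint in KNOWN_SIGNATURES if sig in log_text]


def triage_file(path):
    """Read a log file as text (undecodable bytes are replaced) and triage it."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return triage(handle.read())
```

For example, calling triage_file on a copy of ASRUnifiedAgentInstaller.log returns an empty list when none of the strings are present, which suggests the failure has a different cause.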

Installation failure: Failed to load accounts

This error occurs when the service can't read data from the transport connection when it's installing the mobility agent and registering with the configuration server. To resolve the issue, ensure that TLS 1.0 is enabled on your source machine.

vCenter discovery failures

To resolve vCenter discovery failures, add the vCenter server to the bypass list in the proxy settings.

  • Download the PsExec tool to access the System user context.
  • Open Internet Explorer in the System user context by running the following command line: psexec -s -i "%programfiles%\Internet Explorer\iexplore.exe"
  • Add proxy settings in IE and restart the tmansvc service.
  • To configure DRA proxy settings, run cd C:\Program Files\Microsoft Azure Site Recovery Provider
  • Next, execute DRCONFIGURATOR.EXE /configure /AddBypassUrls [add IP Address/FQDN of vCenter Server provided during Configure vCenter Server/vSphere ESXi server step of Configuration Server deployment]

Change the IP address of the configuration server

We strongly recommend that you don't change the IP address of a configuration server. Ensure that all IP addresses that are assigned to the configuration server are static IP addresses. Don't use DHCP IP addresses.

ACS50008: SAML token is invalid

To avoid this error, ensure that the time on your system clock isn't different from the local time by more than 15 minutes. Rerun the installer to complete the registration.
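The 15-minute rule can be expressed as a simple comparison. The sketch below is illustrative only: the function name is hypothetical, and it assumes you already have a trusted reference time from some other source, such as a time server or another machine.

```python
from datetime import datetime, timedelta, timezone

# Limit taken from the text above.
MAX_SKEW = timedelta(minutes=15)


def skew_exceeds_limit(system_time, reference_time, limit=MAX_SKEW):
    """Return True when the two times differ by more than the limit.

    Both arguments should be timezone-aware datetimes so that the
    comparison is not affected by the local time zone setting.
    """
    return abs(system_time - reference_time) > limit


# Example: a system clock that runs 20 minutes ahead of the reference.
reference = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(skew_exceeds_limit(reference + timedelta(minutes=20), reference))
```

A difference of exactly 15 minutes is treated as acceptable here, because the text only rules out a difference of more than 15 minutes.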

Failed to create a certificate

A certificate that's required to authenticate Site Recovery can't be created. Rerun setup after you ensure that you're running setup as a local administrator.

Failure to activate Windows License from Server Standard EVALUATION to Server Standard

  1. As part of configuration server deployment through OVF, an evaluation license is used, which is valid for 180 days. Activate this license before it expires. Otherwise, the configuration server might shut down frequently and disrupt replication activities.
  2. If you are unable to activate the Windows license, contact the Windows support team to resolve the issue.

Register source machine with configuration server

If the source machine runs Windows

Run the following command on the source machine:

    cd C:\Program Files (x86)\Microsoft Azure Site Recovery\agent
    UnifiedAgentConfigurator.exe  /CSEndPoint <configuration server IP address> /PassphraseFilePath <passphrase file path>

Setting | Details
Usage | UnifiedAgentConfigurator.exe /CSEndPoint <configuration server IP address> /PassphraseFilePath <passphrase file path>
Agent configuration logs | Located under %ProgramData%\ASRSetupLogs\ASRUnifiedAgentConfigurator.log.
/CSEndPoint | Mandatory parameter. Specifies the IP address of the configuration server. Use any valid IP address.
/PassphraseFilePath | Mandatory. The location of the passphrase. Use any valid UNC or local file path.

If the source machine runs Linux

Run the following command on the source machine:

    /usr/local/ASR/Vx/bin/UnifiedAgentConfigurator.sh -i <configuration server IP address> -P /var/passphrase.txt

Setting | Details
Usage | cd /usr/local/ASR/Vx/bin, then UnifiedAgentConfigurator.sh -i <configuration server IP address> -P <passphrase file path>
-i | Mandatory parameter. Specifies the IP address of the configuration server. Use any valid IP address.
-P | Mandatory. The full file path of the file in which the passphrase is saved. Use any valid folder.

Unable to configure the configuration server

If you install applications other than the configuration server on the virtual machine, you might be unable to configure the master target.

The configuration server must be a single-purpose server. Using it as a shared server is unsupported.

For more information, see the configuration FAQ in Deploy a configuration server.

Remove the stale entries for protected items from the configuration server database

To remove a stale protected machine on the configuration server, use the following steps.

  1. To determine the source machine and IP address of the stale entry:

    1. Open the MySQL command line in administrator mode.

    2. Execute the following commands.

      mysql> use svsdb1;
      mysql> select id as hostid, name, ipaddress, ostype as operatingsystem, from_unixtime(lasthostupdatetime) as heartbeat from hosts where name!='InMageProfiler'\G
      

      This returns the list of registered machines along with their IP addresses and last heartbeat. Find the host that has stale replication pairs.

  2. Open an elevated command prompt and navigate to C:\ProgramData\ASR\home\svsystems\bin.

  3. To remove the registered host details and the stale entry information from the configuration server, run the following command using the source machine and the IP address of the stale entry.

    Syntax: Unregister-ASRComponent.pl -IPAddress <IP_ADDRESS_OF_MACHINE_TO_UNREGISTER> -Component <Source/ PS / MT>

    For example, if you have a source server entry of "OnPrem-VM01" with an IP address of 10.0.0.4, use the following command.

    perl Unregister-ASRComponent.pl -IPAddress 10.0.0.4 -Component Source

  4. Restart the following services on the source machine to reregister with the configuration server.

    • InMage Scout Application Service
    • InMage Scout VX Agent - Sentinel/Outpost
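The steps above can be sketched in code, for instance to review the query output before unregistering anything. This is a hedged illustration: both function names are hypothetical, the rows are assumed to have been loaded into dictionaries with the heartbeat already parsed into a datetime, and the one-day age threshold is an arbitrary assumption. An old heartbeat is only a hint, so confirm that the host really has stale replication pairs before you remove it.

```python
import ipaddress
from datetime import datetime, timedelta

# Component values shown in the syntax line above.
VALID_COMPONENTS = ("Source", "PS", "MT")


def hosts_with_old_heartbeat(rows, now, max_age=timedelta(days=1)):
    """Return the rows whose last heartbeat is older than max_age.

    rows: dictionaries with at least 'name', 'ipaddress', and 'heartbeat'
    (a datetime). The max_age default is an assumption, not a product value.
    """
    return [row for row in rows if now - row["heartbeat"] > max_age]


def unregister_command(ip, component="Source"):
    """Build the argument list for the unregister script shown above."""
    ipaddress.ip_address(ip)  # raises ValueError for a malformed address
    if component not in VALID_COMPONENTS:
        raise ValueError(f"component must be one of {VALID_COMPONENTS}")
    return ["perl", "Unregister-ASRComponent.pl",
            "-IPAddress", ip, "-Component", component]


# Example: the entry from the text, OnPrem-VM01 at 10.0.0.4.
print(" ".join(unregister_command("10.0.0.4")))
```

Building the command as a list and validating the address first helps avoid passing a mistyped IP address to the script. The list could be passed to subprocess.run from the directory named in step 2.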

Upgrade fails when the services fail to stop

The configuration server upgrade fails when certain services do not stop.

To identify the issue, navigate to C:\ProgramData\ASRSetupLogs\CX_TP_InstallLogFile on the configuration server. If you find the following errors, use the steps below to resolve the issue:

2018-06-28 14:28:12.943   Successfully copied php.ini to C:\Temp from C:\thirdparty\php5nts
2018-06-28 14:28:12.943   svagents service status - SERVICE_RUNNING
2018-06-28 14:28:12.944   Stopping svagents service.
2018-06-28 14:31:32.949   Unable to stop svagents service.
2018-06-28 14:31:32.949   Stopping svagents service.
2018-06-28 14:34:52.960   Unable to stop svagents service.
2018-06-28 14:34:52.960   Stopping svagents service.
2018-06-28 14:38:12.971   Unable to stop svagents service.
2018-06-28 14:38:12.971   Rolling back the install changes.
2018-06-28 14:38:12.971   Upgrade has failed.

To resolve the issue:

Manually stop the following services:

  • cxprocessserver
  • InMage Scout VX Agent - Sentinel/Outpost
  • Azure Recovery Services Agent
  • Azure Site Recovery Service
  • tmansvc

To update the configuration server, run the unified setup again.
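To see which services blocked the upgrade, you can scan the install log for the "Unable to stop" lines shown in the excerpt above. The following is a minimal sketch with a hypothetical function name. It assumes that every such line has the form "Unable to stop <name> service." as in the excerpt; other log formats are not handled.

```python
import re
from collections import Counter

# Matches lines such as "Unable to stop svagents service."
UNABLE_TO_STOP = re.compile(r"Unable to stop (\S+) service\.")


def services_that_failed_to_stop(log_text):
    """Count, per service name, how often the log reports a failed stop."""
    return Counter(UNABLE_TO_STOP.findall(log_text))


sample = """\
2018-06-28 14:28:12.944   Stopping svagents service.
2018-06-28 14:31:32.949   Unable to stop svagents service.
2018-06-28 14:34:52.960   Unable to stop svagents service.
"""
print(services_that_failed_to_stop(sample))
```

The names that appear in the result are the ones to stop manually before you rerun the unified setup.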

Azure Active Directory application creation failure

You have insufficient permissions to create an application in Azure Active Directory (AAD) using the Open Virtual Appliance (OVA) template.

To resolve the issue, sign in to the Azure portal and do one of the following:

Process server/Master Target are unable to communicate with the configuration server

The process server (PS) and Master Target (MT) modules are unable to communicate with the configuration server (CS), and their status is shown as not connected in the Azure portal.

Typically, this is due to an error with port 443. Use the following steps to unblock the port and re-enable communication with the CS.

Verify that the MARS agent is being invoked by the Master Target agent

To verify that the Master Target agent can create a TCP session for the configuration server IP, look for a trace similar to the following in the Master Target agent logs:

TCP <Replace IP with CS IP here>:52739 <Replace IP with CS IP here>:443 SYN_SENT

TCP 192.168.1.40:52739 192.168.1.40:443 SYN_SENT // Replace IP with CS IP here

If you find traces similar to the following in the MT agent logs, the MT agent is reporting errors on port 443:

#~> (11-20-2018 20:31:51):   ERROR  2508 8408 313 FAILED : PostToSVServer with error [at curlwrapper.cpp:CurlWrapper::processCurlResponse:212]   failed to post request: (7) - Couldn't connect to server
#~> (11-20-2018 20:31:54):   ERROR  2508 8408 314 FAILED : PostToSVServer with error [at curlwrapper.cpp:CurlWrapper::processCurlResponse:212]   failed to post request: (7) - Couldn't connect to server

This error can occur when other applications are also using port 443 or when a firewall setting blocks the port.

To resolve the issue:

  • Verify that port 443 is not blocked by your firewall.
  • If the port is unreachable due to another application using that port, stop and uninstall the app.
    • If stopping the app is not feasible, set up a new clean CS.
  • Restart the configuration server.
  • Restart the IIS service.
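A quick way to check whether port 443 on the configuration server is reachable from another machine is to attempt a plain TCP connection. The sketch below uses only the Python standard library. The function name and the 5-second timeout are assumptions for illustration. A successful connection shows only that something accepts connections on the port, not that it is the configuration server, and it doesn't validate TLS or the certificate.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

For example, can_connect("192.168.1.40") returns False when a firewall drops the traffic or nothing listens on the port, where 192.168.1.40 stands in for your configuration server IP address.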

Configuration server is not connected due to incorrect UUID entries

This error can occur when there are multiple configuration server (CS) instance UUID entries in the database. The issue often occurs when you clone the configuration server VM.

To resolve the issue:

  1. Remove the stale/old CS VM from vCenter. For more information, see Remove servers and disable protection.

  2. Sign in to the configuration server VM and connect to the MySQL svsdb1 database.

  3. Execute the following query:

    Important

    Verify that you are entering the UUID details of the cloned configuration server or the stale entry of the configuration server that is no longer used to protect virtual machines. Entering an incorrect UUID will result in losing the information for all existing protected items.

        MySQL> use svsdb1;
        MySQL> delete from infrastructurevms where infrastructurevmid='<Stale CS VM UUID>';
        MySQL> commit; 
    
  4. Refresh the portal page.

An infinite sign-in loop occurs when entering your credentials

After you enter the correct username and password on the configuration server OVF, Azure sign-in continues to prompt for the correct credentials.

This issue can occur when the system time is incorrect.

To resolve the issue:

Set the correct time on the computer and retry the sign-in.